GOAL-ORIENTED AND ONTOLOGY-DRIVEN REQUIREMENT ANALYSIS METHOD FOR EXTRACTION-TRANSFORMATION-LOADING (ETL) PROCESSES IN DATA WAREHOUSE SYSTEM. By AZMAN TA’A (91161). SUPERVISORS: Assoc. Prof. Dr. Norita Md. Norwawi; Dr. Mohd Syazwan Abdullah. COLLEGE OF ARTS AND SCIENCES. September 14, 2008
OUTLINE: INTRODUCTION; PROBLEM STATEMENT; REQUIREMENT ANALYSIS FOR DW; REQUIREMENT ANALYSIS FOR ETL PROCESSES; REQUIREMENT ANALYSIS APPROACH; TRANSFORMATIONAL ANALYSIS; ONTOLOGY MODELING; EXAMPLE: CASE STUDY AT UUM; CONCLUSION
INTRODUCTION: ETL consumes 70-80% of DW development resources. The success of ETL depends on data integration and transformation, which deal with semantic reconciliation.
INTRODUCTION: Problems in the design and development of ETL processes: 1) complexity and sheer size of the DW (Vassiliadis, 2000; Kimball & Caserta, 2004); 2) inefficient data loading (Chaudhuri & Dayal, 1997; Kimball & Ross, 2002); 3) the data integration and transformation process (Rahm & Do, 2000; Halevy, 2005); 4) generating the data transformation mechanism (Kimball & Caserta, 2004; Alexiev et al., 2005).
PROBLEM STATEMENT: Problems in ETL processes can be classified into: 1) defining and maintaining the ETL specifications; 2) handling semantic heterogeneity. ETL specification is application-driven rather than data-driven, which makes it error-prone to maintain and platform dependent. Semantic heterogeneity is the problem of data conflicts during integration and transformation. The design of a DW system should be based on a proper requirement engineering process (Kimball & Caserta, 2004; Simitsis, 2004; Lujan-Mora, 2005; Rizzi et al., 2006).
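A minimal sketch of what a data conflict of this kind can look like, and how a data-driven mapping might reconcile it. The source names, field names, and the linear percentage-to-4-point conversion are all assumptions made for illustration; none of them come from the presentation.

```python
# Hypothetical illustration of semantic heterogeneity: two sources describe
# the same concept (a student's grade point) with different names and scales.
SOURCE_A = [{"matric_no": "S001", "cgpa": 3.5}]          # assumed 4-point scale
SOURCE_B = [{"student_id": "S002", "grade_pct": 80.0}]   # assumed percentage scale

# Data-driven mapping: target attribute -> (source field, conversion function).
# The linear percentage conversion is an assumption, not a real grading rule.
MAPPINGS = {
    "A": {
        "student_id": ("matric_no", str),
        "grade_point": ("cgpa", float),
    },
    "B": {
        "student_id": ("student_id", str),
        "grade_point": ("grade_pct", lambda pct: round(pct / 100 * 4, 2)),
    },
}


def reconcile(record, source):
    """Rename and convert one source record into the shared target schema."""
    return {
        target: convert(record[field])
        for target, (field, convert) in MAPPINGS[source].items()
    }


unified = [reconcile(r, "A") for r in SOURCE_A] + [reconcile(r, "B") for r in SOURCE_B]
```

Keeping the mapping as data rather than as hand-written procedure code is one way to read the "data-driven" specification the slide argues for: adding a source means adding an entry, not rewriting the loader.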
PROBLEM STATEMENT: Several ETL modeling and data integration approaches have been suggested. However, these efforts do not focus on resolving heterogeneity problems at the modeling level, which relates to the user requirements. User requirements do not properly guide the developer in designing the ETL processes, because users' interpretations of business requirements are hard to manage. The ETL designer needs a proper method to design the ETL processes that considers data heterogeneity problems and generates ETL specifications from the early phase of DW development.
REQUIREMENT ANALYSIS FOR DW Requirement Perspective of DW
REQUIREMENT ANALYSIS FOR DW: DW requirements aim to identify the decisional information. In general, DW requirements approaches are: 1) process-driven (Kimball, 1996); 2) supply/data-driven (Inmon, 2002); 3) demand/user-driven (Winter & Strauch, 2003). DW requirements are data- or information-centric, so both the supply and the demand approaches are relevant. Moreover, demand-driven and supply-driven approaches complement each other in supporting complex requirements.
REQUIREMENT ANALYSIS FOR ETL PROCESSES: Requirement analysis transforms informal statements of user requirements into a formal expression of the ETL specification. User requirements are elicited and analyzed from the organization and decision-maker perspectives. These requirements are mapped to the available data sources through ETL processes, which should be derived from transformation analysis. Thus, proper and systematic transformation analysis is required in the early phase of DW development.
REQUIREMENT ANALYSIS FOR ETL PROCESSES General Requirement Analysis Model  (Adapted from Prakash and Gosain (2003))
REQUIREMENT ANALYSIS FOR ETL PROCESSES: The requirement analysis approach is centered on organizational and decisional modeling, and focuses on transformation analysis. Organization modeling identifies goals related to DW components such as facts and attributes of the organization (as-is analysis). Decision modeling identifies goals that relate the decision maker to DW components such as facts, dimensions, measures, and transformations (to-be analysis). The requirement analysis method is based on Goal-Oriented methodology.
REQUIREMENT ANALYSIS FOR ETL PROCESSES Detail Requirement Analysis Model  (Adapted from Giorgini  et al . (2008))
REQUIREMENT ANALYSIS APPROACH: The approach uses the Tropos methodology, which is based on the i* conceptual framework (Yu, 1995; Bresciani et al., 2003). Tropos is founded on agent-oriented software development methodology, which uses the notion of agent and related mentalistic notions in all phases of software development. Importantly, Tropos supports development from early requirement analysis through to implementation, which essentially explains how the intended system (i.e. the DW system) will meet organization goals.
REQUIREMENT ANALYSIS APPROACH: The Tropos methodology consists of five main phases: early requirements, late requirements, architectural design, detailed design, and implementation. It introduces the concepts of Actor, Goal, Plan, Resource, Dependency, Capability, and Belief. The modeling activities are Actor modeling, Dependency modeling, Goal modeling, Plan modeling, and Capability modeling. It applies three basic reasoning techniques: means-end analysis, contribution analysis, and AND/OR decomposition.
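AND/OR decomposition can be sketched as a small goal tree in which an AND goal needs all of its sub-goals and an OR goal needs at least one. This is only an illustration of the reasoning technique, not Tropos tooling; the `Goal` class and the goal names below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A node in a hypothetical goal tree: a leaf, or an AND/OR decomposition."""
    name: str
    decomposition: str = "LEAF"            # "AND", "OR", or "LEAF"
    children: list = field(default_factory=list)
    satisfied: bool = False                # only consulted for leaf goals


def is_satisfied(goal):
    """Evaluate a goal: AND needs every sub-goal, OR needs at least one."""
    if goal.decomposition == "LEAF":
        return goal.satisfied
    results = [is_satisfied(child) for child in goal.children]
    return all(results) if goal.decomposition == "AND" else any(results)


# Hypothetical goals, invented for illustration only.
monitor = Goal("Monitor student results", satisfied=True)
by_programme = Goal("Analyze by programme", satisfied=False)
by_semester = Goal("Analyze by semester", satisfied=True)
analyze = Goal("Analyze results", "OR", [by_programme, by_semester])
root = Goal("Improve student performance", "AND", [monitor, analyze])

root_ok = is_satisfied(root)
```

Here `root_ok` is true because the AND goal has both sub-goals satisfied, one of them through a single branch of the OR decomposition.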
TRANSFORMATIONAL ANALYSIS: Transformation analysis is not supported in previous requirement analysis approaches. However, the analysis should be performed from an early phase of requirement engineering to ensure that the organization goals are met. In Tropos, transformation analysis deals with Plan modeling, which complements decision-goal modeling. Plan modeling determines the set of transformations and constraint activities required.
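One possible reading of a plan as "a set of transformations and constraint activities" is sketched below. The plan structure, field names, and the CGPA range check are assumptions made for illustration and are not taken from the presentation.

```python
# Hypothetical plan attached to a decision goal: constraints filter records,
# transformations derive new attributes for the records that pass.
plan = {
    "goal": "Analyze student performance",
    "constraints": [
        lambda r: 0.0 <= r["cgpa"] <= 4.0,          # assumed valid CGPA range
    ],
    "transformations": {
        "full_name": lambda r: f"{r['first']} {r['last']}",
    },
}


def apply_plan(plan, records):
    """Keep records that satisfy every constraint, then add derived fields."""
    loaded = []
    for record in records:
        if not all(check(record) for check in plan["constraints"]):
            continue                                 # reject invalid record
        row = dict(record)
        for target, transform in plan["transformations"].items():
            row[target] = transform(record)
        loaded.append(row)
    return loaded


records = [
    {"first": "Ali", "last": "Ahmad", "cgpa": 3.4},
    {"first": "Siti", "last": "Omar", "cgpa": 4.7},  # violates the constraint
]
result = apply_plan(plan, records)
```

Only the first record reaches `result`, with a derived `full_name` field; the second is rejected by the constraint before any transformation runs.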
ONTOLOGY MODELING: An ontology approach is used to model two sources of information prior to the conceptual design of ETL processes. First source: the glossaries of DW terms (i.e. facts, attributes, dimensions, measures, actions, constraints) produced by the requirement analysis process. Second source: the data sources related to the subject area or application of the DW system. The tasks of ontology construction, mapping, and ETL specification construction are carried out before and during the conceptual design of ETL processes.
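The three tasks (ontology construction, mapping, and ETL specification construction) can be sketched with plain dictionaries standing in for the two ontologies. This is a simplified illustration under stated assumptions: a real ontology would normally be expressed in a language such as OWL, and every term, synonym, and column name below is hypothetical.

```python
# Requirement glossary ontology (hypothetical): DW term -> kind and synonyms.
GLOSSARY = {
    "student": {"kind": "dimension", "synonyms": {"student", "learner"}},
    "cgpa": {"kind": "measure", "synonyms": {"cgpa", "grade_point_average"}},
}

# Data source ontology (hypothetical): source column -> concept it represents.
SOURCE = {
    "acad.stu_master.learner": "learner",
    "acad.results.grade_point_average": "grade_point_average",
    "acad.results.room_no": "room",
}


def map_ontologies(glossary, source):
    """Match glossary terms to source columns and emit ETL specification rows."""
    spec = []
    for term, info in sorted(glossary.items()):
        for column, concept in sorted(source.items()):
            if concept in info["synonyms"]:
                spec.append({"target": term, "kind": info["kind"], "source": column})
    return spec


etl_spec = map_ontologies(GLOSSARY, SOURCE)
```

Columns with no matching glossary concept, such as the room number, simply do not appear in `etl_spec`, which is the sense in which the requirements drive what gets extracted.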
EXAMPLE: CASE STUDY IN UUM University Goals
EXAMPLE: CASE STUDY IN UUM Actor Diagram for UUM example
EXAMPLE: CASE STUDY IN UUM Rationale Diagram for UUM actor from organizational perspective
EXAMPLE: CASE STUDY IN UUM Extended Rationale Diagram for UUM actor from organizational perspective
EXAMPLE: CASE STUDY IN UUM Rationale Diagram for AAD Director from decisional perspective
EXAMPLE: CASE STUDY IN UUM Extended Rationale Diagram for AAD Director
CONCEPTUAL  MODEL OF ETL PROCESSES
LOGICAL MODEL OF ETL PROCESSES
CONCLUSION: The aim of this study is to model and design the ETL processes in a DW system, from requirement analysis tasks to ETL process implementation. A Goal-Oriented approach is used to analyze user requirements and an Ontology-based approach to model requirement glossaries and data sources, with the aim of resolving the problems mentioned. Work is still in progress and concentrates on the UUM case study to preliminarily confirm the proposed model and methods. The next step is to implement the solutions in a different domain and at a larger scale.
PAPERS: SAS Forum Malaysia: Modeling BI in Academic Information Portal, SAS Kuala Lumpur, 5 September 2007. SUGI08: Academic Business Intelligence Development Using SAS Tools, San Antonio, Texas, USA, 15-19 March 2008. Paper accepted and selected as SAS Student Ambassador. Camp08: Ontology-Based Extraction-Transformation-Loading (ETL) Processes Model in Data Warehouse Environments, Kuala Lumpur, 18 March 2008. Paper submitted.
THANK YOU. Q & A

Ph D Progress 14 09 2008
