Data Warehousing
Lecture-14
Process of Dimensional Modeling
Virtual University of Pakistan
Ahsan Abdullah
Assoc. Prof. & Head
Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: ahsan@cluxing.com
The Process of Dimensional Modeling
Four-Step Method from ER to DM
1. Choose the Business Process
2. Choose the Grain
3. Choose the Facts
4. Choose the Dimensions
Step-1: Choose the Business Process
- A business process is a major operational process in an organization.
- Typically supported by a legacy system (database) or an OLTP system.
- Examples: orders, invoices, inventory, etc.
- Business processes are often termed data marts, which is why many people criticize DM as being data-mart oriented.
Step-1: Separating the Process
[Figure: diagram with three labeled parts: Star-1, Star-2 and Snow-flake]
Step-2: Choosing the Grain
- Grain is the fundamental, atomic level of data to be represented.
- Grain is also termed the unit of analysis.
- Example grain statements
- Typical grains
  - Individual transactions
  - Daily aggregates (snapshots)
  - Monthly aggregates
- Relationship between grain and expressiveness.
- Grain vs. hardware trade-off.
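To make the typical grains above concrete, here is a minimal sketch that rolls the same data up from transaction grain to daily and monthly grain. The rows, product names and the `roll_up` helper are hypothetical and only illustrate how the number of stored values shrinks as the grain gets coarser.

```python
from collections import defaultdict

# Hypothetical transaction-grain rows: (date, product, amount in Rs.)
transactions = [
    ("2024-01-01", "tea", 120),
    ("2024-01-01", "tea", 80),
    ("2024-01-01", "rice", 300),
    ("2024-01-02", "tea", 100),
    ("2024-02-01", "rice", 250),
]

def roll_up(rows, key):
    """Sum the amounts of all rows that map to the same key."""
    totals = defaultdict(int)
    for date, product, amount in rows:
        totals[key(date, product)] += amount
    return dict(totals)

# Daily grain: one value per (date, product)
daily = roll_up(transactions, lambda d, p: (d, p))
# Monthly grain: one value per (year-month, product)
monthly = roll_up(transactions, lambda d, p: (d[:7], p))

# Number of stored values at each grain
print(len(transactions), len(daily), len(monthly))  # 5 4 3
```

Fewer values need less storage, which is one side of the grain vs. hardware trade-off; the other side is that the coarser tables can no longer answer questions about individual transactions.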
Step-2: Relationship b/w Grain
[Figure: granularity scale with the labels LOW Granularity and HIGH Granularity]
- Daily aggregates: 6 x 4 = 24 values
- Four aggregates per week: 4 x 4 = 16 values
- Two aggregates per week: 2 x 4 = 8 values
The case FOR data aggregation
- Works well for repetitive queries.
- Follows the known thought process.
- Justifiable if used for the maximum number of queries.
- Provides a “big picture” or macroscopic view.
- Application dependent, usually inflexible to business changes (remember the lack of absoluteness of conventions).
The case AGAINST data aggregation
- Aggregation is irreversible.
  - Monthly sales data can be created from weekly sales data, but the reverse is not possible.
- Aggregation limits the questions that can be answered.
  - What, when, why, where, what-else, what-next
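The irreversibility point can be illustrated with a short sketch. The two weekly series below are made-up numbers; the only claim is that different weekly patterns can produce the same monthly total, so the monthly figure alone cannot be turned back into weekly data.

```python
# Two hypothetical months with different weekly sales patterns
weekly_a = [100, 100, 100, 100]
weekly_b = [200, 100, 50, 50]

# Weekly -> monthly is a simple sum
monthly_a = sum(weekly_a)
monthly_b = sum(weekly_b)

# Both months collapse to the same monthly figure
print(monthly_a, monthly_b)  # 400 400

# The weekly patterns differ, so the monthly value cannot identify them
print(weekly_a == weekly_b)  # False
```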
The case AGAINST data aggregation (continued)
- Aggregation can hide crucial facts.
  - The average of 100 & 100 is the same as the average of 150 & 50.
Aggregation hides crucial facts: Example

         Week-1  Week-2  Week-3  Week-4  Average
Zone-1      100     100     100     100      100
Zone-2       50     100     150     100      100
Zone-3       50     100     100     150      100
Zone-4      200     100      50      50      100
Average     100     100     100     100

Just looking at the averages (i.e. the aggregates), all zones and all weeks look the same.
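The table above can be checked with a few lines of Python. The figures are the ones from the slide; the variable names are illustrative. The per-zone minimum and maximum are added here only to show the variation that the averages hide.

```python
# Weekly sales per zone, taken from the example table
sales = {
    "Zone-1": [100, 100, 100, 100],
    "Zone-2": [50, 100, 150, 100],
    "Zone-3": [50, 100, 100, 150],
    "Zone-4": [200, 100, 50, 50],
}

# Row averages (per zone) and column averages (per week)
zone_avg = {zone: sum(weeks) / len(weeks) for zone, weeks in sales.items()}
week_avg = [sum(col) / len(col) for col in zip(*sales.values())]

# Minimum and maximum per zone expose what the averages conceal
zone_range = {zone: (min(weeks), max(weeks)) for zone, weeks in sales.items()}

print(zone_avg)    # every zone averages 100.0
print(week_avg)    # every week averages 100.0
print(zone_range)  # e.g. Zone-4 ranges from 50 to 200
```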
Aggregation hides crucial facts: chart
[Chart: weekly sales for zones Z1 to Z4 over Week-1 to Week-4; vertical axis from 0 to 250]
- Z1: Sales are constant (need to work on it).
- Z2: Sales went up, then fell (cause for concern).
- Z3: Sales are on the rise; why?
- Z4: Sales dropped sharply; need to look deeper.
- W2: Static sales.
Step 3: Choose Facts statement
“We need monthly sales volume and Rs. by week, product and Zone”
- Facts: sales volume and Rs.
- Dimensions: week, product and Zone
Step 3: Choose Facts
- Choose the facts that will populate each fact table record.
- Remember that the best facts are numeric, continuously valued and additive.
- Example: Quantity Sold, Amount, etc.
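As a rough illustration of why additive facts are preferred, the sketch below sums two facts along different dimensions. The rows, the column order and the `total_by` helper are assumptions made for this example. Because quantity sold and amount are additive, every roll-up adds to the same grand total.

```python
from collections import defaultdict

# Hypothetical fact rows: (week, product, zone, quantity_sold, amount_rs)
fact_rows = [
    (1, "tea", "Zone-1", 10, 500),
    (1, "rice", "Zone-1", 4, 800),
    (1, "tea", "Zone-2", 6, 300),
    (2, "tea", "Zone-1", 8, 400),
]

def total_by(rows, dim_index):
    """Sum both facts for each value of the dimension at dim_index."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        totals[row[dim_index]][0] += row[3]  # quantity_sold
        totals[row[dim_index]][1] += row[4]  # amount_rs
    return {key: tuple(value) for key, value in totals.items()}

by_week = total_by(fact_rows, 0)
by_zone = total_by(fact_rows, 2)

print(by_week)  # {1: (20, 1600), 2: (8, 400)}
print(by_zone)  # {'Zone-1': (22, 1700), 'Zone-2': (6, 300)}

# Additivity: both roll-ups add up to the same grand total
grand_from_week = tuple(map(sum, zip(*by_week.values())))
grand_from_zone = tuple(map(sum, zip(*by_zone.values())))
print(grand_from_week == grand_from_zone)  # True
```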
Step 4: Choose Dimensions
- Choose the dimensions that apply to each fact in the fact table.
- Typical dimensions: time, product, geography, etc.
- Identify the descriptive attributes that explain each dimension.
- Determine hierarchies within each dimension.
Step-4: How to Identify a Dimension?
- The single-valued attributes during recording of a transaction are dimensions.

Fact Table:
- Calendar_Date (Dim)
- Time_of_Day (Dim)
- Account_No (Dim)
- ATM_Location (Dim)
- Transaction_Type (Dim)
- Transaction_Rs (fact)

Time_of_Day: Morning, Mid Morning, Lunch Break, etc.
Transaction_Type: Withdrawal, Deposit, Check balance, etc.
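The ATM example can be written down as a record type. This is only a sketch: the class name, field types and sample values are assumptions, while the attribute names follow the slide. Each dimension attribute holds exactly one value at the moment the transaction is recorded, and the numeric amount is the fact.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class AtmTransaction:
    # Dimensions: single-valued when the transaction is recorded
    calendar_date: date
    time_of_day: str        # e.g. "Morning", "Mid Morning", "Lunch Break"
    account_no: str
    atm_location: str
    transaction_type: str   # e.g. "Withdrawal", "Deposit", "Check balance"
    # Fact: the numeric measurement
    transaction_rs: float

# Hypothetical sample row
row = AtmTransaction(
    calendar_date=date(2024, 1, 15),
    time_of_day="Morning",
    account_no="ACC-001",
    atm_location="Islamabad",
    transaction_type="Withdrawal",
    transaction_rs=5000.0,
)
print(row.transaction_type, row.transaction_rs)  # Withdrawal 5000.0
```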
Step-4: Can Dimensions be Multi-valued?
- Are dimensions ALWAYS single-valued?
  - Not really.
  - What are the problems? And how to handle them?
- Example:
  - Calendar_Date (of inspection)
  - Reg_No
  - Technician
  - Workshop
  - Maintenance_Operation
- How many maintenance operations are possible?
  - Few
  - Maybe more for old cars.
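The slide raises the question of handling a multi-valued dimension without giving an answer. One possible approach, shown here purely as an assumption and not as the lecture's method, is to store one row per maintenance operation so that every attribute is single-valued again. The record contents and the `explode` helper are hypothetical.

```python
# Hypothetical inspection: one visit with several maintenance operations
inspection = {
    "calendar_date": "2024-03-10",
    "reg_no": "ABC-123",
    "technician": "T-07",
    "workshop": "W-2",
    "maintenance_operations": ["Oil change", "Brake check", "Tyre rotation"],
}

def explode(record, multi_key, single_key):
    """Return one row per value of the multi-valued attribute."""
    base = {k: v for k, v in record.items() if k != multi_key}
    return [{**base, single_key: value} for value in record[multi_key]]

rows = explode(inspection, "maintenance_operations", "maintenance_operation")

# Three rows, each with exactly one Maintenance_Operation
for r in rows:
    print(r["reg_no"], r["maintenance_operation"])
```

The cost of this choice is a finer grain (one row per operation instead of one per inspection), so any per-inspection measure would have to be handled carefully to avoid double counting.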
Step-4: Dimensions & Grain
- Several grains are possible as per business requirement.
- For some aggregations, certain descriptions do not remain atomic.
  - Example: Time_of_Day may change several times during a daily aggregate, but not during a transaction.
- Choose the dimensions that are applicable within the selected grain.
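The Time_of_Day example above can be turned into a simple check. The sketch below, with made-up rows and a hypothetical helper `single_valued_within`, tests whether an attribute stays single-valued inside every group of a chosen grain. An attribute that fails the check is not applicable as a dimension at that grain.

```python
from collections import defaultdict

# Hypothetical transaction-grain rows
transactions = [
    {"calendar_date": "2024-01-15", "time_of_day": "Morning",
     "atm_location": "Islamabad", "transaction_rs": 5000},
    {"calendar_date": "2024-01-15", "time_of_day": "Lunch Break",
     "atm_location": "Islamabad", "transaction_rs": 2000},
    {"calendar_date": "2024-01-16", "time_of_day": "Morning",
     "atm_location": "Islamabad", "transaction_rs": 1000},
]

def single_valued_within(rows, grain_keys, attribute):
    """True if attribute takes exactly one value in every grain group."""
    seen = defaultdict(set)
    for row in rows:
        group = tuple(row[k] for k in grain_keys)
        seen[group].add(row[attribute])
    return all(len(values) == 1 for values in seen.values())

daily_grain = ["calendar_date", "atm_location"]

# Time_of_Day changes within a day, so it does not fit the daily grain
print(single_valued_within(transactions, daily_grain, "time_of_day"))   # False
# ATM_Location is part of the grain, so it stays single-valued
print(single_valued_within(transactions, daily_grain, "atm_location"))  # True
```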
