DIMENSIONAL MODELING
Structuring Data for Better Reporting and Analysis
Sajjad Zaheer
21 Aug 2014, Folio3
@folio_3 www.folio3.com Copyright 2015
1. Getting into the Context
Online Transaction Processing
• Core database
• Usually ER model
• For transactions and routine tasks
Metadata
Data about data, i.e. information about the data tables in the OLTP system.
ETL (Extract, Transform, Load)
• Extract from the source (OLTP)
• Transform according to requirements
• Load into the data warehouse
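The three steps above can be sketched in a few lines. This is a minimal illustration, not the author's pipeline: the `user_actions` source table, the `monthly_points` target table and their columns are hypothetical, and two in-memory SQLite databases stand in for the OLTP system and the data warehouse.

```python
import sqlite3


def extract(oltp):
    """Extract: read raw rows from the (hypothetical) OLTP table."""
    return oltp.execute(
        "SELECT user_id, points, created_at FROM user_actions"
    ).fetchall()


def transform(rows):
    """Transform: total the points per (user, month)."""
    totals = {}
    for user_id, points, created_at in rows:
        month = created_at[:7]  # 'YYYY-MM' prefix of an ISO date string
        key = (user_id, month)
        totals[key] = totals.get(key, 0) + points
    return [(user, month, pts) for (user, month), pts in sorted(totals.items())]


def load(dw, rows):
    """Load: write the transformed rows into the warehouse table."""
    dw.executemany(
        "INSERT INTO monthly_points (user_id, month, points) VALUES (?, ?, ?)",
        rows,
    )
    dw.commit()


# Stand-in for the OLTP source (assumed layout).
oltp = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE user_actions (user_id INTEGER, points INTEGER, created_at TEXT)"
)
oltp.executemany(
    "INSERT INTO user_actions VALUES (?, ?, ?)",
    [(1, 5, "2014-03-02"), (1, 3, "2014-03-15"), (2, 4, "2014-04-01")],
)

# Stand-in for the data warehouse (assumed layout).
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE monthly_points (user_id INTEGER, month TEXT, points INTEGER)")

load(dw, transform(extract(oltp)))
result = dw.execute(
    "SELECT user_id, month, points FROM monthly_points ORDER BY user_id, month"
).fetchall()
print(result)
```

Keeping the three steps as separate functions makes it easy to swap the source or the target without touching the transformation logic.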
Data Warehouse
• For effective querying, analysis and decision-making
• OLAP (Online Analytical Processing) design
• Subject-oriented, integrated, time-variant, non-volatile collection of data
Data Mart
• Access layer of the data warehouse
• Subset of the data warehouse
• Oriented to a specific business unit or department, e.g. marketing
• Not a separate physical entity
OLAP (Online Analytical Processing)
Analyzes multidimensional data interactively from multiple perspectives.
Data Mining
• Computational process of discovering patterns in large data sets.
• Extracts information and transforms it into an understandable structure for further use.
Data Visualization
Creation and study of the visual representation of data, e.g. scatter plot, bar chart.
Reporting
Retrieving and presenting a subset of data for a particular purpose.
[Diagram: Data, Information, Knowledge. Dimensional Modeling (OLTP to OLAP Structure)]
Dimensional Modeling
Terminology
Dimensions
• The time-independent, textual and descriptive attributes by which users describe objects.
• Who, where, what, how, when.
• Angles (dimensions) from which data can be viewed.
• E.g. product category, date-time of a transaction.
Facts
• Quantified business measurements, e.g. quantity, amount, cost, taxes.
• Things that can be summed or aggregated, e.g. sales of a product.
• Built from the lowest level of detail (grain).
• The data under consideration.
• Time-dependent.
Dimensional Modeling Process
• Sub-setting
• De-normalization
  • i.e. collapsing hierarchies of dimensions by de-normalizing to 2NF
• Summarization
  • i.e. summation of facts
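A small sketch of these three steps, under stated assumptions: the `category`, `product` and `sale` tables are hypothetical OLTP tables (the deck does not define them), chosen to match its "product category" example. Sub-setting picks only these tables, de-normalization folds the product-to-category hierarchy into one dimension table, and summarization sums the sales per product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Sub-setting: only the tables relevant to sales reporting (hypothetical).
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name TEXT,
        category_id INTEGER REFERENCES category(category_id)
    );
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, product_id INTEGER, amount REAL);

    INSERT INTO category VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO product VALUES (10, 'Atlas', 1), (11, 'Chess', 2);
    INSERT INTO sale VALUES (1, 10, 20.0), (2, 10, 5.0), (3, 11, 12.0);

    -- De-normalization: collapse the product -> category hierarchy
    -- into a single dimension table.
    CREATE TABLE dim_product AS
    SELECT p.product_id, p.name AS product_name, c.name AS category_name
    FROM product p
    JOIN category c ON c.category_id = p.category_id;

    -- Summarization: sum the facts at the chosen level of detail.
    CREATE TABLE fact_sales AS
    SELECT product_id, SUM(amount) AS total_amount
    FROM sale
    GROUP BY product_id;
    """
)

rows = conn.execute(
    """
    SELECT d.category_name, f.total_amount
    FROM fact_sales f
    JOIN dim_product d ON d.product_id = f.product_id
    ORDER BY d.category_name
    """
).fetchall()
print(rows)
```

After de-normalization a report needs a single join to reach the category name, instead of walking the product and category tables separately.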
Modeling Design Steps
1. Identify the Business Process
  • Source of “measurements”
2. Identify the Grain
  • What does one row in the fact table represent or mean?
3. Identify the Dimensions
  • Descriptive context, true to the grain
4. Identify the Facts
  • Numeric additive measurements, true to the grain
Design Steps - Example
Case Study: User Points System
• Consider a system described simply as follows:
  • It has users and groups of users.
  • Every user can perform certain actions such as message, comment, meeting, etc.
  • For every action the user gets some points, which are also added to the points of the user groups the user belongs to.
  • The system also has many other features that are not relevant to points.
• Let’s assume the system has over 100 tables to store various things.
Step 1: Identify the Business Process
• Question 1: Do we apply dimensional modeling to all 100 tables in the system?
• Answer: No.
• Question 2: So which tables should be selected?
• Answer: The tables that are relevant to the business requirements.
Business Requirements
• Two types of points are required for reporting:
  1. Points per month
  2. Average lifetime points at the end of each month
• For:
  1. Individual users
  2. User groups
  3. Individual users per action
  4. User groups per action
Step 2: Identify the Grain
Analyzing the business requirements, the following grains are identified:
1. Points per individual per month
2. Points per user group per month
3. Points per user per action per month
4. Points per user group per action per month
5. Average lifetime points per individual per month
6. Average lifetime points per user group per month
7. Average lifetime points per user per action per month
8. Average lifetime points per user group per action per month
“Grain = What does one row in the fact table represent?”
Step 3: Identify the Dimensions
Simply put, whatever follows ‘per’ in a grain is a dimension. The dimensions are found to be:
1. Date (granularity: month)
2. Users
3. User groups
4. Actions
“Dimension: descriptive context, true to the grain”
Step 4: Identify the Facts
Four facts are identified:
1. User Points
2. User Lifetime Average Points
3. User Group Points
4. User Group Lifetime Average Points
“Facts: numeric additive measures, true to the grain”
Tables Schema
Once the grain, facts and dimensions are identified, the table schema is formed from them.
Please note:
• It is not necessary to keep every fact in a different table; several facts can be part of a single table.
• Alternatively, a single fact can appear in multiple fact tables, depending on its relationship with the dimensions.
• Every dimension is kept in its own table, and each dimension can be connected to many fact tables.
Tables Schema
• The table schema should be a translation of the grains defined in Step 2.
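As a sketch of what such a translation might look like, the snippet below creates one star schema in SQLite. Table names and key columns (`dim_user_id`, `dim_date_id`, `dim_action_id`, `username`, `action_name`, `month`, `year`) follow the example query in this deck; the measure columns `points` and `lifetime_avg_points`, the constraints and the sample rows are assumptions. The composite primary key encodes the grain: one row per user, per action, per month.

```python
import sqlite3

SCHEMA = """
CREATE TABLE dim_date (
    dim_date_id INTEGER PRIMARY KEY,
    month       INTEGER NOT NULL,
    year        INTEGER NOT NULL,
    UNIQUE (year, month)              -- month-level granularity
);
CREATE TABLE dim_user (
    dim_user_id INTEGER PRIMARY KEY,
    username    TEXT NOT NULL
);
CREATE TABLE dim_action (
    dim_action_id INTEGER PRIMARY KEY,
    action_name   TEXT NOT NULL
);
CREATE TABLE fact_points_user_action (
    dim_user_id   INTEGER NOT NULL REFERENCES dim_user(dim_user_id),
    dim_date_id   INTEGER NOT NULL REFERENCES dim_date(dim_date_id),
    dim_action_id INTEGER NOT NULL REFERENCES dim_action(dim_action_id),
    points              INTEGER NOT NULL,  -- assumed measure name
    lifetime_avg_points REAL,              -- assumed measure name
    -- The grain: exactly one row per user, per action, per month.
    PRIMARY KEY (dim_user_id, dim_date_id, dim_action_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(SCHEMA)

conn.execute("INSERT INTO dim_user VALUES (1, 'alice')")
conn.execute("INSERT INTO dim_action VALUES (1, 'comment')")
conn.execute("INSERT INTO dim_date VALUES (1, 3, 2014)")
conn.execute("INSERT INTO fact_points_user_action VALUES (1, 1, 1, 10, 10.0)")

# A second row at the same grain violates the primary key.
try:
    conn.execute("INSERT INTO fact_points_user_action VALUES (1, 1, 1, 99, 99.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables, duplicate_rejected)
```

Both measures share one fact table here because they have the same grain, which matches the note that several facts can live in a single table.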
Star Schema – fact_points_user
Grains covered:
1. Points per individual per month
2. Average lifetime points per individual per month
Star Schema – fact_points_user_action
Grains covered:
1. Points per individual per action per month
2. Average lifetime points per individual per action per month
Star Schema – fact_points_group
Grains covered:
1. Points per user group per month
2. Average lifetime points per user group per month
Star Schema for User Points Grains
Grains covered:
1. Points per user group per action per month
2. Average lifetime points per user group per action per month
Example Query
SELECT fp.*, du.username, da.action_name
FROM fact_points_user_action fp
JOIN dim_user du ON fp.dim_user_id = du.dim_user_id
JOIN dim_date dd ON fp.dim_date_id = dd.dim_date_id
JOIN dim_action da ON fp.dim_action_id = da.dim_action_id
WHERE dd.month = 3 AND dd.year = 2014;
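To try the query end to end, the sketch below runs it against a tiny in-memory SQLite database. The joins and filter are the ones from the slide; the table definitions, the `points` measure column and the sample rows are assumptions made for illustration, and the select list names its columns explicitly instead of using `fp.*`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_user   (dim_user_id   INTEGER PRIMARY KEY, username    TEXT);
    CREATE TABLE dim_action (dim_action_id INTEGER PRIMARY KEY, action_name TEXT);
    CREATE TABLE dim_date   (dim_date_id   INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
    CREATE TABLE fact_points_user_action (
        dim_user_id   INTEGER,
        dim_date_id   INTEGER,
        dim_action_id INTEGER,
        points        INTEGER  -- assumed measure column
    );

    -- Sample data (invented for this example).
    INSERT INTO dim_user   VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO dim_action VALUES (1, 'comment'), (2, 'message');
    INSERT INTO dim_date   VALUES (1, 3, 2014), (2, 4, 2014);
    INSERT INTO fact_points_user_action VALUES
        (1, 1, 1, 10),   -- alice, March 2014, comment
        (2, 1, 2, 7),    -- bob,   March 2014, message
        (1, 2, 1, 4);    -- alice, April 2014, comment
    """
)

QUERY = """
SELECT du.username, da.action_name, fp.points
FROM fact_points_user_action fp
JOIN dim_user du ON fp.dim_user_id = du.dim_user_id
JOIN dim_date dd ON fp.dim_date_id = dd.dim_date_id
JOIN dim_action da ON fp.dim_action_id = da.dim_action_id
WHERE dd.month = ? AND dd.year = ?
ORDER BY du.username, da.action_name
"""

# Month and year are bound as parameters rather than written into the SQL.
march_rows = conn.execute(QUERY, (3, 2014)).fetchall()
print(march_rows)
```

Every join goes from the fact table directly to a dimension table, which is the defining shape of a star schema.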
Data Transformation: OLTP to OLAP
Data Transformation
• Once the OLAP schema has been designed, data has to be moved from the ER-modeled (OLTP) database to the new OLAP database.
• This can be done with dedicated scripts or cron jobs.
• A simple example for the case above is a cron job that runs at the end of every month and moves the relevant data from the OLTP database to the OLAP database, after any required calculations.
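A month-end job of this kind might look like the following sketch. It is an illustration under several assumptions: the OLTP table `user_actions` and its columns are hypothetical, both sets of tables live in one SQLite database for brevity, OLTP user ids are reused as `dim_user_id`, and only the monthly points fact is loaded (the lifetime average is left out). A scheduler such as cron would call `load_month` once per month.

```python
import sqlite3


def load_month(conn, year, month):
    """Aggregate one month of OLTP activity into fact_points_user."""
    prefix = f"{year:04d}-{month:02d}"  # matches ISO dates such as '2014-03-15'
    with conn:  # commits on success, rolls back if an exception is raised
        # Find or create the month's row in the date dimension.
        row = conn.execute(
            "SELECT dim_date_id FROM dim_date WHERE year = ? AND month = ?",
            (year, month),
        ).fetchone()
        if row is None:
            cursor = conn.execute(
                "INSERT INTO dim_date (month, year) VALUES (?, ?)", (month, year)
            )
            date_id = cursor.lastrowid
        else:
            date_id = row[0]

        # Remove any earlier load of this month so the job can be rerun safely.
        conn.execute("DELETE FROM fact_points_user WHERE dim_date_id = ?", (date_id,))

        # Summarize the month's activity: one row per user.
        conn.execute(
            """
            INSERT INTO fact_points_user (dim_user_id, dim_date_id, points)
            SELECT user_id, ?, SUM(points)
            FROM user_actions
            WHERE substr(created_at, 1, 7) = ?
            GROUP BY user_id
            """,
            (date_id, prefix),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical OLTP table.
    CREATE TABLE user_actions (user_id INTEGER, points INTEGER, created_at TEXT);
    -- OLAP tables (measure column name is an assumption).
    CREATE TABLE dim_date (dim_date_id INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
    CREATE TABLE fact_points_user (dim_user_id INTEGER, dim_date_id INTEGER, points INTEGER);

    INSERT INTO user_actions VALUES
        (1, 5, '2014-03-02'), (1, 3, '2014-03-15'),
        (2, 4, '2014-03-20'), (1, 9, '2014-04-01');
    """
)

load_month(conn, 2014, 3)
load_month(conn, 2014, 3)  # rerunning does not duplicate rows

facts = conn.execute(
    "SELECT dim_user_id, dim_date_id, points FROM fact_points_user ORDER BY dim_user_id"
).fetchall()
print(facts)
```

Deleting the month's rows before inserting keeps the job idempotent, so a failed or repeated run does not double-count points.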
Conclusion
• Dimensional modeling keeps data in a form that is relevant and quickly accessible for reporting and analysis.
