Multidimensional data representation and manipulation
Data Cube Concepts
Business Analyst Perspective
Outcome variable (quantitative variable): Employee Turnover
Factors (qualitative variables):
• Management
– Retention focus
– Flexibility options
• Location
– City size
– Transportation options to work
• Marketplace
– Demand level
– Salary escalation
• Compensation
– Health care
– Pension
– Base salary levels
Data Cube Basics
• Business analyst model
– Factors or influencing variables of interest
– Quantitative variables
– Multidimensional arrangement
• Terminology
– Dimension: subject label for a row or column
  A cube can have more than 2 or 3 dimensions
  Example dimensions: city size, type of health plan offered
– Member: value of a dimension
– Measure: quantitative variable (data) stored in the cells
  A cell can hold more than one measure
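The terminology above can be sketched in a few lines of Python. This is only an illustration, not how a cube engine stores data: the dimension names, members, and measure values are all hypothetical.

```python
# Hypothetical sales cube: dimension names and members are illustrative only.
dimensions = ("product", "region", "quarter")

# Each cell is addressed by one member per dimension and can hold several measures.
cube = {
    ("Laptop", "East", "Q1"): {"unit_sales": 10, "revenue": 9000.0},
    ("Laptop", "West", "Q1"): {"unit_sales": 4, "revenue": 3600.0},
    ("Phone", "East", "Q2"): {"unit_sales": 25, "revenue": 12500.0},
}


def members(cube, dimensions, name):
    """Return the sorted members that appear for one dimension."""
    position = dimensions.index(name)
    return sorted({key[position] for key in cube})


print(members(cube, dimensions, "region"))
print(cube[("Laptop", "East", "Q1")]["revenue"])
```

Keying the dictionary by a tuple of members keeps the "one member per dimension" rule visible in the data structure itself.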
Sales Data Cube Example
Notes on Dimensions and Measures
• Hierarchies:
  A member can have sub-members (more detail)
  Location: country, region, state, zip code
• Sparsity:
  Many cells are typically empty when dimensions are related
  For example, not all products may be sold in all regions
  Major problem in storing data cubes: compressing the unused space
• Measures:
  Derived measures are computed from other measures
  Common examples: unit price * unit volume; sales per transaction
  The data cube engine must compute them efficiently
  A cell can hold multiple measures
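A minimal sketch of the three notes above, under stated assumptions: the products, states, prices, and the state-to-region mapping are hypothetical, and the derived measure is assumed to be a sales amount computed as unit price times unit volume.

```python
from itertools import product

# Hypothetical members; only combinations that actually sold are stored (sparse).
products = ["Laptop", "Phone", "Tablet"]
states = ["CO", "UT", "NY"]
cells = {
    ("Laptop", "CO"): {"unit_price": 900.0, "unit_volume": 10},
    ("Phone", "CO"): {"unit_price": 500.0, "unit_volume": 6},
    ("Phone", "NY"): {"unit_price": 520.0, "unit_volume": 5},
}

# Sparsity: share of the possible cells that are empty.
possible = len(list(product(products, states)))
sparsity = 1 - len(cells) / possible

# Hierarchy: each state member rolls up to a region member.
state_to_region = {"CO": "West", "UT": "West", "NY": "East"}


def sales(cell):
    """Derived measure computed on demand from the stored measures."""
    return cell["unit_price"] * cell["unit_volume"]


# Roll the derived measure up one level of the location hierarchy.
sales_by_region = {}
for (_, state), cell in cells.items():
    region = state_to_region[state]
    sales_by_region[region] = sales_by_region.get(region, 0.0) + sales(cell)
```

Storing only the non-empty cells is one simple way to avoid paying for unused space; real engines use more elaborate compression.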
Measure Aggregation Properties
• “Aggregate Property” indicates allowable summary operations for measures
• Additive
– Summarized by addition across all dimensions such as sales, cost, profit
– Sales can be summed across product, time, customer, …
• Semi-Additive
– Summarized by addition across some dimensions but not all; time is the usual exception
– Periodic measurements such as account balances and inventory levels
– Account balances can be summed across the customer and branch dimensions
– Account balances cannot be summed across time because a balance is a point-in-time measurement
• Non-Additive
– Cannot be summarized by addition across any dimension
– Historical facts such as unit price for a sale
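The semi-additive case can be illustrated with a small sketch. The branch names, months, and balances are hypothetical; using the closing value or the average across time are common substitutes for a sum, not the only options.

```python
# Hypothetical month-end balances per (branch, month).
balances = {
    ("Downtown", "Jan"): 100.0,
    ("Downtown", "Feb"): 120.0,
    ("Airport", "Jan"): 50.0,
    ("Airport", "Feb"): 70.0,
}
months = ["Jan", "Feb"]  # ordered oldest to newest


def total_balance(month):
    """Valid: summing balances across branches for a single point in time."""
    return sum(v for (_, m), v in balances.items() if m == month)


def closing_balance(branch):
    """Across time, take the last period's value instead of summing."""
    return balances[(branch, months[-1])]


def average_balance(branch):
    """Another time aggregation that stays meaningful: the average."""
    return sum(balances[(branch, m)] for m in months) / len(months)
```

Summing the Downtown balance over January and February would give 220.0, a figure that never existed in the account at any moment, which is why the measure is not additive across time.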
Measure Aggregation Example
• Dimensions
– Course: course id, degree, department, and college
– Student: student id, major, department, and college
– Time: semester, academic year, academic decade
• Measures:
– Credit hours
– Grade
– Unit tuition (cost per credit hour)
– Tuition (unit tuition * credit hours)
• Aggregation properties for measures:
– Credit hours: additive across all dimensions
– Grade: non-additive, but can be averaged (for example, grade point average)
– Unit tuition: non-additive; can be averaged, but the average is probably not useful
– Tuition: additive
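The aggregation properties of these measures can be checked on a few hypothetical enrollment facts. The student and course ids, grades, and tuition rates are invented, and weighting the grade average by credit hours is an assumption modeled on how a grade point average is usually computed.

```python
# Hypothetical enrollment facts, one per (student, course, semester).
enrollments = [
    {"student": "S1", "course": "IS101", "credit_hours": 3, "grade": 4.0, "unit_tuition": 300.0},
    {"student": "S1", "course": "IS205", "credit_hours": 4, "grade": 3.0, "unit_tuition": 350.0},
    {"student": "S2", "course": "IS101", "credit_hours": 3, "grade": 2.0, "unit_tuition": 300.0},
]

# Additive: credit hours can simply be summed.
total_credit_hours = sum(e["credit_hours"] for e in enrollments)

# Additive derived measure: tuition = unit tuition * credit hours.
total_tuition = sum(e["unit_tuition"] * e["credit_hours"] for e in enrollments)

# Non-additive: grades are averaged, here weighted by credit hours (GPA style).
gpa = sum(e["grade"] * e["credit_hours"] for e in enrollments) / total_credit_hours
```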
Sales Data Cube Example
Microsoft Multidimensional Expressions (MDX) Language History
• 1997: Microsoft OLAP specification
• 1998: Microsoft OLAP Services release
• 2001: mdXML specification by the XMLA Council
• 2005: Microsoft MDX revision
• De facto standard developed by Microsoft and later by the XMLA (XML for Analysis) Council, a web standards group
MDX Usage
• Foundation for Microsoft products and open source analytics software
• SQL Server Analysis Services and Excel Pivot Tables
• Hyperion, IBM, SAP (Systems Applications and Products), and other vendors
• Foundation for open source projects: JPivot, Pivot4J, and Pentaho Business Analytics Platform
Example MDX Cube Structure
Attribute Hierarchy and Members
[Figure: Product attribute hierarchy with the levels Line, Vendor, and Product; the values at each level are labeled as Line, Vendor, and Product attribute members]
Steel Wheels Cube Display
MDX Terminology Notes
• Tuple
– Cell identifier
– One member from each dimension
• Axis: dimension selected in a query (source cube cells)
• Slicer: combination of dimension members (result cube cells)
• Querying Multidimensional Data with MDX
– https://docs.microsoft.com/en-us/analysis-services/multidimensional-models/mdx/querying-multidimensional-data-with-mdx?view=sql-analysis-services-2022
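The tuple and slicer terms above can be mimicked in plain Python. This is only an analogy, not MDX: the cube contents, the dimension order, and the helper name `slice_cube` are hypothetical.

```python
# Hypothetical cube with three dimensions, in this fixed order.
DIMENSIONS = ("product", "region", "year")
cube = {
    ("Cars", "EMEA", 2004): 120.0,
    ("Cars", "APAC", 2004): 80.0,
    ("Planes", "EMEA", 2004): 60.0,
    ("Cars", "EMEA", 2005): 150.0,
}

# A tuple names one member from each dimension, so it identifies one cell.
cell_tuple = ("Cars", "EMEA", 2004)
value = cube[cell_tuple]


def slice_cube(cube, **slicer):
    """Keep only the cells whose members match the slicer, e.g. year=2004."""
    positions = {DIMENSIONS.index(name): member for name, member in slicer.items()}
    return {
        key: measure
        for key, measure in cube.items()
        if all(key[pos] == member for pos, member in positions.items())
    }


sliced = slice_cube(cube, year=2004)
```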
SQL Versus MDX
• Table result for SQL SELECT statement
• Data cube result for MDX SELECT statement
• Different mathematical approaches for manipulating tables (e.g. relational algebra) and data cubes (e.g. matrix algebra)
Comparison of Clauses
Clause   SQL                           MDX
SELECT   List of columns               List of axis dimensions (source cube cells)
FROM     List of tables                Cube name
WHERE    Conditions restricting rows   Restriction to a combination of dimension members (result cube cells)
MDX query example:
https://dba.stackexchange.com/questions/138311/good-example-of-mdx-vs-sql-for-analytical-queries
Example MDX Statement and Result
- Dimensions in the WHERE clause must be different from those in the SELECT clause
- The WHERE condition is known as a slicer condition
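The two rules above can be sketched as a Python function that loosely mirrors the shape of an MDX SELECT with a column axis, a row axis, and a slicer. It is a hypothetical illustration rather than an MDX engine: the function name, the cube data, and the requirement that the axes and slicer together name every dimension are all assumptions of this sketch.

```python
# Hypothetical cube with three dimensions, in this fixed order.
DIMENSIONS = ("product", "region", "year")
cube = {
    ("Cars", "EMEA", 2004): 120.0,
    ("Cars", "APAC", 2004): 80.0,
    ("Planes", "EMEA", 2004): 60.0,
    ("Cars", "EMEA", 2005): 150.0,
}


def mdx_like_select(cube, columns, rows, where):
    """Build a rows-by-columns grid of cells restricted by a slicer.

    columns and rows are (dimension, members) pairs; where maps a
    dimension to one member. Axes plus slicer must cover every dimension.
    """
    col_dim, col_members = columns
    row_dim, row_members = rows
    if set(where) & {col_dim, row_dim}:
        raise ValueError("slicer dimensions must differ from the axis dimensions")
    grid = {}
    for r in row_members:
        for c in col_members:
            coords = {col_dim: c, row_dim: r, **where}
            key = tuple(coords[d] for d in DIMENSIONS)
            grid[(r, c)] = cube.get(key)  # None marks an empty cell
    return grid


result = mdx_like_select(
    cube,
    columns=("region", ["EMEA", "APAC"]),
    rows=("product", ["Cars", "Planes"]),
    where={"year": 2004},
)
```

The result is itself a small cube (a grid keyed by row and column members), which is the contrast with the flat table a SQL SELECT returns.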
CrossJoin Operation
• Combines multiple dimensions or measures on a single axis
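The idea behind a cross join can be shown with Python's `itertools.product`, which pairs every member of one set with every member of another. The member sets are hypothetical, and this illustrates the concept only, not the MDX function itself.

```python
from itertools import product

# Hypothetical member sets for two dimensions.
years = [2004, 2005]
regions = ["EMEA", "APAC"]

# A cross join pairs every member of one set with every member of the other,
# so both dimensions can share a single axis.
row_axis = list(product(years, regions))

for year, region in row_axis:
    print(year, region)
```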
Slicer Comparison Examples
Pivot Table
• Powerful interface for data cubes
• Convenient rearrangement of row and column headings
• Expand or collapse dimensions
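Rearranging and collapsing headings can be sketched on a small two-dimensional grid. The product lines, years, and values are hypothetical, and the collapse step assumes the measure is additive.

```python
# Hypothetical cells keyed by (row heading, column heading).
grid = {
    ("Cars", 2004): 120.0,
    ("Cars", 2005): 150.0,
    ("Planes", 2004): 60.0,
}


def pivot(grid):
    """Swap the row and column headings of a two-dimensional grid."""
    return {(col, row): value for (row, col), value in grid.items()}


def collapse_columns(grid):
    """Collapse the column dimension by summing each row (additive measure assumed)."""
    totals = {}
    for (row, _), value in grid.items():
        totals[row] = totals.get(row, 0.0) + value
    return totals


by_year = pivot(grid)
by_product = collapse_columns(grid)
```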
Pivot4J
• Allows a cube representation similar to a pivot table in Microsoft Excel
• Works with Pentaho Business Analytics
• Separate add-on
• Graphical implementation of the MDX language
Pivot4J Interface
Pivot Table with MDX Statement
Pivot Table with CrossJoin
