Building The Data Warehouse
by Inmon


 Chapter 4: Granularity in the Data Warehouse




                         http://it-slideshares.blogspot.com/
4.0 Introduction - Granularity in the Data Warehouse



  Determining the proper level of granularity
   of the data that will reside in the data
   warehouse is a central design issue.
  Granularity is important to the warehouse
   architect because it affects all the
   environments that depend on the warehouse
   for data.
4.1 Raw Estimates
  The raw estimate of the number of rows of data that will reside
  in the data warehouse tells the architect a great deal.
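
  A raw estimate of this kind can be sketched in a few lines of Python. This is only an illustration: the table names, row widths, and low/high row counts below are hypothetical placeholders, not figures from the book, and the "1y" and "5y" keys simply stand for the one-year and five-year horizons.

```python
from dataclasses import dataclass


@dataclass
class TableEstimate:
    """Hypothetical sizing inputs for one warehouse table."""
    name: str
    row_bytes: int   # assumed average bytes per row
    rows_low: dict   # horizon -> low guess at the row count
    rows_high: dict  # horizon -> high guess at the row count


def raw_estimate(tables, horizon):
    """Sum the low/high row counts and byte counts for one horizon."""
    rows_low = sum(t.rows_low[horizon] for t in tables)
    rows_high = sum(t.rows_high[horizon] for t in tables)
    bytes_low = sum(t.rows_low[horizon] * t.row_bytes for t in tables)
    bytes_high = sum(t.rows_high[horizon] * t.row_bytes for t in tables)
    return {"rows": (rows_low, rows_high), "bytes": (bytes_low, bytes_high)}


# Placeholder numbers, chosen only to make the arithmetic visible.
tables = [
    TableEstimate("account_activity", 200,
                  {"1y": 1_000_000, "5y": 8_000_000},
                  {"1y": 2_000_000, "5y": 15_000_000}),
    TableEstimate("customer", 500,
                  {"1y": 50_000, "5y": 80_000},
                  {"1y": 60_000, "5y": 120_000}),
]

for horizon in ("1y", "5y"):
    print(horizon, raw_estimate(tables, horizon))
```

  An order-of-magnitude range per horizon is the goal here, so rough low and high guesses are enough.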
4.2 Input to the Planning Process
  The estimate of rows and DASD (direct access storage device)
  space then serves as input to the planning process.
4.3 Data in Overflow


Compare the total number of rows in the warehouse environment:
4.3 Data in Overflow (ct)
  By the five-year horizon, the following factors are
   expected to be in place:
     There will be more expertise available in
      managing the data warehouse volumes of data.
     Hardware costs will have dropped to some
      extent.
     More powerful software tools will be available.
     The end user will be more sophisticated.
4.3.1 Overflow Storage
4.4 What the Levels of Granularity Will Be
4.5 Some Feedback Loop Techniques
   Following are techniques to make the feedback
     loop harmonious:
   Build the first parts of the data warehouse in
     very small, very fast steps, and carefully listen to
     the end users’ comments at the end of each
     step of development. Be prepared to make
     adjustments quickly.
   If available, use prototyping and allow the
     feedback loop to function using observations
     gleaned from the prototype.
4.5 Some Feedback Loop Techniques (ct)

   Look at how other people have built their levels of
    granularity and learn from their experience.
   Go through the feedback process with an experienced user
    who is aware of the process occurring. Under no
    circumstances should you keep your users in the dark as to
    the dynamics of the feedback loop.
   Look at whatever the organization has now that appears to
    be working, and use those functional requirements as a
    guideline.
   Execute joint application design (JAD) sessions and simulate
    the output to achieve the desired feedback.
4.5 Some Feedback Loop Techniques (ct)

  Granularity of data can be raised in many ways, such as the
    following:
   Summarize data from the source as it goes into the target.
   Average or otherwise calculate data as it goes into the
    target.
   Push highest and/or lowest set values into the target.
   Push only data that is obviously needed into the target.
   Use conditional logic to select only a subset of records to
    go into the target.
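
  The techniques listed above can be sketched together in Python. This is a minimal illustration under stated assumptions: the record layout (account, month, amount), the sample values, and the `min_abs_amount` filter are all hypothetical and are not taken from the book.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical detailed source records: (account, month, amount).
source = [
    ("A1", "2024-01", 100.0),
    ("A1", "2024-01", -40.0),
    ("A1", "2024-02", 25.0),
    ("B7", "2024-01", 500.0),
    ("B7", "2024-01", 300.0),
]


def raise_granularity(records, min_abs_amount=0.0):
    """Roll detailed records up to one target row per account and month."""
    groups = defaultdict(list)
    for account, month, amount in records:
        # Conditional logic: only a subset of records reaches the target.
        if abs(amount) >= min_abs_amount:
            groups[(account, month)].append(amount)

    target = {}
    for key, amounts in groups.items():
        target[key] = {
            "count": len(amounts),
            "total": sum(amounts),     # summarize
            "average": mean(amounts),  # average or otherwise calculate
            "high": max(amounts),      # push the highest value
            "low": min(amounts),       # push the lowest value
        }
    return target


for key, row in raise_granularity(source).items():
    print(key, row)
```

  Five source rows become three target rows. Calling `raise_granularity(source, min_abs_amount=50.0)` drops the small transactions before summarizing, which leaves only two target rows.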
4.6 Levels of Granularity—Banking Environment
4.7 Feeding the Data Marts



  Determine the level of granularity the data
 marts will need.

  The data that resides in the data warehouse
 must be at the lowest level of granularity
 needed by any of the data marts.
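
  The rule above can be expressed as a small check. The sketch below assumes a hypothetical ordering of grain levels and hypothetical data mart names. It only illustrates that the warehouse must hold the most detailed grain that any mart asks for.

```python
# Hypothetical grain levels, ordered from most detailed (lowest level of
# granularity) to most summarized (highest level of granularity).
GRAINS = ["transaction", "day", "month", "quarter", "year"]


def required_warehouse_grain(mart_needs):
    """Return the most detailed grain requested by any data mart."""
    return min(mart_needs.values(), key=GRAINS.index)


# Hypothetical data marts and the grain each one needs.
marts = {"finance": "month", "marketing": "day", "executive": "quarter"}

print(required_warehouse_grain(marts))
```

  With these inputs the warehouse needs daily data. The finance and executive marts can be fed by summarizing that data, but a monthly warehouse could never serve the marketing mart.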
4.8 Summary


      Choosing the proper levels of granularity for the architected
       environment is vital to success.
      The worst stance that can be taken is to design all the levels of
       granularity a priori, and then build the data warehouse.
      The process of granularity design begins with a raw estimate of how
       large the warehouse will be on the one-year and the five-year
       horizon.
      There is an important feedback loop for the data warehouse
       environment.
      Another important consideration is the levels of granularity needed
       by the different architectural components that will be fed from the
       data warehouse.




Lecture 04 - Granularity in the Data Warehouse
