Sanjivani Rural Education Society’s
Sanjivani College of Engineering, Kopargaon-423 603
(An Autonomous Institute, Affiliated to Savitribai Phule Pune University, Pune)
NAAC ‘A’ Grade Accredited, ISO 9001:2015 Certified
Department of Computer Engineering
(NBA Accredited)
Prof. S.A.Shivarkar
Assistant Professor
E-mail: shivarkarsandipcomp@sanjivani.org.in
Contact No: 8275032712
Subject- Business Intelligence
Unit-III: Data Warehouse
Course Contents
Unit-III: Data Warehouse
Introduction: data warehouse modelling, data warehouse
design, data warehouse technology, distributed data
warehouse, and materialized view
Data Warehouse
• A data warehouse (DW) is a pool of data produced
  to support decision making; it is also a repository
  of current and historical data of potential interest
  to managers throughout the organization.
• Data are usually structured to be available in a
  form ready for analytical processing activities (e.g.,
  online analytical processing [OLAP], data mining,
  querying, reporting, and other decision support
  applications).
• A data warehouse is a subject-oriented, integrated,
  time-variant, nonvolatile collection of data in
  support of management's decision-making process.
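The "time-variant" and "nonvolatile" parts of this definition can be illustrated with a small sketch. It assumes an in-memory SQLite database, and the table and column names (`customer_balance_history`, `snapshot_date`, and so on) are hypothetical, chosen only for illustration. Every row carries a date, and loads only append rows.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_balance_history (
        customer_id   INTEGER NOT NULL,
        snapshot_date TEXT    NOT NULL,  -- time-variant: every row is dated
        balance       REAL    NOT NULL,
        PRIMARY KEY (customer_id, snapshot_date)
    )
    """
)


def load_snapshot(customer_id, snapshot_date, balance):
    """Nonvolatile load: rows are appended, never updated or deleted."""
    conn.execute(
        "INSERT INTO customer_balance_history VALUES (?, ?, ?)",
        (customer_id, snapshot_date, balance),
    )


load_snapshot(1, "2024-01-31", 100.0)
load_snapshot(1, "2024-02-29", 250.0)

# Because history is kept, a query can show how a value changed over time.
history = conn.execute(
    "SELECT snapshot_date, balance FROM customer_balance_history "
    "WHERE customer_id = 1 ORDER BY snapshot_date"
).fetchall()
print(history)
```

An operational system would typically overwrite the balance in place; keeping one dated row per load is what makes trend analysis possible.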
Data Warehouse Characteristics
• Subject-oriented
• Integrated
• Time-variant (time series)
• Nonvolatile
• Web-based
• Relational/multidimensional
• Client/server
• Real-time
• Includes metadata
Data Marts
• A data mart is a subset of a data warehouse, typically
  consisting of a single subject area (e.g., marketing,
  operations).
• A dependent data mart is a subset that is created
  directly from the data warehouse. It has the
  advantages of using a consistent data model and
  providing quality data.
• Dependent data marts support the concept of a
  single enterprise-wide data model, but the data
  warehouse must be constructed first.
• A dependent data mart ensures that the end user is
  viewing the same version of the data that is accessed
  by all other data warehouse users.
• The high cost of data warehouses limits their use to
  large companies.
• As an alternative, many firms use a lower-cost,
  scaled-down version of a data warehouse referred to
  as an independent data mart. An independent data
  mart is a small warehouse designed for a strategic
  business unit (SBU) or a department, but its source
  is not an enterprise data warehouse (EDW).
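A dependent data mart can be sketched as a subject-area subset derived directly from the warehouse. The example below is a minimal illustration that assumes SQLite and models the mart as a view. The names `warehouse_sales` and `marketing_mart` are hypothetical, and real data marts are often separate physical databases rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table covering several subject areas.
conn.execute(
    "CREATE TABLE warehouse_sales (subject_area TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("marketing", "north", 120.0),
        ("operations", "north", 80.0),
        ("marketing", "south", 50.0),
    ],
)

# Dependent data mart: a single subject area drawn from the warehouse itself,
# so mart users see the same version of the data as warehouse users.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT region, amount FROM warehouse_sales "
    "WHERE subject_area = 'marketing'"
)

mart_total = conn.execute("SELECT SUM(amount) FROM marketing_mart").fetchone()[0]
warehouse_total = conn.execute(
    "SELECT SUM(amount) FROM warehouse_sales WHERE subject_area = 'marketing'"
).fetchone()[0]
print(mart_total, warehouse_total)
```

An independent data mart would instead load its table straight from departmental sources, so nothing guarantees that its totals agree with any other system.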
Operational Data Stores
• An operational data store (ODS) provides a fairly
  recent form of customer information file (CIF).
• This type of database is often used as an interim
  staging area for a data warehouse.
• Unlike the static contents of a data warehouse, the
  contents of an ODS are updated throughout the
  course of business operations.
• An ODS is used for short-term decisions involving
  mission-critical applications rather than for the
  medium- and long-term decisions associated with an
  enterprise data warehouse (EDW). An ODS is similar
  to short-term memory in that it stores only very
  recent information.
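The contrast between an ODS and a warehouse can be sketched with two plain Python containers. This is a deliberate simplification with hypothetical names: a real warehouse is usually loaded in periodic batches, not at the moment of each operational change.

```python
from datetime import date

ods = {}        # ODS: current state only, overwritten as business happens
warehouse = []  # warehouse: dated history, appended and then left static


def record_address_change(customer_id, address, as_of):
    """Apply one operational change to both stores (hypothetical helper)."""
    ods[customer_id] = address                       # update in place
    warehouse.append((customer_id, address, as_of))  # keep every version


record_address_change(7, "12 Mill Road", date(2024, 1, 5))
record_address_change(7, "3 Lake View", date(2024, 6, 1))

print(ods)             # only the most recent address survives
print(len(warehouse))  # both versions are retained for later analysis
```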
Enterprise Data Warehouses (EDW)
• An enterprise data warehouse (EDW) is a large-scale
  data warehouse that is used across the enterprise for
  decision support.
• It is the type of data warehouse that Isle of Capri
  developed, as described in the opening vignette.
• Its large scale allows data from many sources to be
  integrated into a standard format for effective BI
  and decision support applications.
• EDWs provide data for many types of decision support
  systems (DSS), including customer relationship
  management (CRM) and supply chain management (SCM).
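Integrating many sources into a standard format usually means mapping each source's field names, units, and date formats onto one agreed record layout. The sketch below assumes two hypothetical source systems and invented field names; it shows only the shape of the idea.

```python
from datetime import datetime


def from_crm(rec):
    """Hypothetical CRM export: dollar amounts as strings, MM/DD/YYYY dates."""
    return {
        "customer_id": rec["cust"],
        "amount_cents": round(float(rec["amt_usd"]) * 100),
        "order_date": datetime.strptime(rec["date"], "%m/%d/%Y").date().isoformat(),
    }


def from_scm(rec):
    """Hypothetical SCM feed: integer cents, ISO dates, different field names."""
    return {
        "customer_id": rec["customerId"],
        "amount_cents": rec["cents"],
        "order_date": rec["shipped_on"],
    }


# After conversion, both records share one layout and can be stored together.
integrated = [
    from_crm({"cust": "C-1", "amt_usd": "19.50", "date": "01/05/2024"}),
    from_scm({"customerId": "C-1", "cents": 4200, "shipped_on": "2024-02-10"}),
]
for row in integrated:
    print(row)
```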
Metadata
• Metadata are data about data.
• Metadata describe the structure of and some meaning
  about data, thereby contributing to their effective or
  ineffective use.
Importance of Metadata
• Documentation of the data warehouse structure:
  layout, logical views, dimensions, hierarchies,
  derived data, and the location of any data marts.
• Documentation of the data genealogy, obtained by
  tagging the data sources from which data were
  extracted and by describing any transformations
  performed on the data.
• Usage statistics for the data warehouse, indicating
  how many times a field or a logical view has been
  accessed.
• Documentation of the general meaning of the
  data warehouse with respect to the application
  domain, providing definitions of the terms used
  and fully describing data properties, data
  ownership, and loading policies.
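The four kinds of metadata listed above (structure, genealogy, usage statistics, and business meaning) can be pictured as one catalog entry per warehouse field. The dataclass below is a minimal sketch; the class name, attribute names, and sample values are all hypothetical and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class FieldMetadata:
    name: str
    data_type: str   # structure
    definition: str  # general meaning in the application domain
    owner: str       # data ownership
    sources: list = field(default_factory=list)          # genealogy: source tags
    transformations: list = field(default_factory=list)  # genealogy: steps applied
    access_count: int = 0                                 # usage statistics

    def record_access(self):
        """Count one access to this field."""
        self.access_count += 1


revenue = FieldMetadata(
    name="net_revenue",
    data_type="REAL",
    definition="Gross sales minus returns",
    owner="Finance",
    sources=["orders_db.sales", "orders_db.returns"],
    transformations=["sum sales per day", "subtract returns"],
)
revenue.record_access()
revenue.record_access()
print(revenue.access_count, revenue.sources)
```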
