SlideShare a Scribd company logo
By:
Dr. Alice Mugisha
alice.mugisha@mak.ac.ug
Course Outline
 Week 1: Introduction to Business Intelligence , data warehouses and
Corporate Information
 Week 2: Data warehouse architectures
 Week 3: Data warehouse lifecycle and project management
 Week 4: Introduction to Dimensional Modeling
 Week 5: Building dimensional models – tutorial session
 Week 6: Test 1 and Data extraction, transformation and loading(ETL)
 Week 7: Online analytical processing (OLAP)
 Week 8: Data mining
 Week 9: Data warehousing and the Web
 Week 10: Data warehousing and the Web
 Week 11: Matching information to users
2
Introduction
 Data is an enterprise’s most valuable asset. To
fuel innovation, which fuels growth, an
enterprise must:
 Store every relevant data point about their
business
 Give data access to everyone who needs it
 Have the ability to analyze the data in
different ways
 Distill the data down to insights
3
Introduction
 Data
 Items that are the most elementary descriptions of
things, events, activities, and transactions
 May be internal or external
 Information
 Organized data that has meaning and value
 Knowledge
 Processed data or information that conveys
understanding or learning applicable to a problem
or activity
4
Introduction
 Over the years, storage and management of data
from various operational systems has become a
great challenge.
 Long-term strategic planning has become
increasingly important in the modern global
market.
 For this reason, companies have worked towards:
 Access to information at all levels
 Survival and prosperity in a competitive world.
 The focus of technology shifted from data input
and capture through the operational systems to
information access and availability for strategic
planning.
5
6
 Executives and managers need information to
formulate the business strategies, establish goals, set
objectives, and monitor results.
 Examples of business objectives include:
 Retain the present customer base
 Increase the customer base by 15% over the next 5
years
 Gain market share by 10% in the next 3 years
 Improve product quality levels in the top five product
groups
 Enhance customer service level in shipments
 Bring three new products to market in 2 years
 Increase sales by 15% in a given supermarket branch
7
What is a Data Warehouse (DWH)?
 A decision support database that is maintained
separately from the organization’s operational database
 Support information processing by providing a solid
platform of consolidated, historical data for analysis.
 “A data warehouse is a subject-oriented,
integrated, time-variant, historical and non-
volatile collection of data in support of
management’s decision-making process.”—
W. H. Inmon
8
9
Data Warehouse—Subject-Oriented
 Organized around major subjects, such as customer,
product, sales.
 Focusing on the modeling and analysis of data for
decision makers, not on daily operations or
transaction processing.
 Provide a simple and concise view around particular
subject issues by excluding data that are not useful in
the decision support process.
10
Data Warehouse—Integrated
 Constructed by integrating multiple,
heterogeneous data sources
 relational databases, flat files, on-line transaction
records
 Data cleaning and data integration techniques
are applied.
 Ensure consistency in naming conventions, encoding
structures, attribute measures, etc. among different
data sources
 E.g., Hotel price: currency, tax, breakfast covered, etc.
 When data is moved to the warehouse, it is converted.
11
Data Warehouse—Time Variant
 The time horizon for the data warehouse is
significantly longer than that of operational systems.
 Operational database: current value data.
 Data warehouse data: provide information from a
historical perspective (e.g., past 5-10 years)
 Every key structure in the data warehouse
 Contains an element of time, explicitly or implicitly
 But the key of operational data may or may not contain
“time element”.
12
Data Warehouse—Non-Volatile
 A physically separate store of data transformed
from the operational environment.
 Operational update of data does not occur in
the data warehouse environment.
 Does not require transaction processing, recovery,
and concurrency control mechanisms
 Requires only two operations in data accessing:
 initial loading of data and access of data.
Basic concept of data warehousing
 Take all the data from the operational systems
 Where necessary, include relevant data from
outside, such as industry benchmark indicators
 Integrate all the data from the various sources
 Remove inconsistencies and transform the
data
 Store the data in formats suitable for easy
access for decision making
13
A blend of technologies
14
15
 Although a simple concept, it involves different
functions: data extraction, the function of loading
the data, transforming the data, storing the data,
and providing user interfaces.
 Different technologies are, therefore, needed to
support these functions. Although many
technologies are in use, they all work together in a
data warehouse.
 The end result is the creation of a new computing
environment for the purpose of providing the
strategic information every enterprise needs
desperately.
 There are several vendor tools available in each of
these technologies. You do not have to build your
data warehouse from scratch.
16
Data Warehouse vs. traditional integration in
Heterogeneous DBMS
 Traditional heterogeneous DB integration:
 Build wrappers/mediators on top of heterogeneous databases
 Query driven approach
 When a query is posed to a client site, a meta-dictionary is used to
translate the query into queries appropriate for individual
heterogeneous sites involved, and the results are integrated into a
global answer set
 Complex information filtering, compete for resources
 Data warehouse: update-driven, high performance
 Information from heterogeneous sources is integrated in advance
and stored in warehouses for direct query and analysis
17
Data Warehouse (OLAP) vs. Operational DBMS (OLTP)
 OLTP (on-line transaction processing)
 Major task of traditional relational DBMS
 Day-to-day operations such as purchasing, inventory,
banking, manufacturing, payroll, registration,
accounting, etc.
 OLAP (on-line analytical processing)
 Major task of data warehouse system
 Data analysis and decision making
 Distinct features (OLTP vs. OLAP):
 User and system orientation: customer vs. market
 Data contents: current, detailed vs. historical, consolidated
 Database design: ER + application vs. star + subject
 View: current, local vs. evolutionary, integrated
 Access patterns: update vs. read-only but complex queries
18
OLTP vs. OLAP
OLTP OLAP
users clerk, IT professional knowledge worker
function day to day operations decision support
DB design application-oriented subject-oriented
data current, up-to-date
detailed, flat relational
isolated
historical,
summarized, multidimensional
integrated, consolidated
usage repetitive ad-hoc
access read/write
index/hash on prim. key
lots of scans
unit of work short, simple transaction complex query
# records accessed tens millions
#users thousands hundreds
DB size 1
00MB-GB 1
00GB-TB
metric transaction throughput query throughput, response
Successfully implemented data warehouses can bring
benefits to an organization as below:
1. Organisational Performance: - Operational Systems and
Data Warehouse
2. Simplify - Make Complex Data from Many Systems
available in one
3. Accuracy - Standardize and Cleanse
4. Business Value - Provide the Foundation for the Business
to Have Access to Information to Make Timely, Informed
Decisions
5. Direct Use - Non-IT personnel can make reports
19
20
Typical data warehouse queries (Case study: Banking
industry)
• Which corporate customers are above the average
account usage per month and how does this correlate to
their business?
• Who were the first hundred customers in Jan 2006 and
how does this list compare with the list for the previous
three years?
• What is the revenue by destination, by month, by
business unit, by region?
Complexities of Creating a Data Warehouse
 Incomplete errors
 Missing Fields
 Records or Fields That, by Design, are not Being
Recorded
 Incorrect errors
 Wrong Calculations, Aggregations
 Duplicate Records
 Wrong Information Entered into Source
System
 Inconsistency errors
 Inconsistent Use of Different Codes
21
Best Practices
 Data Warehousing is a process and not a
project
 Complete requirements and design
 Prototyping is key to business
understanding
 Utilizing proper aggregations and detailed
data
 A full iterative approach is essential
 Training is an on-going process
 Build data integrity checks into your system
22
23
•High investment
•The initial cost of building a data warehouse is very high
and ROI cannot easily be explained.
• Large storage
•Data warehouse stores useful historical data of an
enterprise.
• Maintenance of source systems
• If data source systems are not cleaned, we automatically
get dirty data into the data warehouse. Decision makers
using such data are likely to be misled and their decisions
may lead to loss of company revenue.
• Qualified staff
•Data warehouse building and maintenance requires
24
• New insights into
• Customer habits
• Developing new products
• Selling more products
• Cost savings and revenue increases
• Cross-selling of products
• Identify and target most profitable customers
Conclusions
 Building data warehouse is good but not
sufficient. The data in a data warehouse has to
be accessed by users and in order to access it; a
BI tool has to be used.
25
 All processes, techniques, and tools that support
business decision making based on information
technology.
 The approaches can range from a simple excel
spreadsheet to a major competitive intelligence
undertaking.
 e.g. data visualization, data mining, statistical
analysis using R, SPSS, etc
What is Business Intelligence (BI)?
26
Discussion Questions
 The current trends in hardware/software technology make
data warehousing feasible. Explain via some examples how
exactly technology trends do help.
 You are the IT Director of a nationwide insurance company.
Write a memo to the Executive Vice President explaining the
types of opportunities that can be realized with readily
available strategic information.
 For an airlines company, how can strategic information
increase the number of frequent flyers? Discuss giving
specific details.
 You are a Senior Analyst in the IT department of a company
manufacturing automobile parts. The marketing VP is
complaining about the poor response by IT in providing
strategic information. Explain the reasons for the problems
and why a data warehouse would be the only viable solution.
27

More Related Content

PPTX
CHAPTER 1 - Introdution to Datawarehousing.pptx
PPTX
Seminor_on_data_data-warehousing-ppt.pptx
PPT
Chapter 2
PPT
Datawarehousing
DOC
Data Mining
PPTX
DWDM Unit 1 (1).pptx
PPTX
Hair_PPT_Ch02.pptx marketing analytics12
PPT
Data wirehouse
CHAPTER 1 - Introdution to Datawarehousing.pptx
Seminor_on_data_data-warehousing-ppt.pptx
Chapter 2
Datawarehousing
Data Mining
DWDM Unit 1 (1).pptx
Hair_PPT_Ch02.pptx marketing analytics12
Data wirehouse

Similar to Introduction to Business Intelligence and Data warehousing - ppt (20)

PPTX
Data warehousing Concepts and Design.pptx
PPTX
introduction & conceptsdatawarehousing.pptx
PPT
Data warehouse
PPT
Data warehousing and online analytical processing
PPT
20IT501_DWDM_PPT_Unit_I.ppt
PPTX
Data warehouse
PPTX
DATA WAREHOUSING
PPT
Dataware housing
PDF
Data warehousing
PDF
Erp and related technologies
DOC
Data warehouse concepts
PPTX
Manish tripathi-ea-dw-bi
 
PPTX
presentationofism-complete-1-100227093028-phpapp01.pptx
DOCX
Unit 1
PPT
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
PDF
Data warehousev2.1
PDF
6566tyyht656ty55hyhghghghghghg04OLAP.pdf
PPT
2. olap warehouse
PPTX
Data warehouse-complete-1-100227093028-phpapp01.pptx
DOC
Data mining notes
Data warehousing Concepts and Design.pptx
introduction & conceptsdatawarehousing.pptx
Data warehouse
Data warehousing and online analytical processing
20IT501_DWDM_PPT_Unit_I.ppt
Data warehouse
DATA WAREHOUSING
Dataware housing
Data warehousing
Erp and related technologies
Data warehouse concepts
Manish tripathi-ea-dw-bi
 
presentationofism-complete-1-100227093028-phpapp01.pptx
Unit 1
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Data warehousev2.1
6566tyyht656ty55hyhghghghghghg04OLAP.pdf
2. olap warehouse
Data warehouse-complete-1-100227093028-phpapp01.pptx
Data mining notes
Ad

Recently uploaded (20)

PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Empathic Computing: Creating Shared Understanding
PDF
Electronic commerce courselecture one. Pdf
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
Review of recent advances in non-invasive hemoglobin estimation
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PPTX
Big Data Technologies - Introduction.pptx
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
DOCX
The AUB Centre for AI in Media Proposal.docx
PPTX
Spectroscopy.pptx food analysis technology
PDF
A comparative analysis of optical character recognition models for extracting...
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Encapsulation theory and applications.pdf
PPTX
sap open course for s4hana steps from ECC to s4
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Empathic Computing: Creating Shared Understanding
Electronic commerce courselecture one. Pdf
Dropbox Q2 2025 Financial Results & Investor Presentation
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Digital-Transformation-Roadmap-for-Companies.pptx
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Review of recent advances in non-invasive hemoglobin estimation
MIND Revenue Release Quarter 2 2025 Press Release
Big Data Technologies - Introduction.pptx
20250228 LYD VKU AI Blended-Learning.pptx
The AUB Centre for AI in Media Proposal.docx
Spectroscopy.pptx food analysis technology
A comparative analysis of optical character recognition models for extracting...
“AI and Expert System Decision Support & Business Intelligence Systems”
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
Diabetes mellitus diagnosis method based random forest with bat algorithm
Advanced methodologies resolving dimensionality complications for autism neur...
Encapsulation theory and applications.pdf
sap open course for s4hana steps from ECC to s4
Ad

Introduction to Business Intelligence and Data warehousing - ppt

  • 2. Course Outline  Week 1: Introduction to Business Intelligence , data warehouses and Corporate Information  Week 2: Data warehouse architectures  Week 3: Data warehouse lifecycle and project management  Week 4: Introduction to Dimensional Modeling  Week 5: Building dimensional models – tutorial session  Week 6: Test 1 and Data extraction, transformation and loading(ETL)  Week 7: Online analytical processing (OLAP)  Week 8: Data mining  Week 9: Data warehousing and the Web  Week 10: Data warehousing and the Web  Week 11: Matching information to users 2
  • 3. Introduction  Data is an enterprise’s most valuable asset. To fuel innovation, which fuels growth, an enterprise must:  Store every relevant data point about their business  Give data access to everyone who needs it  Have the ability to analyze the data in different ways  Distill the data down to insights 3
  • 4. Introduction  Data  Items that are the most elementary descriptions of things, events, activities, and transactions  May be internal or external  Information  Organized data that has meaning and value  Knowledge  Processed data or information that conveys understanding or learning applicable to a problem or activity 4
  • 5. Introduction  Over the years, storage and management of data from various operational systems has become a great challenge.  Long-term strategic planning has become increasingly important in the modern global market.  For this reason, companies have worked towards:  Access to information at all levels  Survival and prosperity in a competitive world.  The focus of technology shifted from data input and capture through the operational systems to information access and availability for strategic planning. 5
  • 6. 6  Executives and managers need information to formulate the business strategies, establish goals, set objectives, and monitor results.  Examples of business objectives include:  Retain the present customer base  Increase the customer base by 15% over the next 5 years  Gain market share by 10% in the next 3 years  Improve product quality levels in the top five product groups  Enhance customer service level in shipments  Bring three new products to market in 2 years  Increase sales by 15% in a given supermarket branch
  • 7. 7
  • 8. What is a Data Warehouse (DWH)?  A decision support database that is maintained separately from the organization’s operational database  Support information processing by providing a solid platform of consolidated, historical data for analysis.  “A data warehouse is a subject-oriented, integrated, time-variant, historical and non- volatile collection of data in support of management’s decision-making process.”— W. H. Inmon 8
  • 9. 9 Data Warehouse—Subject-Oriented  Organized around major subjects, such as customer, product, sales.  Focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing.  Provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.
  • 10. 10 Data Warehouse—Integrated  Constructed by integrating multiple, heterogeneous data sources  relational databases, flat files, on-line transaction records  Data cleaning and data integration techniques are applied.  Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources  E.g., Hotel price: currency, tax, breakfast covered, etc.  When data is moved to the warehouse, it is converted.
  • 11. 11 Data Warehouse—Time Variant  The time horizon for the data warehouse is significantly longer than that of operational systems.  Operational database: current value data.  Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years)  Every key structure in the data warehouse  Contains an element of time, explicitly or implicitly  But the key of operational data may or may not contain “time element”.
  • 12. 12 Data Warehouse—Non-Volatile  A physically separate store of data transformed from the operational environment.  Operational update of data does not occur in the data warehouse environment.  Does not require transaction processing, recovery, and concurrency control mechanisms  Requires only two operations in data accessing:  initial loading of data and access of data.
  • 13. Basic concept of data warehousing  Take all the data from the operational systems  Where necessary, include relevant data from outside, such as industry benchmark indicators  Integrate all the data from the various sources  Remove inconsistencies and transform the data  Store the data in formats suitable for easy access for decision making 13
  • 14. A blend of technologies 14
  • 15. 15  Although a simple concept, it involves different functions: data extraction, the function of loading the data, transforming the data, storing the data, and providing user interfaces.  Different technologies are, therefore, needed to support these functions. Although many technologies are in use, they all work together in a data warehouse.  The end result is the creation of a new computing environment for the purpose of providing the strategic information every enterprise needs desperately.  There are several vendor tools available in each of these technologies. You do not have to build your data warehouse from scratch.
  • 16. 16 Data Warehouse vs. traditional integration in Heterogeneous DBMS  Traditional heterogeneous DB integration:  Build wrappers/mediators on top of heterogeneous databases  Query driven approach  When a query is posed to a client site, a meta-dictionary is used to translate the query into queries appropriate for individual heterogeneous sites involved, and the results are integrated into a global answer set  Complex information filtering, compete for resources  Data warehouse: update-driven, high performance  Information from heterogeneous sources is integrated in advance and stored in warehouses for direct query and analysis
  • 17. 17 Data Warehouse (OLAP) vs. Operational DBMS (OLTP)  OLTP (on-line transaction processing)  Major task of traditional relational DBMS  Day-to-day operations such as purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.  OLAP (on-line analytical processing)  Major task of data warehouse system  Data analysis and decision making  Distinct features (OLTP vs. OLAP):  User and system orientation: customer vs. market  Data contents: current, detailed vs. historical, consolidated  Database design: ER + application vs. star + subject  View: current, local vs. evolutionary, integrated  Access patterns: update vs. read-only but complex queries
  • 18. 18 OLTP vs. OLAP OLTP OLAP users clerk, IT professional knowledge worker function day to day operations decision support DB design application-oriented subject-oriented data current, up-to-date detailed, flat relational isolated historical, summarized, multidimensional integrated, consolidated usage repetitive ad-hoc access read/write index/hash on prim. key lots of scans unit of work short, simple transaction complex query # records accessed tens millions #users thousands hundreds DB size 1 00MB-GB 1 00GB-TB metric transaction throughput query throughput, response
  • 19. Successfully implemented data warehouses can bring benefits to an organization as below: 1. Organisational Performance: - Operational Systems and Data Warehouse 2. Simplify - Make Complex Data from Many Systems available in one 3. Accuracy - Standardize and Cleanse 4. Business Value - Provide the Foundation for the Business to Have Access to Information to Make Timely, Informed Decisions 5. Direct Use - Non-IT personnel can make reports 19
  • 20. 20 Typical data warehouse queries (Case study: Banking industry) • Which corporate customers are above the average account usage per month and how does this correlate to their business? • Who were the first hundred customers in Jan 2006 and how does this list compare with the list for the previous three years? • What is the revenue by destination, by month, by business unit, by region?
  • 21. Complexities of Creating a Data Warehouse  Incomplete errors  Missing Fields  Records or Fields That, by Design, are not Being Recorded  Incorrect errors  Wrong Calculations, Aggregations  Duplicate Records  Wrong Information Entered into Source System  Inconsistency errors  Inconsistent Use of Different Codes 21
  • 22. Best Practices  Data Warehousing is a process and not a project  Complete requirements and design  Prototyping is key to business understanding  Utilizing proper aggregations and detailed data  A full iterative approach is essential  Training is an on-going process  Build data integrity checks into your system 22
  • 23. 23 •High investment •The initial cost of building a data warehouse is very high and ROI cannot easily be explained. • Large storage •Data warehouse stores useful historical data of an enterprise. • Maintenance of source systems • If data source systems are not cleaned, we automatically get dirty data into the data warehouse. Decision makers using such data are likely to be misled and their decisions may lead to loss of company revenue. • Qualified staff •Data warehouse building and maintenance requires
  • 24. 24 • New insights into • Customer habits • Developing new products • Selling more products • Cost savings and revenue increases • Cross-selling of products • Identify and target most profitable customers
  • 25. Conclusions  Building data warehouse is good but not sufficient. The data in a data warehouse has to be accessed by users and in order to access it; a BI tool has to be used. 25
  • 26.  All processes, techniques, and tools that support business decision making based on information technology.  The approaches can range from a simple excel spreadsheet to a major competitive intelligence undertaking.  e.g. data visualization, data mining, statistical analysis using R, SPSS, etc What is Business Intelligence (BI)? 26
  • 27. Discussion Questions  The current trends in hardware/software technology make data warehousing feasible. Explain via some examples how exactly technology trends do help.  You are the IT Director of a nationwide insurance company. Write a memo to the Executive Vice President explaining the types of opportunities that can be realized with readily available strategic information.  For an airlines company, how can strategic information increase the number of frequent flyers? Discuss giving specific details.  You are a Senior Analyst in the IT department of a company manufacturing automobile parts. The marketing VP is complaining about the poor response by IT in providing strategic information. Explain the reasons for the problems and why a data warehouse would be the only viable solution. 27

Editor's Notes