INFORMATICA EASY LEARNING ONLINE TRAINING
Data Warehousing Concepts
• What is Data Warehousing?
• Dimensional Data Model
• Star Schema
• Snowflake Schema
• Slowly Changing Dimension
• Conceptual Data Model
• Logical Data Model
• Physical Data Model
• Conceptual, Logical, and Physical Data Model
• Data Integrity
• What is OLAP
• MOLAP, ROLAP, and HOLAP
What is Data Warehousing?
Different people define a data warehouse differently. The most widely cited definition comes from Bill Inmon:
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
Data warehousing can also be described as a process of transforming data into information and making it available to users in a timely enough manner to make a difference.
To summarize ...
• OLTP systems are used to “run” a business
• The data warehouse helps to “optimize” the business
Corporate Data
What is enterprise-wide corporate data?
It includes
• human resource data
• financial data
• facilities data
• sales data
• marketing expense data
• production planning costs
• manufacturing costs
• service delivery costs
• inventory management data
• shipping and payment data
How is Business Intelligence applied in retail banking or the retail industry?
KPIs
A KPI (Key Performance Indicator) can be used as a performance measurement tool.
KPIs in Retail Banking:
• Total cash deposits held in a month
• Average annual deposit held
• Average number of deposits per retail bank growth
• Average withdrawals made by each depositor
• Ratio of active depositors to dormant depositors
• Average number of default borrowers in a year
• Average number of credit cards issued by the retail bank
• Rate of borrowing risk
• Rate of default risk
• Average number of customers served in a day
• Average number of closed bank accounts
KPIs in the Retail Industry:
• Sales compared to Budget & Target
• Sales compared to last year (or any other period)
• Wage cost recovery
• Average sale per customer/transaction
• Units per customer/transaction
• Sales per hour
• Sales & Gross Margin
KPIs (Key Performance Indicators)
Examples of common departmental KPIs
Sales Growth
Analyze the pace at which your organization's sales revenue is growing and use that information in strategic decision-making.
Marketing
Analyze the pace at which your organization's sales revenue is growing and use that information in strategic decision-making.
Financial
Measure your organization's financial health by analyzing readily available resources that could be used to meet any short-term obligations.
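As a rough illustration of the Sales Growth and Financial KPIs, the sketch below computes a period-over-period sales growth rate and a current ratio. The slides do not name a specific formula, so the function names, the choice of the current ratio as the liquidity measure, and all figures are assumptions made for this example.

```python
def sales_growth_rate(previous_sales: float, current_sales: float) -> float:
    """Return period-over-period sales growth as a fraction (0.15 means 15%)."""
    if previous_sales == 0:
        raise ValueError("previous_sales must be non-zero")
    return (current_sales - previous_sales) / previous_sales


def current_ratio(current_assets: float, current_liabilities: float) -> float:
    """Return current assets divided by current liabilities (assumed liquidity KPI)."""
    if current_liabilities == 0:
        raise ValueError("current_liabilities must be non-zero")
    return current_assets / current_liabilities


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(f"Sales growth: {sales_growth_rate(200_000, 230_000):.1%}")
    print(f"Current ratio: {current_ratio(150_000, 100_000):.2f}")
```

A ratio above 1 suggests that short-term resources cover short-term obligations; what counts as healthy depends on the business.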
Data Warehousing
Data Warehousing Architecture
Data Warehousing Environment
Data Quality
• Duplicate data
• Inconsistent values
• Missing data
• Unexpected use of fields
• Impossible or wrong values
Validations for Data Cleansing
• Data-type constraints
• Range constraints
• Mandatory constraints
• Unique constraints
• Set-membership constraints
• Foreign-key constraints
• Regular expression patterns
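The validation types listed for data cleansing can be sketched in a few lines of plain Python. This is only an illustration: the column names (`order_id`, `amount`, `state`, `store_id`, `zip`), the allowed values, and the ZIP pattern are hypothetical and do not come from the slides.

```python
import re

VALID_STATES = {"CA", "NY", "TX"}      # hypothetical set-membership list
KNOWN_STORE_IDS = {1, 2, 3}            # hypothetical parent keys for the foreign-key check
ZIP_PATTERN = re.compile(r"^\d{5}$")   # hypothetical 5-digit ZIP format


def validate(rows):
    """Return a list of (row_index, problem) tuples for rows that break a rule."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Mandatory constraint: the key column must be present.
        if row.get("order_id") is None:
            problems.append((i, "order_id is missing"))
            continue
        # Data-type constraint, then range constraint on the same column.
        if not isinstance(row.get("amount"), (int, float)):
            problems.append((i, "amount is not numeric"))
        elif row["amount"] < 0:
            problems.append((i, "amount is out of range"))
        # Unique constraint: no two rows may share an order_id.
        if row["order_id"] in seen_ids:
            problems.append((i, "duplicate order_id"))
        seen_ids.add(row["order_id"])
        # Set-membership constraint.
        if row.get("state") not in VALID_STATES:
            problems.append((i, "state is not an allowed value"))
        # Foreign-key constraint: the referenced store must exist.
        if row.get("store_id") not in KNOWN_STORE_IDS:
            problems.append((i, "store_id has no matching store"))
        # Regular expression pattern.
        if not ZIP_PATTERN.match(str(row.get("zip", ""))):
            problems.append((i, "zip does not match the expected pattern"))
    return problems


if __name__ == "__main__":
    sample = [
        {"order_id": 1, "amount": 10.0, "state": "CA", "store_id": 1, "zip": "94105"},
        {"order_id": 1, "amount": -5, "state": "ZZ", "store_id": 9, "zip": "9410"},
    ]
    for index, problem in validate(sample):
        print(index, problem)
```

In a real warehouse these checks would more likely live in the ETL tool or as database constraints; the sketch only shows what each rule tests.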
Views for building a data warehouse
• The top-down view
• The data source view
• The data warehouse view
• The business query view
Which approach is better for designing a data warehouse?
• Top-down approach
• Bottom-up approach
Data Warehousing Design
• Requirement Gathering
• Physical Environment Setup
• Data Modeling
• ETL
• OLAP Cube Design
• Front End Development
• Report Development
• Performance Tuning
• Query Optimization
• Quality Assurance
• Rolling out to Production
• Production Maintenance
• Incremental Enhancements
Why Data Warehousing?
• See the daily, weekly, monthly, and quarterly profit of each store.
• Compare sales and profit across time periods.
• Compare sales across time bands of the day.
• Know which product is in higher demand at which location.
• Study the trend of sales by time of day over the week, month, and year.
• Find out on which day sales are highest.
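The first requirement, profit per store by day, week, month, and quarter, amounts to rolling the same records up to different time periods. A minimal sketch, assuming hypothetical sales records of the form (store, date, profit):

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records: (store, sale date, profit).
SALES = [
    ("Store 1", date(2012, 1, 2), 120.0),
    ("Store 1", date(2012, 1, 3), 80.0),
    ("Store 1", date(2012, 2, 1), 50.0),
    ("Store 2", date(2012, 1, 2), 200.0),
    ("Store 2", date(2012, 4, 15), 75.0),
]


def period_key(day, period):
    """Map a date to a label for the requested reporting period."""
    if period == "daily":
        return day.isoformat()
    if period == "weekly":
        iso_year, iso_week, _ = day.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if period == "monthly":
        return f"{day.year}-{day.month:02d}"
    if period == "quarterly":
        return f"{day.year}-Q{(day.month - 1) // 3 + 1}"
    raise ValueError(f"unknown period: {period}")


def profit_by_store(sales, period):
    """Sum profit per (store, period label)."""
    totals = defaultdict(float)
    for store, day, profit in sales:
        totals[(store, period_key(day, period))] += profit
    return dict(totals)


if __name__ == "__main__":
    for period in ("daily", "weekly", "monthly", "quarterly"):
        print(period, profit_by_store(SALES, period))
```

A data warehouse answers the same questions over far larger volumes, usually with pre-modeled time dimensions rather than ad hoc date arithmetic.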
Phases of Data Warehousing Project
1. Identify and collect requirements
• See the daily, weekly, monthly, and quarterly profit of each store.
• Compare sales and profit across time periods.
• Compare sales across time bands of the day.
• Know which product is in higher demand at which location.
• Study the trend of sales by time of day over the week, month, and year.
• Find out on which day sales are highest.
Who collects the requirements?
Business analysts and leads.
2. Design the dimensional model
Pharmacy_Claims_Fact (fact table)
• Foreign keys: Drug_Id, Org_Id, Practitioner_Id, Product_Id, Time_ID, Claim_status_Id, Provider_Id, Subscriber_id, Demographic_key, InsuranceType_Id
• Other columns: Incurred_Date, Claim_Date, Claim_Settled_Date, Days_Supply, Dispensing_Fee, Incentive_Savings_Amount, Incentive_Fee_Paid_Amount, Amount_Claimed, Amount_Paid, Amount_Pending, Amount_Adjusted, CoPayment_Amount, CoInsurance_Amount, Deductible, Refill_Indicator, Claim_Production_Key, Claim_Production_Txn_No, Status_Change_Date, Last_Record_Flag
Dimension tables
• Practitioner: Practitioner_Id, Practitioner_Name, Practitioner_Type, practitioner_type_desc, Qualification, Specialisation, ssn, Medical_Assoc_Enroll_No
• Organisation: Org_Id, Org_prod_id, Org_Name, Address, City, County, State, Zip, Industry_Classification
• Subscriber: Subscriber_id, Subscriber_prod_key, Member_prod_key, Member_Name, Date_of_Birth, Subscriber_type, Address, City, County, State, Zip, Hobby1, Hobby2, Smoker_YN, Alcoholic_YN, Pre_Existing_Ailments
• Demographics: Demographic_key, Age_group, Income_group, Race, Country_of_birth, Marital_status, Gender, Citizenship_status
• Provider: Provider_Id, Provider_Name, Provider_Type, Address, City, County, State, Zip, Service_Area, Network_Provider
• Insurance_Type: InsuranceType_Id, InsuranceType_Name, InsuranceType_Desc
• Product: Product_Id, Product_Name, Product_Category, LoB
• Claim_Status: Claim_status_Id, Claim_Status_Reason, Claim_stat_catg
• Time: Time_ID, Day, Week, Month, Quarter, Year, Season
• Drugs: Drug_Id, Drug_Name_Generic, Drug_Name_Trade, National_Drug_Code, Drug_Description, Drug_Category, Formulary, Manufacturer
The data model is designed by data modelers.
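To make the shape of this model concrete, the sketch below builds a cut-down version of it in an in-memory SQLite database: the fact table plus only two of its dimensions (Drugs and Time), each reduced to a few columns. The column types, the sample rows, and the query are assumptions added for illustration; the slide shows only table and column names.

```python
import sqlite3


def build_star_schema(conn):
    """Create a reduced fact table and two of its dimension tables."""
    conn.executescript("""
        CREATE TABLE Drugs (
            Drug_Id INTEGER PRIMARY KEY,
            Drug_Name_Generic TEXT,
            Drug_Category TEXT
        );
        CREATE TABLE "Time" (
            Time_ID INTEGER PRIMARY KEY,
            Month INTEGER,
            Year INTEGER
        );
        CREATE TABLE Pharmacy_Claims_Fact (
            Drug_Id INTEGER REFERENCES Drugs(Drug_Id),
            Time_ID INTEGER REFERENCES "Time"(Time_ID),
            Amount_Claimed REAL,
            Amount_Paid REAL
        );
    """)


def load_sample_rows(conn):
    """Insert a handful of hypothetical rows."""
    conn.executemany("INSERT INTO Drugs VALUES (?, ?, ?)",
                     [(1, "Ibuprofen", "Analgesic"), (2, "Amoxicillin", "Antibiotic")])
    conn.executemany('INSERT INTO "Time" VALUES (?, ?, ?)',
                     [(1, 1, 2012), (2, 1, 2013)])
    conn.executemany("INSERT INTO Pharmacy_Claims_Fact VALUES (?, ?, ?, ?)",
                     [(1, 1, 100.0, 80.0), (1, 1, 50.0, 40.0),
                      (2, 1, 30.0, 30.0), (1, 2, 20.0, 10.0)])


def paid_by_category_and_year(conn):
    """Join the fact table to its dimensions and total the paid amount."""
    return conn.execute("""
        SELECT d.Drug_Category, t.Year, SUM(f.Amount_Paid)
        FROM Pharmacy_Claims_Fact AS f
        JOIN Drugs AS d ON d.Drug_Id = f.Drug_Id
        JOIN "Time" AS t ON t.Time_ID = f.Time_ID
        GROUP BY d.Drug_Category, t.Year
        ORDER BY d.Drug_Category, t.Year
    """).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    load_sample_rows(connection)
    for row in paid_by_category_and_year(connection):
        print(row)
```

The query pattern is the point: measures come from the fact table, while the grouping and filtering columns come from the dimensions it references.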
3. Create and maintain the tables
The database is maintained by DBAs.
4. Load the data into the data warehouse and data marts
Handled by the ETL team.
What is ETL?
ETL stands for Extract, Transform, Load. Informatica is an ETL application.
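The three ETL steps can be sketched in plain Python. This is not how Informatica mappings are built; it only shows what each step does. The CSV content, the `sales` table, and the cleaning rules are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: one row has no amount, one has inconsistent casing.
SOURCE_CSV = """store,sale_date,amount
Store 1,2012-01-01,100.50
Store 2,2012-01-01,
store 1,2012-01-02,20
"""


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and standardise names and types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # missing data is rejected in this sketch
        cleaned.append((row["store"].strip().title(), row["sale_date"], float(row["amount"])))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (store TEXT, sale_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load(connection, transform(extract(SOURCE_CSV)))
    print(connection.execute("SELECT store, SUM(amount) FROM sales GROUP BY store").fetchall())
```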
5. Develop reports / dashboards
Handled by the reporting team.
6. Test ETL mappings and reports / dashboards
Handled by the QA department.
7. Deploy to production and maintain
Handled by the production department.
Where do we fit after this training?
We can work as a
1. ETL Developer
2. ETL Administrator
3. ETL Tester
Data Modeling
What is Data Modeling?
• A data model defines the relationships between data.
• The dimensional data model is the one most often used in data warehousing systems.
• Data modeling is the process of learning about the data.
Data models are designed by data modelers.
What is Dimensional Modeling?
• It helps us store the data.
Goals and benefits of Dimensional Modeling
• Faster data retrieval
• Better understandability
• Extensibility
It divides data into 2 distinct categories
• Dimensions
• Measures
Scenarios of Dimensional Data Modeling
McDonald’s client:
I want to store information on how many burgers and fries are sold per day at a single McDonald’s outlet.
What is a dimension and what is a measure in this example?
Step 1: Identify the dimensions
1. Food (e.g., burgers and fries)
2. Store (McDonald’s)
3. Day (a specific day)
Step 2: Identify the measures
The number of burgers/fries sold is a measure.
The fact table captures the data that measures the organization's business operations.
Step 3: Identify the attributes or properties of dimensions
Food dimension
KEY  NAME
1    Burger
2    Fries
Store dimension
KEY  NAME
1    Store 1
2    Store 2
...  ...
Day dimension
KEY  DAY
1    01 Jan 2012
2    02 Jan 2012
3    03 Jan 2012
...  ...
Step 4: Identify the granularity of the measures
What is meant by "granularity"?
Granularity refers to the lowest (most detailed) level of information stored in a table.
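For the McDonald’s example, a natural grain is one row per food item, per store, per day. The sketch below stores hypothetical quantities at that grain, using the dimension keys from Step 3, and rolls them up to month level. The quantities and the helper name `roll_up_to_month` are assumptions.

```python
from collections import defaultdict

# Dimension lookups, keyed as in the tables from Step 3.
FOOD = {1: "Burger", 2: "Fries"}
STORE = {1: "Store 1", 2: "Store 2"}
DAY = {1: "01 Jan 2012", 2: "02 Jan 2012", 3: "03 Jan 2012"}

# Fact rows at daily grain: (food_key, store_key, day_key, units_sold).
# The quantities are hypothetical.
FACT_SALES = [
    (1, 1, 1, 250),
    (2, 1, 1, 300),
    (1, 1, 2, 240),
    (2, 1, 2, 310),
    (1, 2, 1, 180),
]


def roll_up_to_month(fact_rows):
    """Aggregate daily facts to (food, store, month) totals."""
    totals = defaultdict(int)
    for food_key, store_key, day_key, units in fact_rows:
        month = DAY[day_key][3:]  # "01 Jan 2012" -> "Jan 2012"
        totals[(FOOD[food_key], STORE[store_key], month)] += units
    return dict(totals)


if __name__ == "__main__":
    for key, units in roll_up_to_month(FACT_SALES).items():
        print(key, units)
```

Daily facts can always be summed to months, but monthly totals cannot be split back into days, which is why the grain has to be chosen at the lowest level the business will ever need.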
Step 5: History preservation (optional)
Dimension attributes can change over time. History can be preserved by designing the dimension tables as "slowly changing dimensions".
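The slide does not say which slowly changing dimension technique is meant. One common option is Type 2, where a change closes the current row and adds a new one so that both versions remain queryable. A minimal sketch, assuming a hypothetical store dimension with a `city` attribute:

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class StoreRow:
    surrogate_key: int          # warehouse-generated key, one per version
    store_id: int               # business key from the source system
    city: str                   # the attribute whose history is kept
    valid_from: date
    valid_to: Optional[date]    # None while the row is the current version
    is_current: bool


def apply_type2_change(rows: List[StoreRow], store_id: int,
                       new_city: str, change_date: date) -> None:
    """Close the current version of a store (if any) and append the new one."""
    current = next((r for r in rows if r.store_id == store_id and r.is_current), None)
    if current is not None:
        if current.city == new_city:
            return  # nothing changed, keep the existing version
        current.valid_to = change_date
        current.is_current = False
    next_key = max((r.surrogate_key for r in rows), default=0) + 1
    rows.append(StoreRow(next_key, store_id, new_city, change_date, None, True))


if __name__ == "__main__":
    dimension: List[StoreRow] = []
    apply_type2_change(dimension, 1, "Austin", date(2012, 1, 1))
    apply_type2_change(dimension, 1, "Dallas", date(2012, 6, 1))
    for row in dimension:
        print(row)
```

Fact rows would reference `surrogate_key`, so each fact stays tied to the version of the store that was current when it happened.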
Entities:
Entities are the things about which you want to store information.
For example: EMPLOYEE
Cardinalities:
Cardinality shows how many instances on one side of a relationship can be associated with how many instances on the other side.
For example:
• How many customers belong to 1 sale?
• How many sales belong to 1 customer?
• How many sales take place in 1 shop?
Customers --> Sales: 1 customer can buy something several times
Sales --> Customers: 1 sale is always made by 1 customer at a time
Customers --> Products: 1 customer can buy multiple products
Products --> Customers: 1 product can be purchased by multiple customers
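The customer, sale, and product relationships above can be checked against data. The sketch below infers the cardinality of a relationship from observed pairs; the sample sale lines and the function names are hypothetical, and real modeling decides cardinality from business rules rather than from a data sample alone.

```python
from collections import defaultdict

# Hypothetical sale lines: (sale_id, customer, product).
SALE_LINES = [
    (1, "Alice", "Burger"),
    (1, "Alice", "Fries"),
    (2, "Alice", "Burger"),
    (3, "Bob", "Burger"),
]


def max_related(pairs):
    """For (left, right) pairs, return the largest count of distinct rights per left."""
    related = defaultdict(set)
    for left, right in pairs:
        related[left].add(right)
    return max(len(values) for values in related.values())


def side_label(count):
    """Describe one side of a relationship as '1' or 'many'."""
    return "1" if count == 1 else "many"


def cardinality(pairs):
    """Describe a non-empty set of (left, right) pairs, e.g. '1-to-many'."""
    pairs = list(pairs)
    lefts_per_right = max_related((right, left) for left, right in pairs)
    rights_per_left = max_related(pairs)
    return f"{side_label(lefts_per_right)}-to-{side_label(rights_per_left)}"


if __name__ == "__main__":
    customer_sales = {(customer, sale_id) for sale_id, customer, _ in SALE_LINES}
    customer_products = {(customer, product) for _, customer, product in SALE_LINES}
    print("Customers --> Sales:", cardinality(customer_sales))
    print("Customers --> Products:", cardinality(customer_products))
```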
Scenarios of Dimensional Data Modeling for Banking
Scenarios of Dimensional Data Modeling for Retail Banking
Event 1 - Set-up Banks and Branches
Event 2 - Create new Customer
Event 3 - Setup New Account
Event 4 - Issue Credit Card
Event 5 - Customer makes Deposit
Event 6 - Customer uses Card
Event 7 - Bank Issues Statement
Event 8 - Customer closes Account
Data Modeling
Types of OLAP Servers
We have four types of OLAP servers:
• Relational OLAP (ROLAP)
• Multidimensional OLAP (MOLAP)
• Hybrid OLAP (HOLAP)
• Specialized SQL Servers
OLTP v/s OLAP
OLTP Data Model
OLTP  OLAP
Snowflake Schema
Star Schema
Informatica
