DBMS
• Data Warehouse & Types of Data
• The term data warehouse is used to distinguish a database that is used for
business analysis (OLAP) rather than transaction processing (OLTP). While an OLTP
database contains current low-level data and is typically optimized for the
selection and retrieval of records, a data warehouse typically contains aggregated
historical data and is optimized for particular types of analyses, depending upon
the client applications.
• The contents of your data warehouse depend on the requirements of your users.
They should be able to tell you what type of data they want to view and at what
levels of aggregation they want to view it.
A data warehouse stores these types of data:
• Historical data
• Derived data
• Metadata
METADATA: Metadata is data about data. It records basic information about
data, which makes finding and working with specific instances of data
easier. Metadata improves the accuracy of searching and operating on
large amounts of data.
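The three types of data listed above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a warehouse; the table names, columns, and sample rows are assumptions made up for illustration and are not from the slides.

```python
import sqlite3


def build_demo():
    """Return a connection holding historical rows and a derived summary table."""
    conn = sqlite3.connect(":memory:")
    # Historical data: one row per sale, kept over time (hypothetical schema).
    conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("2023-01-15", "north", 100.0),
            ("2023-02-10", "north", 150.0),
            ("2024-01-20", "south", 200.0),
        ],
    )
    # Derived data: yearly totals computed from the historical rows.
    conn.execute(
        "CREATE TABLE yearly_sales AS "
        "SELECT substr(sale_date, 1, 4) AS sale_year, SUM(amount) AS total "
        "FROM sales GROUP BY substr(sale_date, 1, 4)"
    )
    conn.commit()
    return conn


def column_metadata(conn, table):
    """Metadata: column names and declared types read from the catalog.

    The table name is interpolated directly, so pass only trusted names.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]
```

Calling `column_metadata(build_demo(), "sales")` returns the column names and types without touching any sales rows, which is the sense in which metadata is data about data.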
Definition of DBMS
• A Database Management System (DBMS) is a software system that is
designed to manage and organize data in a structured manner. It allows users
to create, modify, and query a database, as well as manage the security and
access controls for that database.
• A DBMS provides an environment to store and retrieve data in an efficient
manner.
• A database is a collection of interrelated data, organized in the form of
tables, views, schemas, reports, etc., so that data can be retrieved, inserted,
and deleted efficiently.
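As a minimal sketch of the create, modify, and query operations named in the definition, the following uses Python's built-in sqlite3 module. The `students` table and its rows are hypothetical.

```python
import sqlite3


def run_basic_operations():
    """Create a table, modify its rows, then query what remains."""
    conn = sqlite3.connect(":memory:")
    # Create: define the structure of the data.
    conn.execute(
        "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)"
    )
    # Insert: add rows.
    conn.executemany(
        "INSERT INTO students (id, name, grade) VALUES (?, ?, ?)",
        [(1, "Asha", 82), (2, "Ben", 74), (3, "Chen", 91)],
    )
    # Modify: update one row and delete another.
    conn.execute("UPDATE students SET grade = 78 WHERE id = 2")
    conn.execute("DELETE FROM students WHERE id = 3")
    conn.commit()
    # Query: retrieve the current contents.
    rows = conn.execute(
        "SELECT id, name, grade FROM students ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```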
Key Features of DBMS
• Data storage and retrieval: A DBMS is responsible for
storing and retrieving data from the database, and can
provide various methods for searching and querying
the data.
• Concurrency control: A DBMS provides mechanisms
for controlling concurrent access to the database, to
ensure that multiple users can access the data without
conflicting with each other.
• Data integrity and security: A DBMS provides tools for
enforcing data integrity and security constraints, such
as constraints on the values of data and access controls
that restrict who can access the data.
• Backup and recovery: A DBMS provides mechanisms
for backing up and recovering the data in the event of a
system failure.
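Two of the features above, integrity constraints and backup, can be sketched with Python's built-in sqlite3 module. The `accounts` schema is an assumption for illustration; `Connection.backup` is the standard-library call (Python 3.7 and later) and here copies one in-memory database into another.

```python
import sqlite3


def make_accounts_db():
    """Create a table whose CHECK constraint forbids negative balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, "
        "owner TEXT NOT NULL, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO accounts VALUES (1, 'Asha', 500)")
    conn.commit()
    return conn


def try_invalid_insert(conn):
    """Return True if the DBMS rejected a row that violates the constraint."""
    try:
        conn.execute("INSERT INTO accounts VALUES (2, 'Ben', -10)")
    except sqlite3.IntegrityError:
        conn.rollback()  # discard the failed transaction
        return True
    conn.commit()
    return False


def backup_copy(conn):
    """Copy the whole database into a second in-memory database."""
    target = sqlite3.connect(":memory:")
    conn.backup(target)
    return target
```

The constraint is enforced by the DBMS itself, so every application that writes to the table gets the same check.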
• A DBMS can be classified into two types: Relational
Database Management System (RDBMS) and Non-Relational
Database Management System (NoSQL or Non-SQL).
• RDBMS: Data is organized in tables, and each table
has a set of rows and columns. Tables are related to
each other through primary and foreign keys.
• NoSQL: Data is organized as key-value pairs,
documents, graphs, or column families. These systems
are designed to handle large-scale, high-performance
scenarios.
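The difference between the two models can be sketched by storing the same information both ways. The relational half uses sqlite3 with a primary and a foreign key; the document half uses a plain Python dict as a stand-in for a key-value or document store, which is an assumption for illustration and not a real NoSQL client.

```python
import sqlite3


def relational_orders():
    """Relational model: two tables linked by a foreign key, joined at query time."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(id), "
        "item TEXT)"
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(10, 1, "laptop"), (11, 1, "mouse")],
    )
    rows = conn.execute(
        "SELECT c.name, o.item FROM customers c "
        "JOIN orders o ON o.customer_id = c.id ORDER BY o.id"
    ).fetchall()
    conn.close()
    return rows


def document_orders():
    """Document model: the customer and its orders live in one nested value."""
    store = {}  # hypothetical key-value store
    store["customer:1"] = {
        "name": "Asha",
        "orders": [{"id": 10, "item": "laptop"}, {"id": 11, "item": "mouse"}],
    }
    doc = store["customer:1"]
    return [(doc["name"], order["item"]) for order in doc["orders"]]
```

Both functions return the same pairs; the relational version assembles them with a join, while the document version reads them from a single key.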
Properties of DBMS
• A transaction is a single logical unit of work that accesses and possibly modifies the contents of a
database. Transactions access data using read and write operations. In order to maintain
consistency in a database, before and after the transaction, certain properties are followed. These
are called ACID properties.
Atomicity: A transaction is either completely successful or completely unsuccessful. There is no
partial commit. Each transaction is considered as one unit and either runs to completion or is not
executed at all. It involves the following two operations.
Abort: If a transaction aborts, changes made to the database are not visible.
Commit: If a transaction commits, changes made are visible.
Atomicity is also known as the ‘All or nothing rule’.
Consistency: The database remains in a consistent state before and after a transaction is
committed. It refers to the correctness of a database.
Isolation: Transactions are isolated from each other, so that the changes made by one transaction
are not visible to other transactions until it is committed. This property ensures that
executing transactions concurrently results in a state that is equivalent to a state achieved
if they were executed serially in some order.
Durability: Once a transaction is committed, its changes are permanent and cannot be rolled back.
This property ensures that once the transaction has completed execution, its updates and
modifications are written to non-volatile storage (disk) and persist even if a system
failure occurs. The effects of the transaction, thus, are never lost.
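Atomicity and consistency can be sketched with a money transfer in Python's built-in sqlite3 module. The `accounts` table, the names, and the balances are hypothetical. The credit is applied before the debit on purpose, so that a failed debit shows the earlier credit being rolled back as well.

```python
import sqlite3


def open_bank():
    """Create two accounts; the CHECK constraint defines a consistent state."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, source, target, amount):
    """Move amount between accounts as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # abort: neither update is visible


def balances(conn):
    """Return the current balance of every account as a dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer that would overdraw the source account aborts, and both balances stay exactly as they were before it started, which is the all-or-nothing rule.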
Data warehouse Architecture
• A data warehouse is a heterogeneous collection of different data sources organized under a
unified schema (visual or diagrammatical representation). There are two approaches to
constructing a data warehouse: the top-down approach and the bottom-up approach.
1. Top-down approach:
• External Sources –
External source is a source from where data is collected irrespective of the type of
data. Data can be structured, semi structured and unstructured as well.
• Staging Area –
Because the data extracted from the external sources does not follow a particular
format, it must be validated before it is loaded into the data warehouse. An ETL tool
is recommended for this purpose.
– E (Extract): Data is extracted from the external data sources.
– T (Transform): Data is transformed into the standard format.
– L (Load): Data is loaded into the data warehouse after being transformed into the
standard format.
• Data Warehouse –
After cleansing, the data is stored in the data warehouse, which acts as the central repository.
It holds the metadata along with the data itself; the data marts receive subsets of that data.
• Note that in the top-down approach the data warehouse stores the data in its purest
form.
• Data Marts –
A data mart is also part of the storage component. It stores the information
of a particular function of an organization that is handled by a single
authority. An organization can have as many data marts as it has
functions. A data mart contains a subset of the data stored in the
data warehouse.
• Data Mining –
Data mining is the practice of analyzing the large volumes of data held in
the data warehouse. It uses data mining algorithms to find hidden patterns
in the database or data warehouse.
• This approach is defined by Inmon: the data warehouse is a central
repository for the complete organization, and data marts are created from
it after the complete data warehouse has been created.
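The top-down flow described above (extract, transform, load into the warehouse, then derive a data mart from it) can be sketched as follows. The raw records, their formatting problems, and the table names are all assumptions for illustration; a real ETL tool would do far more validation.

```python
import sqlite3

# Hypothetical raw records from external sources with inconsistent formats.
RAW_RECORDS = [
    {"dept": "Sales", "amount": "1,200.50", "date": "2024/01/05"},
    {"dept": "sales ", "amount": "300", "date": "2024-01-09"},
    {"dept": "HR", "amount": "75.25", "date": "2024-01-11"},
    {"dept": "HR", "amount": "n/a", "date": "2024-01-12"},  # fails validation
]


def extract():
    """E: pull the raw records from the external sources."""
    return list(RAW_RECORDS)


def transform(records):
    """T: validate and convert each record to one standard format."""
    clean = []
    for rec in records:
        try:
            amount = float(rec["amount"].replace(",", ""))
        except ValueError:
            continue  # drop records whose amount cannot be parsed
        dept = rec["dept"].strip().lower()
        day = rec["date"].replace("/", "-")
        clean.append((dept, amount, day))
    return clean


def load_warehouse(rows):
    """L: load the standardized rows into the central warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE warehouse_facts (dept TEXT, amount REAL, day TEXT)")
    conn.executemany("INSERT INTO warehouse_facts VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def build_data_mart(conn, dept):
    """Derive a data mart: the subset of warehouse rows for one function."""
    conn.execute("CREATE TABLE data_mart (amount REAL, day TEXT)")
    conn.execute(
        "INSERT INTO data_mart "
        "SELECT amount, day FROM warehouse_facts WHERE dept = ?",
        (dept,),
    )
    conn.commit()
    return conn.execute("SELECT amount, day FROM data_mart ORDER BY day").fetchall()
```

The order of the calls mirrors the approach: the warehouse is fully loaded first, and the mart is created from it afterwards.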
2. Bottom-up approach:
• First, the data is extracted from external sources (as in the
top-down approach).
• Then, the data goes through the staging area (as explained above) and
is loaded into data marts instead of the data warehouse.
• A data mart is a data storage system that contains information
specific to an organization's business unit. It contains a small and
selected part of the data that the company stores in a larger storage
system.
• These data marts are then integrated into the data warehouse.
• This approach is given by Kimball: data marts are created first
and provide a thin view for analysis, and the data warehouse is created
after the complete data marts have been created.
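For contrast with the top-down flow, the bottom-up order can be sketched like this: the data marts are loaded first, and the warehouse is assembled from them afterwards. The mart names, the `unit` column, and the sample rows are assumptions for illustration.

```python
import sqlite3


def build_marts():
    """Load staged data straight into per-unit data marts (created first)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mart_sales (day TEXT, amount REAL)")
    conn.execute("CREATE TABLE mart_hr (day TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO mart_sales VALUES (?, ?)",
        [("2024-01-05", 1200.5), ("2024-01-09", 300.0)],
    )
    conn.executemany("INSERT INTO mart_hr VALUES (?, ?)", [("2024-01-11", 75.25)])
    conn.commit()
    return conn


def integrate_into_warehouse(conn):
    """Integrate the existing marts into one warehouse table, tagged by unit."""
    conn.execute("CREATE TABLE warehouse (unit TEXT, day TEXT, amount REAL)")
    conn.execute("INSERT INTO warehouse SELECT 'sales', day, amount FROM mart_sales")
    conn.execute("INSERT INTO warehouse SELECT 'hr', day, amount FROM mart_hr")
    conn.commit()
    # A cross-unit query becomes possible only after integration.
    return conn.execute(
        "SELECT unit, SUM(amount) FROM warehouse GROUP BY unit ORDER BY unit"
    ).fetchall()
```

Each mart is usable on its own before the warehouse exists, which is the main practical difference from the top-down approach.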
| Category | OLAP (Online Analytical Processing) | OLTP (Online Transaction Processing) |
| --- | --- | --- |
| Definition | Known as an online database query management system. | Known as an online database modifying system. |
| Data source | Consists of historical data from various databases. | Consists of only current operational data. |
| Method used | Makes use of a data warehouse. | Makes use of a standard database management system (DBMS). |
| Application | Subject-oriented. Used for data mining, analytics, decision-making, etc. | Application-oriented. Used for business tasks. |
| Normalized | Tables are not normalized. | Tables are normalized. |
| Usage of data | Used in planning, problem-solving, and decision-making. | Used to perform day-to-day fundamental operations. |
| Task | Provides a multi-dimensional view of different business tasks. | Reveals a snapshot of present business tasks. |
| Purpose | Extracts information for analysis and decision-making. | Inserts, updates, and deletes information in the database. |
| Volume of data | Large, typically TB to PB. | Relatively small, typically MB to GB, because historical data is archived. |
| Queries | Relatively slow because the amount of data involved is large. Queries may take hours. | Very fast because each query touches only a small fraction (around 5%) of the data. |
| Update and integrity | Not often updated, so data integrity is unaffected. | The data integrity constraint must be maintained. |
| Backup and recovery | Needs backup only from time to time compared to OLTP. | The backup and recovery process is maintained rigorously. |
| Processing time | Processing complex queries can take a long time. | Comparatively fast because queries are simple and straightforward. |
| Types of users | Generally used by executives such as the CEO, MD, and GM. | Used by clerks and managers. |
| Operations | Mostly read, rarely write operations. | Both read and write operations. |
| Update method | Data is refreshed on a regular basis with lengthy, scheduled batch operations. | The user initiates data updates, which are brief and quick. |
| Nature of audience | The process is focused on the market. | The process is focused on the customer. |
| Database design | Designed with a focus on the subject. | Designed with a focus on the application. |
| Productivity | Improves the efficiency of business analysts. | Enhances the user's productivity. |
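The contrast between the two workloads can be sketched with two queries over the same data: a short OLTP-style update of one row, and an OLAP-style aggregation across two dimensions. Running both against one SQLite table is a simplification for illustration; the `orders` schema and rows are hypothetical, and in practice the OLAP query would run against a separate warehouse.

```python
import sqlite3


def make_orders_db():
    """Create a small orders table with region and year dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            (1, "north", 2023, 100.0),
            (2, "north", 2024, 150.0),
            (3, "south", 2023, 80.0),
            (4, "south", 2024, 120.0),
        ],
    )
    conn.commit()
    return conn


def oltp_update(conn, order_id, new_amount):
    """OLTP style: a brief, user-initiated write that touches a single row."""
    with conn:
        conn.execute(
            "UPDATE orders SET amount = ? WHERE id = ?", (new_amount, order_id)
        )
    row = conn.execute(
        "SELECT amount FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    return row[0]


def olap_summary(conn):
    """OLAP style: a read-only aggregate over every row, by region and year."""
    return conn.execute(
        "SELECT region, year, SUM(amount) FROM orders "
        "GROUP BY region, year ORDER BY region, year"
    ).fetchall()
```

The update finds its row through the primary key, while the summary has to scan the whole table, which is why OLAP queries slow down as the volume of data grows.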
data warehousing need and characteristics. types of data w data warehouse architecture
