Data mining is the discovery of new information, in the form of patterns or rules,
from vast amounts of data. To be useful, data mining must be carried out efficiently on large
files and databases. For example, neural networks or other mathematical algorithms can be
used to mine and analyze the data. The information extracted in this way can increase
productivity and efficiency. Social networks such as Facebook, LinkedIn, and Twitter are an
example: their users are themselves the data, and mining that data turns it into a valuable
business resource.
Goals of Data Mining
• Prediction: Determine how certain attributes will behave in the future. For example,
how much sales volume a store will generate in a given period.
• Identification: Identify patterns in data. For example, newly wed couples tend to
spend more money buying furniture.
• Classification: Partition data into classes. For example, customers can be classified
into different categories with different shopping behavior.
E.g., customers in a supermarket can be categorized into discount-seeking shoppers,
shoppers in a rush, loyal regular shoppers, and infrequent shoppers.
• Optimization: Optimize the use of limited resources such as time, space, money, or
materials. For example, how best to use advertising to maximize profits (sales).
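The prediction goal can be illustrated with a small sketch. The notes do not name a technique, so the least-squares trend line below is only one possible choice, and the monthly sales figures are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: sales volume of one store for months 1..5.
months = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # predicted sales volume for month 6
print(f"forecast for month 6: {forecast:.1f}")
```

A real forecast would usually account for seasonality and noise; the straight line is used here only because it keeps the idea of "predicting an attribute from past data" visible.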
Types of Knowledge Discovered during Data Mining
• Association rules: For example, when a male shopper buys a new car, he is likely to
buy a car CD.
• Classification hierarchies: For example, mutual funds may be classified into three
categories: growth, income, and stable. In a banking application, customers applying for
a credit card can be classified as poor risk, fair risk, or good risk.
• Sequence patterns: Sequence patterns are temporal associations. For example, if the
mortgage interest rate drops, then within a six-month period the sales of houses will
increase by a certain percentage.
• Patterns within time series: For example, the behavior of stock price data over time.
• Detection of similarity, or segmentation (clustering): A population of events or
items can be partitioned into sets of similar elements. For example, health data may
indicate similarity among subgroups of people.
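Association rules such as "car implies car CD" are commonly scored by support and confidence. The sketch below computes both measures on a handful of made-up baskets; the transactions and item names are hypothetical and not taken from any real data set.

```python
# Hypothetical toy transactions; each set is one shopper's basket.
transactions = [
    {"car", "car CD"},
    {"car", "car CD", "floor mats"},
    {"car"},
    {"floor mats"},
    {"car", "floor mats"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


rule_support = support({"car", "car CD"}, transactions)
rule_confidence = confidence({"car"}, {"car CD"}, transactions)
print(f"car -> car CD: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule is usually kept only if both values exceed thresholds chosen by the analyst; algorithms such as Apriori exist to avoid scoring every possible itemset.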
1 http://sumanastani.com.np
Applications of Data Mining
• Marketing
• Finance
• Manufacturing
• Health Care
Commercial Data Mining Tools
Intelligent Miner from IBM applies classification and association rules to detect rules and
patterns and to make predictions.
Enterprise Miner from SAS applies decision trees, neural nets, clustering techniques, statistics,
and association rules.
Many new tools have come onto the market in recent years, making data mining a very
active research and development area.
What is 'Data Warehousing'
Data warehousing is the electronic storage of a large amount of information by a business to
help in future decision making. Warehoused data must be stored in a manner that is secure,
reliable, easy to retrieve, and easy to manage.
A data warehouse is a:
• subject-oriented
• integrated
• time-varying
• non-volatile
collection of data in support of management's decision-making process.
A data warehouse is a centralized repository that stores data from multiple
information sources and transforms them into a common, multidimensional data
model for efficient querying and analysis.
DATA WAREHOUSE VS DATABASE
Database
1. A database is a collection of data organized in some way.
2. Used for Online Transactional Processing (OLTP), which includes insert, delete, update, and
other queries, but can be used for other purposes such as data warehousing. It records
the data from the user for history.
3. The tables and joins are complex since they are normalized (for an RDBMS). This is done to
reduce redundant data and to save storage space.
4. Database design: Entity-Relationship modeling techniques are used for RDBMS database
design.
5. Optimized for write operations.
6. Performance is low for analytical queries.
7. Data are volatile: they change frequently.
Data Warehouse
1. A data warehouse is a collection of data that facilitates reporting and analysis
for future decisions.
2. Used for Online Analytical Processing (OLAP). It reads the historical data so that
users can make business decisions.
3. The tables and joins are simple since they are de-normalized. This is done to reduce the
response time for analytical queries.
4. Database design: data modeling techniques (typically dimensional modeling) are used for
data warehouse design.
5. Optimized for read operations.
6. High performance for analytical queries.
7. Data are non-volatile: they change less often.
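The difference in point 3 (normalized tables with joins versus a de-normalized table) can be sketched with Python's built-in sqlite3 module. The schema, table names, and rows below are hypothetical and chosen only to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# OLTP-style normalized tables (hypothetical schema and rows).
cur.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE orders   (id INTEGER PRIMARY KEY,
                       customer_id INTEGER REFERENCES customer(id),
                       product_id  INTEGER REFERENCES product(id),
                       amount REAL);
INSERT INTO customer VALUES (1, 'Asha', 'Kathmandu'), (2, 'Bikash', 'Pokhara');
INSERT INTO product  VALUES (1, 'Sofa', 'Furniture'), (2, 'Lamp', 'Lighting');
INSERT INTO orders   VALUES (1, 1, 1, 500.0), (2, 2, 1, 450.0), (3, 1, 2, 40.0);
""")

# Warehouse-style de-normalized table: the joins are paid once, at load time.
cur.execute("""
CREATE TABLE sales_wide AS
SELECT o.id AS order_id, c.name AS customer, c.city,
       p.name AS product, p.category, o.amount
FROM orders o
JOIN customer c ON c.id = o.customer_id
JOIN product  p ON p.id = o.product_id
""")

# The same analytical question asked of both designs.
normalized = cur.execute("""
SELECT p.category, SUM(o.amount)
FROM orders o JOIN product p ON p.id = o.product_id
GROUP BY p.category ORDER BY p.category
""").fetchall()

denormalized = cur.execute("""
SELECT category, SUM(amount)
FROM sales_wide
GROUP BY category ORDER BY category
""").fetchall()

print(normalized)
print(denormalized)
conn.close()
```

Both queries return the same totals; the de-normalized query simply needs no join, at the cost of repeating customer and product attributes in every row.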
Characteristics
Subject-oriented: A data warehouse can be used to analyze a particular subject area. For
example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources. For example,
source A and source B may have different ways of identifying a product, but in a data
warehouse there will be only a single way of identifying a product.
Data from several sources is extracted and transformed in a consistent way. For
example, coding conventions are standardized: M = male, F = female.
Time-varying: Historical data is kept in a data warehouse. For example, one can retrieve data
from 3 months, 6 months, or 12 months ago, or even older data, from a data warehouse. This
contrasts with a transaction system, where often only the most recent data is kept. For
example, a transaction system may hold the most recent address of a customer, whereas a data
warehouse can hold all addresses associated with a customer.
Data are organized by various time periods (e.g., months).
Non-volatile: Once data is in the data warehouse, it will not change. Historical data in a
data warehouse should therefore never be altered.
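As a minimal sketch of this integration step, the code below maps two differently coded sources onto the single M/F convention. The source layouts, field names, and code values are invented for illustration.

```python
# Hypothetical source records; field names and codes are invented.
source_a = [{"cust": "C1", "sex": "male"}, {"cust": "C2", "sex": "female"}]
source_b = [{"cust": "C3", "gender": 1}, {"cust": "C4", "gender": 0}]

# One mapping per source onto the warehouse convention: M = male, F = female.
GENDER_MAP_A = {"male": "M", "female": "F"}
GENDER_MAP_B = {1: "M", 0: "F"}


def integrate(rows_a, rows_b):
    """Transform both sources into one consistent warehouse layout."""
    unified = []
    for row in rows_a:
        unified.append({"customer_id": row["cust"],
                        "gender": GENDER_MAP_A[row["sex"]]})
    for row in rows_b:
        unified.append({"customer_id": row["cust"],
                        "gender": GENDER_MAP_B[row["gender"]]})
    return unified


warehouse_rows = integrate(source_a, source_b)
for row in warehouse_rows:
    print(row)
```

An unknown code raises a KeyError here, which is deliberate in this sketch: silently loading an unmapped value would break the single-convention guarantee.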
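The address example can be sketched as two small classes: a transaction system that overwrites the current value, and a warehouse that only appends dated rows. The class and method names are hypothetical and the data is made up.

```python
from datetime import date


class TransactionSystem:
    """Keeps only the latest address per customer (overwrites in place)."""

    def __init__(self):
        self.address = {}

    def update(self, customer, address, when):
        # The previous address is lost; 'when' is not even stored.
        self.address[customer] = address


class Warehouse:
    """Append-only history: rows are added with a date and never altered."""

    def __init__(self):
        self._rows = []

    def load(self, customer, address, when):
        self._rows.append((customer, address, when))

    def addresses(self, customer):
        return [(a, w) for c, a, w in self._rows if c == customer]

    def address_as_of(self, customer, when):
        """Latest address recorded on or before the given date, if any."""
        known = [(w, a) for c, a, w in self._rows if c == customer and w <= when]
        return max(known)[1] if known else None


oltp = TransactionSystem()
dw = Warehouse()
moves = [("Kathmandu", date(2023, 1, 10)), ("Pokhara", date(2024, 6, 1))]
for address, when in moves:
    oltp.update("C1", address, when)
    dw.load("C1", address, when)

print(oltp.address["C1"])                          # current address only
print(dw.addresses("C1"))                          # full history
print(dw.address_as_of("C1", date(2023, 12, 31)))  # address at a past date
```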
Other characteristics:
1. Client-server architecture
2. Transparency
3. Flexible reporting
4. Multi-user support
Functions of a Data Warehouse (RDSSSD)
1. Roll up: Data are summarized with increasing generalization, e.g. weekly => monthly => annually.
2. Drill down: The complement (opposite) of roll up; increasing levels of detail are revealed.
3. Pivot: Cross tabulation (rotation) is performed.
4. Slice and dice: Projection operations are performed on the dimensions.
5. Sorting: Data are sorted in some order (ascending or descending).
6. Selection: Data are made available by value or by range.
7. Derived (computed) attributes: Attributes are computed by operations on stored and derived
values.
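The operations above can be sketched on a tiny in-memory fact table. The rows, dimension names, and helper functions are hypothetical, and the slice and dice helpers follow the common reading of those terms (fix one dimension to a value; restrict several dimensions to allowed values).

```python
from collections import defaultdict

# Hypothetical fact rows: (year, month, region, product, sales).
facts = [
    (2024, 1, "East", "Sofa", 100),
    (2024, 1, "West", "Sofa", 80),
    (2024, 2, "East", "Lamp", 30),
    (2024, 2, "West", "Sofa", 90),
    (2025, 1, "East", "Lamp", 40),
]
DIMS = {"year": 0, "month": 1, "region": 2, "product": 3}


def aggregate(rows, dims):
    """Sum sales grouped by the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMS[d]] for d in dims)
        totals[key] += row[4]
    return dict(totals)


def slice_rows(rows, dim, value):
    """Slice: keep rows where one dimension has a single fixed value."""
    return [r for r in rows if r[DIMS[dim]] == value]


def dice_rows(rows, **allowed):
    """Dice: keep rows whose dimensions fall in the allowed value sets."""
    return [r for r in rows
            if all(r[DIMS[d]] in values for d, values in allowed.items())]


def pivot(rows, row_dim, col_dim):
    """Pivot: cross-tabulate total sales as {row_value: {col_value: total}}."""
    table = defaultdict(dict)
    for (r, c), total in aggregate(rows, [row_dim, col_dim]).items():
        table[r][c] = total
    return dict(table)


monthly = aggregate(facts, ["year", "month"])   # detailed (drill-down) level
yearly = aggregate(facts, ["year"])             # roll up: month => year
east = slice_rows(facts, "region", "East")
sofa_2024 = dice_rows(facts, year={2024}, product={"Sofa"})
by_region_product = pivot(facts, "region", "product")
by_sales_desc = sorted(facts, key=lambda r: r[4], reverse=True)  # sorting

print(yearly)
print(by_region_product)
```

Drill down is simply the move from `yearly` back to `monthly`; both are the same aggregation at different levels of the time dimension.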

Data mining vs data warehouse
