T.DEEPIKA
MSC(INFO-TECH)
NADAR SARASWATHI COLLEGE OF ARTS AND
SCIENCE
DATA MINING (DEFINITION)
• Data mining is the process of sorting through
large data sets to identify patterns and establish
relationships to solve problems through data
analysis. Data mining tools allow enterprises to
predict future trends.
• The term "data mining" is in fact a misnomer,
because the goal is the extraction of patterns
and knowledge from large amounts of data, not
the extraction (mining) of data itself.
• Data mining is an interdisciplinary subfield
of computer science and statistics whose
overall goal is to extract information (with
intelligent methods) from a data set and
transform it into a comprehensible structure
for further use. Data mining is the analysis
step of the "knowledge discovery in databases"
(KDD) process.
• Aside from the raw analysis step, data mining
also involves database and data management
aspects, data pre-processing, model and
inference considerations, interestingness
metrics, complexity considerations,
post-processing of discovered structures,
visualization, and online updating.
• Data analysis and data mining differ in focus.
Data analysis summarizes what has already
happened, such as measuring the effectiveness
of a marketing campaign. Data mining, in
contrast, uses machine learning and
statistical models to predict future outcomes
and discover patterns in the data.
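
The contrast between summarizing history and predicting from it can be made concrete with a small sketch. The campaign figures below are invented for illustration, and the least-squares line is only one simple stand-in for the "machine learning and statistical models" the slide mentions; the function names are hypothetical.

```python
from statistics import mean

# Hypothetical campaign history: (ad spend, units sold) per month.
history = [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0), (4.0, 18.0)]


def summarize(records):
    """Data analysis: describe what has already happened."""
    spend = [s for s, _ in records]
    sales = [u for _, u in records]
    return {"mean_spend": mean(spend), "mean_sales": mean(sales)}


def fit_line(records):
    """Data mining in miniature: fit sales = slope * spend + intercept
    by ordinary least squares."""
    xs = [s for s, _ in records]
    ys = [u for _, u in records]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, spend):
    """Apply the fitted line to a spend level not seen in the history."""
    slope, intercept = model
    return slope * spend + intercept


summary = summarize(history)     # looks backward
model = fit_line(history)        # learns a pattern
forecast = predict(model, 5.0)   # looks forward
```

Here `summary` only restates the past, while `model` captures a relationship that can be applied to new inputs through `predict`.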
Knowledge Discovery in Databases (KDD)
• Knowledge discovery in databases (KDD) is the
process of discovering useful knowledge from a
collection of data. This widely used process
includes data selection and preparation, data
cleansing, incorporating prior knowledge about
the data sets, and accurately interpreting the
observed results.
• Major KDD application areas include marketing,
fraud detection, telecommunications, and
manufacturing.
• Traditionally, data mining and knowledge discovery
were performed manually. As time passed, the amount
of data in many systems grew beyond terabyte size
and could no longer be handled manually. Moreover,
discovering underlying patterns in data came to be
considered essential to the success of any business.
As a result, several software tools were developed
to discover hidden patterns and draw inferences from
data, and these tools became a part of artificial
intelligence.
• The KDD process has matured considerably in
recent years. It now houses many different
approaches to discovery, including inductive
learning, Bayesian statistics, semantic query
optimization, knowledge acquisition for expert
systems, and information theory. The ultimate
goal is to extract high-level knowledge from
low-level data.
PROCESS OF KDD:
STEPS IN KDD:
STAGES IN KDD:
• The overall process of finding and interpreting
patterns from data involves the repeated application of
the following steps:
• Developing an understanding of:
  - the application domain
  - the relevant prior knowledge
  - the goals of the end user
• Creating a target data set: selecting a data set, or
focusing on a subset of variables or data samples, on
which discovery is to be performed.
• Data cleaning and preprocessing:
  - removing noise or outliers
  - collecting the information needed to model or account
    for noise
  - deciding on strategies for handling missing data fields
  - accounting for time-sequence information and known
    changes
• Data reduction and projection:
  - finding useful features to represent the data, depending
    on the goal of the task
  - using dimensionality reduction or transformation
    methods to reduce the effective number of variables
    under consideration, or to find invariant representations
    for the data
• Choosing the data mining task:
  - deciding whether the goal of the KDD process is
    classification, regression, clustering, etc.
• Choosing the data mining algorithm(s):
  - selecting the method(s) to be used for searching for
    patterns in the data
  - deciding which models and parameters may be
    appropriate
  - matching a particular data mining method with the
    overall criteria of the KDD process
• Data mining:
  - searching for patterns of interest in a particular
    representational form, or a set of such representations,
    such as classification rules or trees, regression,
    clustering, and so forth
• Interpreting mined patterns.
• Consolidating discovered knowledge.
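
The stages listed above can be walked through on a toy data set. The following is a minimal sketch, not a reference implementation: the records, column names, the median-based imputation, the outlier cutoff, and the single-threshold rule learner ("decision stump") are all assumptions chosen for illustration, and the reported accuracy is measured on the training data only.

```python
from statistics import median

# Hypothetical raw records: (age, income, flag, label). None marks a missing field.
raw = [
    (25, 30.0, 1, "no"),
    (30, 35.0, 1, "no"),
    (35, 40.0, 1, "no"),
    (45, 80.0, 1, "yes"),
    (50, None, 1, "yes"),
    (55, 90.0, 1, "yes"),
    (40, 900.0, 1, "no"),  # implausible income, treated as an outlier
]
NAMES = ("age", "income", "flag")


def select_target(records):
    """Creating a target data set: separate features from the label."""
    return [list(r[:3]) for r in records], [r[3] for r in records]


def clean(features, labels, col=1, max_ratio=5.0):
    """Data cleaning: fill missing values in `col` with the median and
    drop rows whose value exceeds max_ratio times that median."""
    med = median(row[col] for row in features if row[col] is not None)
    kept_x, kept_y = [], []
    for row, label in zip(features, labels):
        row = list(row)
        if row[col] is None:
            row[col] = med
        if row[col] <= max_ratio * med:
            kept_x.append(row)
            kept_y.append(label)
    return kept_x, kept_y


def reduce_features(features):
    """Data reduction: keep only columns whose values actually vary."""
    keep = [j for j in range(len(features[0]))
            if len({row[j] for row in features}) > 1]
    return [[row[j] for j in keep] for row in features], keep


def fit_stump(features, labels, positive="yes"):
    """Data mining: search every column and midpoint threshold for the
    rule 'value > threshold means positive' with the most training hits."""
    best = None
    for j in range(len(features[0])):
        values = sorted({row[j] for row in features})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            hits = sum((row[j] > t) == (label == positive)
                       for row, label in zip(features, labels))
            if best is None or hits > best[0]:
                best = (hits, j, t)
    hits, j, t = best
    return {"column": j, "threshold": t, "accuracy": hits / len(labels)}


def interpret(rule, keep, names=NAMES):
    """Interpreting mined patterns: state the rule in readable form."""
    return f"{names[keep[rule['column']]]} > {rule['threshold']} -> yes"


features, labels = select_target(raw)
clean_x, clean_y = clean(features, labels)
reduced_x, keep = reduce_features(clean_x)
rule = fit_stump(reduced_x, clean_y)
description = interpret(rule, keep)
```

Each function corresponds to one stage, so a stage can be revisited and rerun on its own, which mirrors the "repeated application" of steps described in the slides. In practice the task and algorithm would be chosen to match the goals identified in the first stage rather than fixed in advance as they are here.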
THANK YOU
