www.studymafia.org
Seminar on Data Mining
Content
Data Mining
Data Mining Definition
Data Mining – Two Main Components
Data Mining vs. Data Analysis
What is (not) Data Mining?
Related Fields
Data Mining Process
Major Data Mining Tasks
Uses of Data Mining
Sources of Data for Mining
Challenges of Data Mining
Advantages
Conclusion
Reference
Data Mining
New buzzword, old idea.
Inferring new information from already collected data.
Traditionally the job of data analysts.
Computers have changed this.
It is far more efficient to comb through data using a machine than to eyeball statistical data.
Data Mining Definition
Data mining (knowledge discovery in data) is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.
Data Mining vs. Data Analysis
In terms of software and the marketing thereof, Data Mining != Data Analysis.
Data Mining implies that the software uses some intelligence, beyond simple grouping and partitioning of data, to infer new information.
Data Analysis is more in line with standard statistical software (e.g. web stats). Such tools usually present information about subsets and relations within the recorded data set (e.g. browser/search engine usage, average visit time, etc.).
What is (not) Data Mining?
What is not Data Mining?
• Look up a phone number in a phone directory
• Query a Web search engine for information about “Amazon”
What is Data Mining?
• Certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly… in the Boston area)
• Group together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)
Data Mining Techniques
Classification
Clustering
Regression
Association Rules
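As an illustration of the first technique, classification, here is a minimal nearest-neighbour sketch in Python. It is only one of many possible classifiers and is not taken from the slides; the customer features, the segment labels, and the function name are hypothetical, chosen for the example.

```python
import math


def nearest_neighbor_classify(training, point):
    """Return the label of the training example closest to `point`.

    `training` is a list of (features, label) pairs; closeness is
    measured with plain Euclidean distance (an assumption of this sketch).
    """
    _, closest_label = min(
        training, key=lambda example: math.dist(example[0], point)
    )
    return closest_label


# Hypothetical training data:
# (visits per month, average basket value) -> customer segment
training = [
    ((1, 20.0), "occasional"),
    ((2, 35.0), "occasional"),
    ((8, 40.0), "regular"),
    ((10, 55.0), "regular"),
]

# Classify a new, unlabeled customer from its nearest labeled neighbour.
print(nearest_neighbor_classify(training, (9, 50.0)))
```

The features are not rescaled here, so the attribute with the larger numeric range dominates the distance; a practical classifier would usually normalise them first.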
Why Mine Data? Scientific Viewpoint
⚫ Data collected and stored at enormous speeds (GB/hour)
o remote sensors on a satellite
o telescopes scanning the skies
o microarrays generating gene expression data
o scientific simulations generating terabytes of data
⚫ Traditional techniques infeasible for raw data
⚫ Data mining may help scientists
o in classifying and segmenting data
o in hypothesis formation
Data Mining Architecture
Related Fields
Data Mining and Knowledge Discovery is related to:
Statistics
Machine Learning
Databases
Visualization
Data Mining Process
[Figure: data mining process diagram. Labels: Data Warehouse, Raw Data, Target Data, Transformed Data, Data Mining, Patterns and Rules, Knowledge]
Major Data Mining Tasks
Classification: predicting an item class
Associations: e.g. A & B & C occur frequently
Visualization: to facilitate human discovery
Estimation: predicting a continuous value
Deviation Detection: finding changes
Link Analysis: finding relationships...
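The Associations task ("A & B & C occur frequently") can be sketched as counting how often item sets appear together. The Python below is a brute-force illustration under assumed data, not an optimised algorithm such as Apriori; the transactions, the function names, and the 0.5 support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, size, min_support):
    """Map each itemset of `size` items to its support (the fraction of
    transactions containing it), keeping only those >= min_support."""
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each itemset has one canonical tuple.
        for itemset in combinations(sorted(set(basket)), size):
            counts[itemset] += 1
    total = len(transactions)
    return {
        itemset: count / total
        for itemset, count in counts.items()
        if count / total >= min_support
    }


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing `antecedent` that also
    contain `consequent` (the confidence of antecedent -> consequent)."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        return 0.0
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


# Hypothetical market-basket data.
transactions = [
    ["A", "B", "C"],
    ["A", "B", "C", "D"],
    ["A", "C"],
    ["B", "C", "D"],
    ["A", "B", "C"],
]

triples = frequent_itemsets(transactions, size=3, min_support=0.5)
print(triples)  # only (A, B, C) appears in at least half the baskets
print(confidence(transactions, {"A", "B"}, {"C"}))
```

Enumerating every combination grows quickly with basket size, which is why practical tools prune candidate item sets instead of counting all of them.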
Uses of Data Mining
AI/Machine Learning
Good for analyzing winning strategies to games, and thus developing intelligent AI opponents (e.g. chess).
Business Strategies
Identify customer demographics, preferences, and purchasing patterns.
Risk Analysis
Analyze product defect rates for given plants and predict possible complications (read: lawsuits) down the line.
Uses of Data Mining (Cont.)
User Behavior Validation
In the realm of cell phones: comparing phone activity to calling records can help detect calls made on cloned phones.
Similarly, with credit cards: comparing purchases with historical purchases can detect activity with stolen cards.
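Comparing a new purchase with historical purchases can be sketched as a simple deviation check. The Python below flags an amount that lies far from the cardholder's past mean, measured in standard deviations; the purchase history, the function name, and the threshold of 3 are assumptions made for illustration, and real fraud detection uses far richer signals.

```python
from statistics import mean, stdev


def is_unusual(history, amount, threshold=3.0):
    """Return True if `amount` deviates from the historical mean by more
    than `threshold` sample standard deviations.

    `history` needs at least two past amounts, since the sample
    standard deviation is undefined for fewer.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past purchases were identical: any different amount stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical purchase history for one card.
history = [23.5, 41.0, 18.2, 36.9, 27.4, 30.1]

print(is_unusual(history, 32.0))   # in line with past purchases
print(is_unusual(history, 950.0))  # far outside the usual range
```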
Uses of Data Mining (Cont.)
Health and Science
Predicting protein interactions and functionality within biological cells. Applications of this research include determining causes and possible cures for Alzheimer's, Parkinson's, and some cancers (caused by protein "misfolds").
Scanning radio telescope data for possible transmissions from other planets.
For more information see Stanford’s Folding@home and UC Berkeley’s SETI@home projects. Both involve participation in a widely distributed computer application.
Sources of Data for Mining
Databases (most obvious)
Text Documents
Computer Simulations
Social Networks
Advantages of Data Mining
Marketing / Retail
Finance / Banking
Manufacturing
Governments
Challenges of Data Mining
Scalability
Dimensionality
Complex and Heterogeneous Data
Data Quality
Data Ownership and Distribution
Privacy Preservation
Streaming Data
Conclusion
Comprehensive data warehouses that integrate operational data with customer, supplier, and market information have resulted in an explosion of information.
Competition requires timely and sophisticated analysis on an integrated view of the data.
However, there is a growing gap between more powerful storage and retrieval systems and the users’ ability to effectively analyze and act on the information they contain.
Reference
www.google.com
www.wikipedia.com
www.studymafia.org
Thanks
Queries?
