STATISTICS VS DATA MINING
@andrybrew
OBJECTIVE (FOR BOTH)
Both are used for data analysis, but they are different tools

The role of statistics is to describe a dataset more or less efficiently,
while the role of DM is to model in order to predict, simulate, and optimize
STATISTICS FACTS
A well-established, centuries-old scientific methodology

No scope for heuristic thinking

Uses a sample to generalize conclusions about the population (hypothesis testing, p-values, etc.), so each
generalization needs a stated confidence level

Proposes a theory first, then tests it with statistical tools

Deals with structured data in order to solve structured problems; results are software/researcher independent,
and inference reflects statistical hypothesis testing

Knowledge is not hidden: we can observe it directly. Statistics proves our observation
(hypothesis) scientifically, so that the community will accept the hypothesis.

Concerned with data collection

Has problems when too little data is available, and is unable to uncover knowledge from complex data (interactions)
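
The sample-to-population point above (hypothesis testing, p-values, confidence level) can be sketched in a few lines of Python. This is only an illustrative sketch, not something taken from the slides: the population, the sample size of 200, the hypothesized mean of 50, and the function names are all assumptions, and a normal (z) approximation is used instead of a t-distribution to stay within the standard library.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, lower, upper) for the population mean, normal approximation."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * std_err, mean + z * std_err


def two_sided_p_value(sample, hypothesized_mean):
    """p-value for H0: population mean == hypothesized_mean (z-test approximation)."""
    n = len(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    z = (statistics.fmean(sample) - hypothesized_mean) / std_err
    return 2 * (1 - statistics.NormalDist().cdf(abs(z)))


# Hypothetical population; in practice only the sample would be observed
random.seed(0)
population = [random.gauss(50, 10) for _ in range(100_000)]
sample = random.sample(population, 200)

mean, low, high = mean_confidence_interval(sample)
p_value = two_sided_p_value(sample, hypothesized_mean=50)
print(f"sample mean = {mean:.2f}, 95% CI = ({low:.2f}, {high:.2f}), p = {p_value:.3f}")
```

The theory (the hypothesized mean) is stated first and the sample is then used to test it, which matches the theory-first workflow described in this slide.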
DATA MINING FACTS
Emerged only recently, with the availability of large-volume and complex data

Makes generous use of heuristic thinking

Used on the population (or on very large data) to find patterns in the data

Digs through the data to find patterns, and then builds theories

Deals with structured data in order to solve unstructured problems; results are software/researcher dependent,
and inference reflects the computational properties of the data mining algorithm at hand. Accurate prediction is
more desirable than explanation

An exploratory tool: we have no prior idea of the knowledge hidden in the data, and it lets us discover that
invisible knowledge

Less concerned with data collection

No problem with data size; able to uncover knowledge from complex data (relations) that is also difficult to
observe directly
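
The pattern-first workflow above (dig through the data, find patterns, then build theories) can be illustrated with a small frequent-pair count over non-numeric data. This is a brute-force sketch under assumed inputs, not a method named in the slides: the shopping-basket transactions, the `frequent_pairs` helper, and the 0.6 support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support=0.5):
    """Return {pair: support} for item pairs found in at least min_support of transactions."""
    counts = Counter()
    for basket in transactions:
        # Sort and de-duplicate so each pair is counted once per transaction
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical transactions; no hypothesis is stated before looking at the data
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]

patterns = frequent_pairs(transactions, min_support=0.6)
for pair, support in sorted(patterns.items()):
    print(pair, f"support = {support:.1f}")
```

Any pair that survives the threshold is a candidate pattern; explaining why it occurs would be the later, theory-building step.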
COMPARISON
STATISTICS                      DATA MINING
Confirmatory                    Exploratory
Small data set                  Large data set
Small number of variables       Large number of variables
Deductive (no predictions)      Inductive
Numeric data                    Numeric and non-numeric data
Clean data                      Data cleaning
source from slideshare.net
DATA SCIENCE
Data science is the study of the
generalizable extraction of knowledge
from data,[1] yet the key word is science.
[2] It incorporates varying elements and
builds on techniques and theories from
many fields, including signal processing,
mathematics, probability models, machine
learning, statistical learning, computer
programming, data engineering, pattern
recognition and learning, visualization,
uncertainty modeling, data warehousing,
and high performance computing with the
goal of extracting meaning from data and
creating data products.
CONCLUSION
The availability of large-volume data sets should encourage business and
social science (as well as other sciences) to use DM tools

Business and social science (as well as other sciences) need to
use more DM tools, because DM can be used to model,
predict, and optimize phenomena

DM/KDD/Data Science are increasingly used as a standard
for decision making in modern business
