Dataset Analysis
Presented By
Nazmul Hyder
ID : 011 131 085
Section : SB
Contents
❑ Dataset Names
❑ Classifiers
❑ Dataset Description
❑ Dataset Analysis
❑ Graphical Representation
❑ References
Dataset Names
❏ Mushroom.
❏ Wine-Quality.
❏ Flags.
❏ ZOO.
Classifiers
❏ kNN
❏ NBC
❏ Decision Tree (J48)
❏ OneR
❏ Random Forest
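OneR is the simplest classifier in this list: it builds one rule per attribute (each attribute value predicts its most frequent class) and keeps the attribute whose rule makes the fewest training errors. A minimal sketch for nominal attributes follows. The function names and the toy data are hypothetical, and this is not Weka's implementation, which also handles numeric attributes.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Fit a OneR model: returns (attribute index, value -> class rule, default class)."""
    best = None
    for a in range(len(rows[0])):
        # Count class frequencies for every value of attribute a.
        counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            counts[row[a]][y] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[a]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, a, rule)
    default = Counter(labels).most_common(1)[0][0]  # used for unseen values
    return best[1], best[2], default

def predict(model, row):
    attr, rule, default = model
    return rule.get(row[attr], default)

# Hypothetical toy data: (outlook, temperature) -> play
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
model = one_r(rows, labels)
print(model[0], predict(model, ("rainy", "hot")))
```

Because the whole model depends on a single attribute, OneR is a useful baseline next to kNN, NBC, J48 and Random Forest.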
Dataset Description
Dataset name  No. of instances  No. of attributes  Attribute type  Class values   Date donated  Donor
Mushroom      8124              22                 nominal         2              1987          Jeff Schlimmer
Wine-Quality  1599              12                 numeric         6 (nominal)    2009          Paulo Cortez, Antonio Cerdeira, Fernando Almeida
Flags         194               30                 nominal         194 (nominal)  1990          Richard S. Forsyth
ZOO           101               17                 nominal         8 (nominal)    1990          Richard S. Forsyth
Dataset Analysis:
Mushroom - Cross-validation (10 folds)
Classifier     Accuracy  Error Rate  Recall  Precision  F-score
kNN (k=3%)     59.6135%  40.3865%    0.596   0.576      0.583
NBC            64.5126%  35.4874%    0.645   0.769      0.665
J48            61.9645%  38.0355%    0.620   0.629      0.623
OneR           57.9025%  42.0975%    0.579   0.411      0.469
Random Forest  47.3043%  52.6957%    0.473   0.476      0.474
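The columns of these tables are related: the error rate is 100% minus the accuracy, and the Recall column matches the accuracy expressed as a fraction. That is what one would expect if Recall, Precision and F-score are support-weighted averages over the classes, which is assumed here rather than stated on the slides. The sketch below computes all five columns from a confusion matrix; the function name and the matrix values are hypothetical.

```python
from typing import Dict, List

def weighted_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """cm[i][j] = number of instances of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    precision = recall = f_score = 0.0
    for i in range(n):
        support = sum(cm[i])                        # instances of true class i
        predicted = sum(cm[r][i] for r in range(n))  # instances predicted as i
        p = cm[i][i] / predicted if predicted else 0.0
        r = cm[i][i] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                    # weight by class frequency
        precision += weight * p
        recall += weight * r
        f_score += weight * f
    accuracy = 100.0 * correct / total
    return {
        "accuracy": accuracy,
        "error_rate": 100.0 - accuracy,
        "recall": recall,
        "precision": precision,
        "f_score": f_score,
    }

# Hypothetical two-class confusion matrix (not taken from the slides).
m = weighted_metrics([[40, 10], [20, 30]])
print({name: round(value, 3) for name, value in m.items()})
```

Note that the weighted F-score is an average of per-class F-scores, so it need not equal the harmonic mean of the weighted Precision and Recall columns.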
Wine-Quality - Cross-validation (10 folds)
Classifier     Accuracy  Error Rate  Recall  Precision  F-score
kNN (k=3%)     57.7236%  42.2764%    0.577   0.542      0.553
NBC            55.0344%  44.9656%    0.550   0.554      0.550
J48            61.4759%  38.5241%    0.615   0.612      0.613
OneR           54.6592%  45.3408%    0.547   0.496      0.511
Random Forest  70.1063%  29.8937%    0.701   0.679      0.684
Flags - Cross-validation (10 folds)
Classifier     Accuracy  Error Rate  Recall  Precision  F-score
kNN (k=3%)     59.2784%  40.7216%    0.593   0.553      0.550
NBC            55.1546%  44.8454%    0.552   0.571      0.542
J48            59.2784%  40.7216%    0.593   0.570      0.576
OneR           4.6392%   95.3608%    0.046   0.002      0.004
Random Forest  61.3402%  38.6598%    0.613   0.545      0.572
ZOO - Cross-validation (10 folds)
Classifier     Accuracy  Error Rate  Recall  Precision  F-score
kNN (k=3%)     94.1176%  5.8824%     0.941   0.935      0.931
NBC            95.098%   4.902%      0.951   0.953      0.950
J48            92.1569%  7.8431%     0.922   0.916      0.915
OneR           2.9412%   97.0588%    0.029   0.039      0.026
Random Forest  92.1569%  7.8431%     0.922   0.874      0.896
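All four tables report 10-fold cross-validation: the instances are split into 10 folds, each fold serves once as the test set while the other nine are used for training, and the predictions from all folds are pooled. The sketch below shows only the index splitting, using the 101 ZOO instances as the example size. It is a plain shuffled split under assumed names (`k_fold_indices`, `seed`); Weka's own procedure may differ, for example by stratifying the folds by class.

```python
import random
from typing import Iterator, List, Tuple

def k_fold_indices(n: int, k: int = 10, seed: int = 1) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i, test in enumerate(folds):
        # Training set = every fold except the i-th one.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

splits = list(k_fold_indices(101))  # 101 = number of ZOO instances
for train, test in splits:
    print(len(train), len(test))
```

Every instance is tested exactly once, so the accuracy in each table is the share of correct predictions over the whole dataset.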
Classifier result comparison:
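The comparison can be recomputed from the accuracy columns of the four tables. The sketch below uses those accuracies rounded to two decimals and reports the best classifier per dataset and the mean accuracy per classifier. The variable names are illustrative, and an unweighted mean over datasets is only one possible summary.

```python
CLASSIFIERS = ["kNN", "NBC", "J48", "OneR", "Random Forest"]

# Accuracy (%) per dataset, in CLASSIFIERS order, rounded from the tables.
ACCURACY = {
    "Mushroom":     [59.61, 64.51, 61.96, 57.90, 47.30],
    "Wine-Quality": [57.72, 55.03, 61.48, 54.66, 70.11],
    "Flags":        [59.28, 55.15, 59.28, 4.64, 61.34],
    "ZOO":          [94.12, 95.10, 92.16, 2.94, 92.16],
}

# Classifier with the highest accuracy on each dataset.
best_per_dataset = {
    name: CLASSIFIERS[max(range(len(scores)), key=scores.__getitem__)]
    for name, scores in ACCURACY.items()
}

# Unweighted mean accuracy of each classifier over the four datasets.
mean_accuracy = {
    clf: sum(scores[i] for scores in ACCURACY.values()) / len(ACCURACY)
    for i, clf in enumerate(CLASSIFIERS)
}

for name, clf in best_per_dataset.items():
    print(f"{name}: best = {clf}")
for clf, acc in sorted(mean_accuracy.items(), key=lambda kv: -kv[1]):
    print(f"{clf}: mean accuracy = {acc:.2f}%")
```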
References:
Quick Links:
Mushroom: https://archive.ics.uci.edu/ml/datasets/mushroom
Wine Quality: https://archive.ics.uci.edu/ml/datasets/wine+quality
Flags: https://archive.ics.uci.edu/ml/datasets/Flags
ZOO: http://archive.ics.uci.edu/ml/datasets/Zoo
URL: http://archive.ics.uci.edu/ml/datasets.html
Thank You

Dataset Analysis using Weka tools (pattern recognition)
