Confusion Matrix
CONTENTS
BINARY CLASSIFIER
CONFUSION MATRIX
ACCURACY
PRECISION
RECALL
Binary Classifier
 A binary classifier produces output with two class values or labels, such as Yes/No, 1/0, or Positive/Negative, for the given input data
 The dataset used for performance evaluation is called the test dataset
 After classification, the observed (true) labels are compared with the predicted labels to evaluate performance
 If the classifier were perfect, the predicted labels would match the observed labels exactly
 In practice, however, it is uncommon to be able to develop a perfect classifier
Confusion Matrix
 A confusion matrix is formed from the four outcomes
produced as a result of binary classification
 True positive (TP): correct positive prediction
 False positive (FP): incorrect positive prediction
 True negative (TN): correct negative prediction
 False negative (FN): incorrect negative prediction
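As a rough illustration (not part of the original slides), the four counts can be tallied directly from a list of observed and predicted labels; the label values below are made up:

# Tally TP, FP, TN, FN for a binary classifier (illustrative labels, not from the slides).
actual    = ["Yes", "Yes", "No", "No", "Yes", "No"]
predicted = ["Yes", "No",  "No", "Yes", "Yes", "No"]

TP = sum(a == "Yes" and p == "Yes" for a, p in zip(actual, predicted))  # correct positive
FP = sum(a == "No"  and p == "Yes" for a, p in zip(actual, predicted))  # incorrect positive
TN = sum(a == "No"  and p == "No"  for a, p in zip(actual, predicted))  # correct negative
FN = sum(a == "Yes" and p == "No"  for a, p in zip(actual, predicted))  # incorrect negative

print(TP, FP, TN, FN)  # 2 1 2 1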
Confusion Matrix
 Example: a classifier whose two levels (class labels) are Green and Grey
 True Positives: Green examples correctly identified as green
 True Negatives: Grey examples correctly identified as grey
 False Positives: Grey examples falsely identified as green
 False Negatives: Green examples falsely identified as grey
Accuracy
 Accuracy is calculated as the number of correct predictions divided by the total number of examples in the dataset
 The best accuracy (ACC) is 1.0, whereas the worst is 0.0
 Accuracy = (TP + TN) / (TP + FP + FN + TN) = (9+8) / (9+2+1+8) = 0.85
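The slide's arithmetic can be checked in a few lines of Python; reading the counts as TP = 9, FP = 2, FN = 1, TN = 8 is inferred from the formulas on this and the next two slides:

# Accuracy = (TP + TN) / (TP + FP + FN + TN), using the counts implied by the slides.
TP, FP, FN, TN = 9, 2, 1, 8
accuracy = (TP + TN) / (TP + FP + FN + TN)
print(accuracy)  # 0.85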
Precision
 Precision is calculated as the number of correct positive predictions
divided by the total number of positive predictions
 The best precision is 1.0, whereas the worst is 0.0
 Precision = TP / (TP + FP) = 9 / (9+2) = 0.818
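The same check in Python, with the counts assumed above:

# Precision = TP / (TP + FP)
TP, FP = 9, 2
precision = TP / (TP + FP)
print(round(precision, 3))  # 0.818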
Recall
 Sensitivity = Recall = True Positive Rate
 Recall is calculated as the number of correct positive predictions divided by the total number of actual positives
 The best recall is 1.0, whereas the worst is 0.0
 Recall = TP / (TP + FN) = 9 / (9+1) = 0.9
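And the corresponding check for recall:

# Recall (sensitivity, true positive rate) = TP / (TP + FN)
TP, FN = 9, 1
recall = TP / (TP + FN)
print(recall)  # 0.9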
Example 1
 Task: classify whether an image contains a dog or a cat
 The available training data contains 25000 images of dogs and cats, which is split as follows:
 Training split: 75% of the 25000 images (25000*0.75 = 18750)
 Validation split: 25% of the 25000 images (25000*0.25 = 6250)
 Test data: 5 cats and 5 dogs
 Taking one class (say, cat) as the positive class, the test results give TP = 2, FP = 0, FN = 3, TN = 5 (see the sketch below):
 Precision = 2 / (2 + 0) * 100% = 100%
 Recall = 2 / (2 + 3) * 100% = 40%
 Accuracy = (2 + 5) / (2 + 0 + 3 + 5) * 100% = 70%
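A small sketch reproducing Example 1's numbers; treating cat as the positive class is an assumption, since the slide does not say which class is positive:

# Example 1 metrics from the implied counts (positive class assumed to be "cat").
TP, FP, FN, TN = 2, 0, 3, 5
precision = TP / (TP + FP)                    # 1.0  -> 100%
recall    = TP / (TP + FN)                    # 0.4  -> 40%
accuracy  = (TP + TN) / (TP + FP + FN + TN)   # 0.7  -> 70%
print(precision, recall, accuracy)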
Matrix 3x3
 For a multi-class problem the confusion matrix becomes 3x3 (or larger), and TP, FP, TN, and FN are counted per class: the diagonal cell of a class is its TP, the rest of its column is FP, the rest of its row is FN, and all remaining cells are TN (see the sketch below)
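A sketch of how the four counts are read off a 3x3 matrix; the matrix values below are invented purely for illustration (rows are actual classes, columns are predicted classes):

# Per-class TP, FP, FN, TN from a 3x3 confusion matrix (illustrative values).
matrix = [
    [30,  5,  5],   # actual class 0
    [10, 40, 10],   # actual class 1
    [ 5, 10, 35],   # actual class 2
]

def per_class_counts(m, k):
    """Return (TP, FP, FN, TN) for class index k."""
    total = sum(sum(row) for row in m)
    tp = m[k][k]                                   # diagonal cell for class k
    fp = sum(m[i][k] for i in range(len(m))) - tp  # rest of column k
    fn = sum(m[k]) - tp                            # rest of row k
    tn = total - tp - fp - fn                      # everything else
    return tp, fp, fn, tn

print(per_class_counts(matrix, 0))  # (30, 15, 10, 95)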
Example 2
 Overall accuracy = 170/300 ≈ 0.567
 Per-class precision = 0.5, 0.5, 0.667; macro-averaged precision ≈ 0.556
 Per-class recall = 0.3, 0.6, 0.8; macro-averaged recall ≈ 0.567 (see the check below)
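Only the averaging arithmetic is re-checked here, since the full 3x3 matrix behind Example 2 is not reproduced in the text; the per-class values are taken as listed above:

# Macro-averaged precision and recall for Example 2.
precisions = [0.5, 0.5, 0.667]
recalls    = [0.3, 0.6, 0.8]

macro_precision = sum(precisions) / len(precisions)
macro_recall    = sum(recalls) / len(recalls)

print(round(macro_precision, 3))  # 0.556
print(round(macro_recall, 3))     # 0.567
print(round(170 / 300, 3))        # overall accuracy: 0.567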
Thank you.
