Topics to be covered
◈ ROC Curves
◈ AUC (area under the ROC curve)
◈ Feature engineering
◈ Confusion Matrix
ROC Curves
◈ A receiver operating characteristic
(ROC) curve is a graphical plot that
illustrates the diagnostic ability of a
binary classifier as its discrimination
threshold is varied.
◈ A ROC curve is also a way to compare
diagnostic tests. It plots the true
positive rate (sensitivity) against the
false positive rate (1 - specificity).
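To make the threshold sweep concrete, here is a minimal sketch in plain Python. The function name `roc_points` and the toy labels and scores are illustrative assumptions, not part of the original slides; the sketch also assumes both classes are present in `y_true`.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs,
    one per threshold, from the strictest threshold to the loosest."""
    positives = sum(y_true)               # labels are 0 or 1
    negatives = len(y_true) - positives   # assumes both classes occur
    # A threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical toy data: true labels and classifier scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
print(points)
```

Each distinct score is used as a cut point, so lowering the threshold moves the curve from (0, 0) towards (1, 1).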
What does a ROC plot show?
◈ The trade-off between sensitivity and
specificity. For example, as the threshold
is raised, a decrease in sensitivity comes
with an increase in specificity.
◈ Test accuracy: the closer the graph is to the
top and left-hand borders, the more
accurate the test. Likewise, the closer the
graph is to the diagonal, the less accurate
the test.
◈ A perfect test would go straight up from
the origin to the top-left corner and then
straight across the top of the plot.
◈ The likelihood ratio, given by the slope
(derivative) of the curve at any particular
cut point.
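The sensitivity/specificity trade-off can be checked on a small example. This is only a sketch under assumed toy data; the helper name `sensitivity_specificity` and the two thresholds are hypothetical choices for illustration.

```python
def sensitivity_specificity(y_true, scores, threshold):
    """Classify score >= threshold as positive and return
    (sensitivity, specificity) for that cut point."""
    tp = fn = tn = fp = 0
    for actual, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    # Assumes at least one positive and one negative label.
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: true labels and classifier scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
low = sensitivity_specificity(y_true, scores, 0.3)   # lenient cut point
high = sensitivity_specificity(y_true, scores, 0.5)  # strict cut point
print(low, high)
```

On this data, raising the threshold from 0.3 to 0.5 lowers sensitivity and raises specificity, which is the movement along the ROC curve described above.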
AUC (Area Under the Curve):
◈ As the name indicates, AUC is the area under
the curve, calculated in ROC space.
◈ One easy way to calculate the AUC score is
the trapezoidal rule, which adds up the areas
of all trapezoids under the curve.
◈ The theoretical range of the AUC score is
0 to 1, but meaningful classifiers score
above 0.5, which is the AUC score of a
random classifier.
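The trapezoidal rule can be sketched in a few lines of plain Python. The function name `auc_trapezoid` and the sample ROC points are assumptions made for illustration; the points must be sorted by increasing false positive rate.

```python
def auc_trapezoid(points):
    """Area under a ROC curve given as (fpr, tpr) pairs sorted by fpr,
    computed by summing the trapezoid between each adjacent pair."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Trapezoid area: width times the mean of the two heights.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points for a small classifier.
roc = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
auc = auc_trapezoid(roc)

# Reference curves: the diagonal (random) and the perfect test.
auc_random = auc_trapezoid([(0.0, 0.0), (1.0, 1.0)])
auc_perfect = auc_trapezoid([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
print(auc, auc_random, auc_perfect)
```

The diagonal gives 0.5 and the perfect curve gives 1, matching the range discussed above.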
“More data beats clever algorithms, but the
better data beats the more data.”
--Peter Norvig
“…some machine learning projects succeed and
some fail. What makes the difference? Easily
the most important factor is the features used.”
--Pedro Domingos
Feature Engineering
◈ The most creative aspect of data science.
◈ Treat it like any other creative endeavor,
such as writing a comedy show:
◈ Hold brainstorming sessions
◈ Create templates / formulas
◈ Check/revisit what worked before
Confusion matrix
◈ A common way to describe the performance
of a classification model: a table of the
counts of true positives, true negatives,
false positives, and false negatives.
◈ It is called a confusion matrix because it
shows how confused the model is between
the classes.
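As a rough illustration, the four cells of a binary confusion matrix can be counted directly from actual and predicted labels. The function name `confusion_counts` and the sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["TP"] += 1   # positive correctly flagged
        elif actual == 0 and predicted == 0:
            counts["TN"] += 1   # negative correctly rejected
        elif actual == 0 and predicted == 1:
            counts["FP"] += 1   # negative wrongly flagged
        else:
            counts["FN"] += 1   # positive missed
    return counts


# Hypothetical actual and predicted labels.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
```

The off-diagonal cells (FP and FN) are where the model confuses one class for the other.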
References
◈ https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Basic_concept
◈ https://towardsdatascience.com/feature-engineering-for-machine-learning-3a5e293a5114


r_concepts
