Logistic Regression
Legal Notices and Disclaimers
This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES,
EXPRESS OR IMPLIED, IN THIS SUMMARY.
Intel technologies’ features and benefits depend on system configuration and may require
enabled hardware, software or service activation. Performance varies depending on system
configuration. Check with your system manufacturer or retailer or learn more at intel.com.
This sample source code is released under the Intel Sample Source Code License Agreement.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2017, Intel Corporation. All rights reserved.
Introduction to Logistic Regression
[Figure: patient status after five years (survived vs. lost) plotted against the number of positive nodes]
Linear Regression for Classification?
[Figure: patient status (survived vs. lost) vs. number of positive nodes, with a fitted regression line]
$y_\beta(x) = \beta_0 + \beta_1 x + \varepsilon$
Linear Regression for Classification?
Encode the labels numerically: survived = 0.0, lost = 1.0.
$y_\beta(x) = \beta_0 + \beta_1 x + \varepsilon$
If the model result > 0.5: predict lost.
If the model result < 0.5: predict survived.
[Figure: the fitted line crosses 0.5 at the decision threshold; points to its left are predicted 0 (survived), points to its right are predicted 1 (lost)]
What is this Function?
[Figure: S-shaped curve rising from 0.0 to 1.0 as x runs from −10 to 10]
$y = \frac{1}{1 + e^{-x}}$
This is the logistic (sigmoid) function: it maps any real input to a value between 0 and 1.
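To make the curve concrete, here is a minimal sketch (NumPy assumed; the sample points are illustrative, not from the deck) that evaluates the logistic function at a few values from the plot's range:

import numpy as np

def sigmoid(x):
    # Logistic (sigmoid) function: maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

for x in [-10, -5, 0, 5, 10]:
    print(f"x = {x:>3}: y = {sigmoid(x):.4f}")
# x = -10 gives ~0.0000, x = 0 gives exactly 0.5, x = 10 gives ~1.0000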
The Decision Boundary
[Figure: logistic curve fitted to the survived (0.0) / lost (1.0) data; the decision boundary is where the curve crosses 0.5]
$y_\beta(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x + \varepsilon)}}$
Logistic Regression
[Figure: the same logistic fit, now labeled as the logistic regression model]
$y_\beta(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x + \varepsilon)}}$
Relationship of Logistic to Linear Regression
Logistic function:
$P(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x + \varepsilon)}}$
Equivalently:
$P(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$
Relationship of Logistic to Linear Regression
Logistic function:
$P(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$
Odds ratio:
$\frac{P(x)}{1 - P(x)} = e^{\beta_0 + \beta_1 x}$
Log odds:
$\log\left(\frac{P(x)}{1 - P(x)}\right) = \beta_0 + \beta_1 x$
The log odds are a linear function of x: the right-hand side is exactly the linear regression model.
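The linearity of the log odds is easy to verify numerically; a minimal sketch (NumPy assumed, coefficients are illustrative):

import numpy as np

beta0, beta1 = -1.0, 0.5                           # illustrative coefficients
x = np.linspace(-5, 5, 11)
p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))     # logistic function P(x)
log_odds = np.log(p / (1.0 - p))                   # log odds
print(np.allclose(log_odds, beta0 + beta1 * x))    # True: log odds are linear in x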
Classification with Logistic Regression
One feature (nodes); two labels (survived, lost)
[Figure: logistic curve over number of positive nodes with the 0.5 threshold marked]
Classification with Logistic Regression
Two features (nodes, age); two labels (survived, lost)
[Figure: age vs. number of malignant nodes; a linear decision boundary separates the two classes, and a new example is classified by which side of the boundary it falls on]
Multiclass Classification with Logistic Regression
Two features (nodes, age); three labels (survived, complications, lost)
[Figure: age vs. number of malignant nodes with three class regions]
One vs All
Fit one binary classifier per class:
• Survived vs. all
• Complications vs. all
• Lost vs. all
Multiclass Decision Boundary
Assign the most probable class to each region.
[Figure: the three one-vs-all boundaries partition the age/nodes plane into class regions]
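scikit-learn can fit this one-vs-all scheme directly. A minimal sketch, assuming illustrative feature rows of (number of malignant nodes, age) and made-up labels; OneVsRestClassifier is a real sklearn wrapper, but none of the data here comes from the deck itself:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Illustrative data: columns are (number of malignant nodes, age)
X = np.array([[0, 35], [1, 42], [3, 50], [8, 61], [12, 55], [20, 70]])
y = np.array(['survived', 'survived', 'complications', 'complications', 'lost', 'lost'])

# One vs. all: one binary logistic regression per class
clf = OneVsRestClassifier(LogisticRegression()).fit(X, y)
print(clf.predict([[5, 45]]))        # most probable class for a new example
print(clf.predict_proba([[5, 45]]))  # one probability per class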
Logistic Regression: The Syntax
Import the class containing the classification method:
from sklearn.linear_model import LogisticRegression
Create an instance of the class (penalty and C are the regularization parameters; the keyword is a capital C):
LR = LogisticRegression(penalty='l2', C=10.0)
Fit the instance on the data and then predict the expected value:
LR = LR.fit(X_train, y_train)
y_predict = LR.predict(X_test)
Tune regularization parameters with cross-validation: LogisticRegressionCV.
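Putting the pieces together, here is a minimal end-to-end sketch; the synthetic dataset and split are illustrative assumptions, not part of the deck:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV

# Synthetic binary-classification data standing in for the patient example
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

LR = LogisticRegression(penalty='l2', C=10.0)
LR = LR.fit(X_train, y_train)
y_predict = LR.predict(X_test)

# LogisticRegressionCV searches a grid of C values with cross-validation
LR_cv = LogisticRegressionCV(Cs=10, cv=5).fit(X_train, y_train)
print(LR_cv.C_)  # best regularization strength found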
Classification Error Metrics
Choosing the Right Error Measurement
• You are asked to build a classifier for leukemia
• Training data: 1% of patients have leukemia, 99% are healthy
• Measure accuracy: the total % of predictions that are correct
• Build a simple model that always predicts "healthy"
• Accuracy will be 99%... yet the model never identifies a single leukemia patient
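The failure is easy to reproduce; a minimal sketch with a simulated 1%/99% split:

import numpy as np

y_true = np.array([1] * 10 + [0] * 990)  # 1% leukemia (1), 99% healthy (0)
y_pred = np.zeros_like(y_true)           # a model that always predicts "healthy"
print(f"accuracy = {(y_pred == y_true).mean():.0%}")  # 99%, with zero cases detected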
Confusion Matrix

                   Predicted Positive     Predicted Negative
Actual Positive    True Positive (TP)     False Negative (FN)
Actual Negative    False Positive (FP)    True Negative (TN)

False positives are Type I errors; false negatives are Type II errors.
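scikit-learn builds this table directly. One caution worth stating: confusion_matrix orders rows and columns by sorted label value, so with 0/1 labels the layout is [[TN, FP], [FN, TP]] rather than the slide's positive-first arrangement:

from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 0]  # illustrative labels
y_pred = [1, 1, 0, 0, 0, 1, 0]  # illustrative predictions
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, fn, fp, tn)  # 2 true positives, 1 false negative, 1 false positive, 3 true negatives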
Accuracy: Predicting Correctly
Accuracy = (TP + TN) / (TP + FN + FP + TN)
Recall: Identifying All Positive Instances
Recall (Sensitivity) = TP / (TP + FN)
Precision: Identifying Only Positive Instances
Precision = TP / (TP + FP)
Specificity: Avoiding False Alarms
Specificity = TN / (FP + TN)
Error Measurements
Accuracy = (TP + TN) / (TP + FN + FP + TN)
Precision = TP / (TP + FP)
Recall (Sensitivity) = TP / (TP + FN)
Specificity = TN / (FP + TN)
F1 = 2 * (Precision * Recall) / (Precision + Recall)
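A minimal sketch computing all five measures from raw counts (the counts are made up for illustration):

tp, fn, fp, tn = 40, 10, 20, 30  # illustrative counts

accuracy    = (tp + tn) / (tp + fn + fp + tn)
recall      = tp / (tp + fn)               # sensitivity
precision   = tp / (tp + fp)
specificity = tn / (fp + tn)
f1          = 2 * precision * recall / (precision + recall)
print(accuracy, recall, precision, specificity, f1)  # 0.7 0.8 0.666... 0.6 0.727...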
Receiver Operating Characteristic (ROC)
Evaluation of the model at all possible thresholds
[Figure: True Positive Rate (Sensitivity) vs. False Positive Rate (1 – Specificity); the diagonal is a random guess, curves above it are better and below it worse, and a perfect model reaches the top-left corner]
Area Under Curve (AUC)
Measures the total area under the ROC curve
[Figure: example ROC curves with AUC = 0.5 (random), 0.75, and 0.9]
Precision-Recall Curve (PR Curve)
Measures the trade-off between precision and recall
[Figure: precision vs. recall curves for Model 1 and Model 2]
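All three curves come from sweeping a threshold over the model's predicted probabilities. A minimal sketch, reusing the LR model and test split from the hypothetical pipeline sketched earlier:

from sklearn.metrics import roc_curve, roc_auc_score, precision_recall_curve

y_score = LR.predict_proba(X_test)[:, 1]  # probability of the positive class

fpr, tpr, roc_thresholds = roc_curve(y_test, y_score)  # points on the ROC curve
auc = roc_auc_score(y_test, y_score)                   # area under the ROC curve
prec, rec, pr_thresholds = precision_recall_curve(y_test, y_score)
print(f"AUC = {auc:.3f}")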
Multiple Class Error Metrics

                  Predicted Class 1   Predicted Class 2   Predicted Class 3
Actual Class 1    TP1                 incorrect           incorrect
Actual Class 2    incorrect           TP2                 incorrect
Actual Class 3    incorrect           incorrect           TP3

Off-diagonal cells are incorrect classifications.
Accuracy = (TP1 + TP2 + TP3) / Total
Most multiclass error metrics are similar to their binary versions: just expand the elements as a sum.
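The same expansion in code; a minimal sketch with illustrative three-class labels:

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2]  # illustrative labels
y_pred = [0, 1, 1, 1, 2, 2, 2, 0, 2]  # illustrative predictions
cm = confusion_matrix(y_true, y_pred)
print(cm)
print(np.trace(cm) / cm.sum())  # (TP1 + TP2 + TP3) / Total = 6/9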
Classification Error Metrics: The Syntax
Import the desired error function:
from sklearn.metrics import accuracy_score
Calculate the error on the test and predicted data sets:
accuracy_value = accuracy_score(y_test, y_pred)
Lots of other error metrics and diagnostic tools:
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score, confusion_matrix, roc_curve, precision_recall_curve
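For the per-class metrics, multiclass use needs an averaging rule; a brief sketch reusing y_true and y_pred from the three-class example above (the average options shown are standard scikit-learn choices):

from sklearn.metrics import precision_score, recall_score, f1_score

# 'macro' = unweighted mean over classes; 'micro' = pooled global counts;
# 'weighted' = class mean weighted by support
print(precision_score(y_true, y_pred, average='macro'))
print(recall_score(y_true, y_pred, average='micro'))
print(f1_score(y_true, y_pred, average='weighted'))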