The document discusses performance metrics for evaluating outlier detection models: recall, false alarm rate, ROC curves, AUC, and the F1 score. A ROC curve plots the true positive rate (recall) against the false positive rate (the false alarm rate) as the detection threshold varies; AUC, the area under that curve, summarizes how well the model ranks outliers above inliers across all thresholds. Because outliers are rare, the classes are heavily imbalanced, so accuracy can be high even when a model misses most true outliers: a detector that labels everything as normal scores well on accuracy while detecting nothing. The F1 score, the harmonic mean of precision and recall, penalizes such behavior and is therefore a more appropriate metric than accuracy for evaluating outlier detection.
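These metrics can be computed directly from predictions and anomaly scores. Below is a minimal pure-Python sketch (the function names and the toy data are illustrative, not from the document): F1 from a confusion matrix, and AUC via the rank-based formulation, i.e. the probability that a randomly chosen outlier receives a higher score than a randomly chosen inlier.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels, where 1 marks an outlier."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall (0.0 when undefined)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC as the probability that a random outlier outscores a random
    inlier (ties count half); equivalent to the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Imbalanced toy data: 95 inliers, 5 outliers. A detector that predicts
# "normal" for everything reaches 95% accuracy yet has an F1 of 0.
y_true = [0] * 95 + [1] * 5
all_normal = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, all_normal)) / len(y_true)
print(accuracy)                      # high accuracy despite zero detections
print(f1_score(y_true, all_normal))  # F1 exposes the useless detector
```

The rank formulation of AUC avoids building the ROC curve explicitly; for large datasets a threshold-sweep implementation is more efficient, but the O(pos x neg) version above keeps the definition visible.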