2. OUTLINE
1. Introduction
2. Types of ROC Curves
3. Area Under the Curve (AUC) and
Interpretation
4. Practical Example and
Visualization
5. Partial AUC and Its Importance
6. Effect of Thresholds
7. Advantages and Disadvantages of
ROC Curves
8. Potential Issues in ROC Analysis
9. Comparison with Other Methods
3. Introduction
• The ROC curve is a statistical tool used to evaluate
the performance of a binary diagnostic classification
method.
• Many diagnostic tests provide continuous results,
requiring a cut-off value for decision-making.
• ROC analysis helps determine the optimal cut-off
point
4. Introduction
• Purpose of ROC Curves
• Assess the overall diagnostic performance of a test.
• Compare multiple diagnostic tests.
• Determine the optimal cut-off value to balance
sensitivity and specificity.
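One way to make "balancing sensitivity and specificity" concrete is the Youden index, J = sensitivity + specificity - 1, maximized over candidate cut-offs. The slides do not name a criterion, so the choice of Youden's J is an assumption here, and the scores and labels below are made-up illustration data (1 = diseased, 0 = non-diseased, higher score = more suspicious).

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, labels):
    """Return (cutoff, J, sensitivity, specificity) for the cut-off with the largest J."""
    best = None
    for c in sorted(set(scores)):
        se, sp = sens_spec(scores, labels, c)
        j = se + sp - 1
        if best is None or j > best[1]:  # ties keep the lowest cut-off
            best = (c, j, se, sp)
    return best


# Hypothetical test results, for illustration only
scores = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
cutoff, j, se, sp = youden_cutoff(scores, labels)
print(cutoff, round(j, 3), se, sp)
```

Other criteria exist (for example, the point closest to the upper left corner), and the clinically best cut-off may weight sensitivity and specificity unequally.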
5. TYPES OF ROC Curves
Parametric (Binormal) Method
• Forms a smooth curve by expanding sample size and
connecting countless points
• Compare plots at any sensitivity and specificity values
• AUC may be biased if normality assumptions are
incorrect
Assumption
• Normal distribution of data
Nahm, F. S. (2022). Receiver operating characteristic curve : overview
and practical use for clinicians.
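As a rough sketch of the parametric idea, assume test values are normally distributed in both groups. The smooth ROC curve and its AUC then follow directly from the two means and standard deviations. The function names and parameter values below are hypothetical and chosen only for illustration; this is not the fitting procedure of any particular software package.

```python
from statistics import NormalDist


def binormal_roc_point(mu0, sd0, mu1, sd1, cutoff):
    """(FPR, TPR) at one cut-off when both groups are normal.

    Group 0 = non-diseased, group 1 = diseased; value >= cutoff is positive.
    """
    fpr = 1 - NormalDist(mu0, sd0).cdf(cutoff)
    tpr = 1 - NormalDist(mu1, sd1).cdf(cutoff)
    return fpr, tpr


def binormal_auc(mu0, sd0, mu1, sd1):
    """Closed-form AUC under normality: Phi((mu1 - mu0) / sqrt(sd0^2 + sd1^2))."""
    return NormalDist().cdf((mu1 - mu0) / (sd0 ** 2 + sd1 ** 2) ** 0.5)


# Hypothetical group parameters, for illustration only
auc = binormal_auc(mu0=0.0, sd0=1.0, mu1=2.0, sd1=1.0)
curve = [binormal_roc_point(0.0, 1.0, 2.0, 1.0, c / 10) for c in range(-30, 51)]
print(round(auc, 3))
```

If the data are not close to normal, these values can be misleading, which is the bias the slide warns about.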
6. TYPES OF ROC Curves
Non-Parametric (Empirical)
Method
• Produces a jagged or staircase-like
curve.
• Uses all observed data points.
• No normality assumption required,
making it more robust.
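The empirical curve can be sketched in a few lines: every distinct observed value is tried as a cut-off, and each one contributes a (false positive rate, true positive rate) point. The data are made up for illustration, and the convention "score >= cut-off is positive" is an assumption.

```python
def empirical_roc(scores, labels):
    """Return (FPR, TPR) points from (0, 0) to (1, 1), one per distinct cut-off."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # cut-off above every score: nothing is positive
    for c in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= c)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical test results (1 = diseased, 0 = non-diseased)
scores = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
roc_points = empirical_roc(scores, labels)
for fpr, tpr in roc_points:
    print(round(fpr, 2), round(tpr, 2))
```

Connecting these points produces the staircase shape: the curve moves up when a cut-off admits a diseased case and right when it admits a non-diseased one.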
7. Receiver Operating Characteristic (ROC) Curve
Rajeev, K. A. I. (2025). Receiver Operating Characteristic (ROC) Curve for Medical
Researchers.
8. Area under the curve (AUC)
• AUC summarizes (quantifies) the overall accuracy of
a diagnostic test.
• Values range from 0 to 1:
AUC = 1: perfectly accurate test
AUC > 0.8: good to excellent test
AUC = 0.5: no discrimination (random guessing)
AUC < 0.5: worse than random guessing
AUC = 0: perfectly inaccurate test
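For an empirical curve, the AUC is commonly computed with the trapezoidal rule over the (FPR, TPR) points. The sketch below assumes the points are already available; the three example curves are idealized shapes chosen to reproduce the reference values listed above.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve given as (FPR, TPR) points."""
    pts = sorted(points)  # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]    # passes through the upper left corner
chance = [(0.0, 0.0), (1.0, 1.0)]                 # the diagonal
inverted = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]   # always ranks the groups the wrong way

print(auc_trapezoid(perfect), auc_trapezoid(chance), auc_trapezoid(inverted))
```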
9. Interpretation of the Area Under the Curve
Nahm, F. S. (2022). Receiver operating characteristic curve : overview and practical use for clinicians.
13. PARTIAL AUC
• Two ROC curves (tests A and B) with an equal AUC
• Although the AUC is the same, the features
of the ROC curves are not identical.
• Test B shows better performance in the high
false-positive rate range than test A
• Test A is better in the low false-positive
range.
• In this example, the partial AUC (pAUC) can
compare these two ROC curves at a specific
false positive rate range.
Nahm, F. S. (2022). Receiver operating characteristic curve :
overview and practical use for clinicians.
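The situation on this slide can be reproduced with two simplified, piecewise-linear curves. The coordinates below are invented so that both curves have the same total AUC; they are not the curves from the cited figure. The pAUC here is the unstandardized area between two false-positive-rate limits, and the sketch assumes FPR values are strictly increasing along each curve.

```python
def tpr_at(points, x):
    """Linearly interpolate TPR at FPR = x (points sorted by strictly increasing FPR)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("FPR outside the range of the curve")


def partial_auc(points, lo, hi):
    """Trapezoidal area under the curve for FPR between lo and hi."""
    xs = [lo] + [x for x, _ in points if lo < x < hi] + [hi]
    ys = [tpr_at(points, x) for x in xs]
    return sum((x1 - x0) * (y0 + y1) / 2
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


# Hypothetical curves: A rises faster at low FPR, B is higher at high FPR
test_a = [(0.0, 0.0), (0.2, 0.7), (1.0, 1.0)]
test_b = [(0.0, 0.0), (0.3, 0.8), (1.0, 1.0)]

print(round(partial_auc(test_a, 0.0, 1.0), 4), round(partial_auc(test_b, 0.0, 1.0), 4))
print(round(partial_auc(test_a, 0.0, 0.1), 4), round(partial_auc(test_b, 0.0, 0.1), 4))
```

Both curves have the same full AUC, but restricting the range to FPR ≤ 0.10 separates them in favor of test A.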
14. EFFECT OF THRESHOLDS
• For the same diagnostic test, sensitivity and specificity
vary with the threshold used.
• Generally:
• High threshold: good specificity, medium sensitivity
• Medium threshold: medium specificity, medium
sensitivity
• Low threshold: good sensitivity, medium specificity
15. Common interpretations of AUC
i. the average value of sensitivity for all possible values of
specificity
ii. the average value of specificity for all possible values of
sensitivity
iii. the probability that a randomly selected patient with disease
has a test result that indicates greater suspicion than that of a
randomly selected patient without disease, when higher values
of the test are associated with disease and lower values are
associated with non-disease.
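Interpretation iii can be checked directly by comparing every diseased patient with every non-diseased patient and counting how often the diseased one has the higher value. Counting ties as one half is the usual convention and is an assumption here; the values are made up for illustration.

```python
def auc_pairwise(diseased, healthy):
    """Fraction of (diseased, non-diseased) pairs in which the diseased value is higher."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5  # ties count as half
    return wins / (len(diseased) * len(healthy))


# Hypothetical test values
diseased = [3, 5, 6, 7, 8]
healthy = [1, 2, 4]
auc = auc_pairwise(diseased, healthy)
print(round(auc, 4))
```

Here 14 of the 15 possible pairs are ordered correctly, and this is the same value the trapezoidal area under the empirical curve gives for these data.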
16. The ROC curve : advantages.
• Displays all possible cut-off points, so the optimal cut-off can be
read from the curve
• Independent of disease prevalence, unlike predictive values
(PPV, NPV); samples can therefore be taken regardless of the
prevalence of the disease in the population
• Allows comparison of multiple tests in one graph
• Sometimes sensitivity is more important than specificity, or vice
versa; the ROC curve helps in finding the sensitivity achieved at a
fixed value of specificity
• Useful summary measures, such as the AUC and the partial area
under the curve, can be obtained for assessing the validity of a
diagnostic test
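The prevalence point can be illustrated with a short calculation: for a test with fixed sensitivity and specificity (one point on the ROC curve), the predictive values still change with prevalence. The sensitivity, specificity, and prevalence figures below are arbitrary illustration values.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV from sensitivity, specificity, and prevalence (Bayes' rule)."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Same test (sensitivity 0.90, specificity 0.90) in two hypothetical populations
ppv_high, npv_high = predictive_values(0.90, 0.90, prevalence=0.50)
ppv_low, npv_low = predictive_values(0.90, 0.90, prevalence=0.01)
print(round(ppv_high, 3), round(ppv_low, 3))
```

The ROC point is identical in both populations, while the PPV falls sharply when the disease is rare.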
17. The ROC curve : Disadvantages.
• The cut-off value for distinguishing normal from
abnormal is not directly displayed on the ROC curve
and neither is the number of samples.
• Although the ROC curve appears more jagged with a smaller
sample size, a larger sample does not necessarily
result in a smoother curve.
18. Potential ROC issues
• Lack of gold standard for diagnosis
• Lack of reproducibility
–E.g., disagreement among pathologists
• Bias in sample selection, spectrum of disease used in evaluating test
–Choose sickest patients, healthy controls
• Problems in ascertainment
–Genetic disease may not be manifest
• Can’t always reliably measure ROC area (Few cases with disease,
19. Solutions to Potential ROC issues
• Lack of gold standard for diagnosis
–Use a composite reference standard that combines multiple diagnostic criteria:
clinical findings, laboratory tests, imaging results, or expert opinions
• Lack of reproducibility
–Train raters
–Use multiple independent raters and measure agreement (e.g., Cohen’s kappa)
–Implement AI-assisted decision tools to reduce human variability
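As a minimal sketch of measuring agreement between two raters, Cohen's kappa compares the observed agreement with the agreement expected by chance from each rater's marginal frequencies. The ratings below are invented (1 = abnormal, 0 = normal), and the function is undefined when chance agreement equals 1, that is, when both raters always give the same single category.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters who classified the same cases."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    categories = set(rater1) | set(rater2)
    expected = sum(c1[k] * c2[k] for k in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical readings of the same 10 slides by two pathologists
pathologist_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
pathologist_b = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]
kappa = cohens_kappa(pathologist_a, pathologist_b)
print(round(kappa, 2))
```

The raters agree on 8 of 10 cases, but half of that agreement is expected by chance alone, so kappa is lower than the raw agreement.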
20. Solutions to Potential ROC issues
• Bias in sample selection
–Ensure representative sampling
• Problems in ascertainment
–Use longitudinal follow-up to assess test performance over time
–Implement repeat testing to detect cases that initially appear negative
• Can’t always reliably measure ROC area
–Report confidence intervals for the AUC instead of a single point estimate
–Use bootstrapping to estimate the uncertainty of the AUC
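A percentile bootstrap is one simple way to attach a confidence interval to the AUC. The sketch below resamples the diseased and non-diseased groups separately (an assumption; other resampling schemes exist), recomputes the AUC each time, and takes the 2.5th and 97.5th percentiles. The data, the number of resamples, and the seed are arbitrary illustration values.

```python
import random


def auc_pairwise(diseased, healthy):
    """AUC as the fraction of correctly ordered pairs (ties count as half)."""
    wins = sum(1.0 if d > h else 0.5 if d == h else 0.0
               for d in diseased for h in healthy)
    return wins / (len(diseased) * len(healthy))


def bootstrap_auc_ci(diseased, healthy, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    stats = []
    for _ in range(n_boot):
        d = rng.choices(diseased, k=len(diseased))  # resample with replacement
        h = rng.choices(healthy, k=len(healthy))
        stats.append(auc_pairwise(d, h))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical test values
diseased = [3.1, 4.7, 5.2, 5.9, 6.4, 7.0, 7.8, 8.3]
healthy = [1.2, 2.0, 2.9, 3.5, 4.1, 4.9, 5.5]
lower, upper = bootstrap_auc_ci(diseased, healthy)
print(round(auc_pairwise(diseased, healthy), 3), round(lower, 3), round(upper, 3))
```

With few diseased cases the interval is wide, which is exactly the situation in which a single AUC value is least trustworthy.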
21. References
• Nahm, F. S. (2022). Receiver Operating Characteristic Curve:
Overview and Practical Use for Clinicians.
Editor's Notes
#6:For all cut-off values measured from the test results.
The stricter the criteria for determining a positive result, the more points on the curve shift downward and to the left
#7:A point on the ROC : sensitivity/specificity pair for a particular reference value (decision threshold)
A perfect discrimination (no overlap): ROC curve that passes through the upper left corner
Sensitivity = 1, specificity = 1
The overall accuracy of the test is higher as the ROC curve approaches the upper left corner
#9:The 95% CI of the AUC must lie entirely above 0.5 (that is, it must not include 0.5) for the result to be statistically significant
#12:pAUC measures the area under a portion of the curve within a predefined range of false positive rates (FPR) or true positive rates (TPR).
Why Use Partial AUC?
In many real-world scenarios, focusing on the entire AUC may not be ideal. Instead, researchers or clinicians may care more about a specific range of false positives or true positives. pAUC is useful when:
Clinical relevance is limited to a specific range – Some screening tests should maintain a very high specificity (low false positive rate), so we only analyze a portion of the ROC curve.
Comparing models where only a certain range is important – Some classifiers may perform better in a specific region of interest rather than across the entire range.
Highly imbalanced datasets – When the cost of false positives is high, focusing on high-specificity regions (low FPR) can improve model selection.
Example Use Cases
Cancer Screening: If a test should have a high specificity (e.g., FPR ≤ 0.10), pAUC is computed only for FPR between 0 and 0.10.
Medical Diagnostics: Some diagnostic tests (e.g., HIV, tuberculosis) need to perform well mainly in the high-sensitivity range.
#15:3. The Probability That a Randomly Selected Patient with Disease Has a Positive Test Result That Indicates Greater Suspicion Than a Randomly Selected Patient Without Disease
Explanation:
This is the most intuitive interpretation of AUC.
AUC can be defined as the probability that a randomly chosen diseased individual has a test result indicating greater suspicion than a randomly chosen non-diseased individual.
It is derived from the Wilcoxon-Mann-Whitney U statistic, which measures how well the test differentiates between the two groups.
Example:
If a cervical cancer screening test has AUC = 0.85, it means that 85% of the time, a randomly chosen woman with cervical cancer will have a higher test score than a randomly chosen woman without cervical cancer.
If AUC = 0.50, the test is no better than random chance.