© American Academy of Neurology
Diagnostic and Screening Studies
Module 9
© American Academy of Neurology
Objectives
At the completion of this module, participants should be able to:
• Describe the diagnostic process
• Describe these concepts:
– Pre-test and post-test probability
– Sensitivity and specificity
– ROC curve
– Positive and negative predictive values
– Likelihood ratios
• Evaluate the quality of a study that evaluates the
diagnostic accuracy of a test
– Index test & reference standard
– Sources of error in studies of the accuracy of diagnostic tests
© American Academy of Neurology
Screening
• Different from a diagnostic test
– Brings forward the time of diagnosis
– Diagnosis prior to the onset of symptoms
– Casts a broad net to identify those at high risk for disease
– Often applied to a broader population with a lower a priori probability of
disease
• Minimize false negatives even at the cost of some false positives
• Screening tests are used to “rule out” disease (i.e., they have very
high sensitivity), whereas the goal of a diagnostic test is to “rule
in” disease (i.e., high specificity)
© American Academy of Neurology
Screening
• Screening tests are NOT used to diagnose any condition.
• Any positive screening test MUST be followed by a more
SPECIFIC test in order to make a diagnosis
• For example:
– VDRL (screening)
• Very Sensitive
• But many False Positives
– FTA (diagnostic)
© American Academy of Neurology
Diagnostic Methods
• Method of exhaustion
– Inefficient and impractical
• Pattern recognition
– Fails in complex and atypical situations
• Hypothetical-deductive reasoning
– Formulation of a probabilistic differential diagnosis
– Continuous refinement by incorporation of new data
– Requires coping with residual uncertainty
© American Academy of Neurology
Diagnostic Process
1. Estimate a pre-test probability
2. Decide whether a diagnostic test is required
– Has the ‘treatment threshold’ been crossed?
– Has the ‘test threshold’ been crossed?
3. Apply the test
4. Estimate a post-test probability
© American Academy of Neurology
Diagnostic Testing as an ‘Intervention’
[Flow diagram]
Initial Clinical Data: Rapidly Progressive Dementia, ? CJD
→ Assign Pre-Test Probability
→ Test [Brain MRI]
→ Assign Post-Test Probability
→ Positive: Dx = CJD / Negative: Revisit Differential
© American Academy of Neurology
Pre-Test Probability
• The estimated likelihood of a specific diagnosis prior
to application of a diagnostic test
• How do we put a number on it?
• Experience
• Prevalence data
© American Academy of Neurology
Pre-test probability of CJD
• Case A: 80 year-old man with 2-
year history of memory decline
and twitching
• Case B: 60 year-old woman with
1-year history of cognitive
decline and imbalance
• Case C: 45 year-old man with 4-
month history of cognitive
decline, myoclonus, and ataxia
[Scale: probability of CJD increases from low (Case A) through Case B to high (Case C)]
© American Academy of Neurology
Treatment Threshold
Do you need a diagnostic test?
• Has the “treatment threshold” been crossed?
– Is the provisional diagnosis so likely that we can move on
to treatment/management?
– If yes, further testing is not necessary
© American Academy of Neurology
Test Threshold
Do you need a diagnostic test?
• Has the “test threshold” been crossed?
– Is a specific diagnosis deemed so unlikely that we can
comfortably dismiss it?
– If no, then further testing is necessary
© American Academy of Neurology
Post-Test Probability
• Start with a reasonable estimate of pre-test
probability
• Apply an accurate diagnostic test
• Use the combined information from the pre-test
probability and the accuracy of the diagnostic test, to
estimate the post-test probability
• Fagan’s nomogram illustrates this concept
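For readers who want to see the arithmetic that Fagan’s nomogram performs graphically, here is a minimal Python sketch: convert the pre-test probability to odds, multiply by the test result’s likelihood ratio, and convert back to a probability. The numbers are hypothetical and purely illustrative.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Combine a pre-test probability with a test's likelihood ratio (Bayes via odds)."""
    pre_test_odds = pre_test_prob / (1 - pre_test_prob)   # probability -> odds
    post_test_odds = pre_test_odds * likelihood_ratio     # apply the test result's LR
    return post_test_odds / (1 + post_test_odds)          # odds -> probability

# Hypothetical example: 70% pre-test probability, positive result with LR+ = 10
print(round(post_test_probability(0.70, 10), 2))  # 0.96
```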
© American Academy of Neurology
Fagan’s Nomogram
© American Academy of Neurology
Post-Test Decision-Making
• How does the post-test probability influence clinical
decision-making?
– Is the test precise enough?
– Can you be confident of the test interpretation
(agreement)?
– Do you need another test?
– Can you move on to treatment?
© American Academy of Neurology
Development / Evaluation of
Diagnostic Test
PICO Model
– P - patient description
– I - intervention (index/diagnostic test)
– C - comparison (gold or reference standard)
– O - outcome (final diagnosis)
© American Academy of Neurology
P.I.C.O. - CJD as an example
P - Patient with rapidly progressive dementia
I - “Diagnostic Test” or “Index Test”:
–Basal ganglia hyperintensity on brain MRI
C - “Reference Standard” or “Gold Standard”:
–WHO Criteria for CJD
O - Diagnosis of CJD
© American Academy of Neurology
P.I.C.O. - An Answerable Question
In patients with rapidly progressive
dementia, how accurate is assessment of
basal ganglia MRI hyperintensity, compared
with WHO diagnostic criteria, for diagnosis
of Creutzfeldt-Jakob disease (CJD)?
© American Academy of Neurology
Intervention: Index Test
• The test that will be used in clinical practice to
differentiate the condition of interest from some
other state
• Ideal characteristics of the index test
– Accurate (compared with a reference standard)
– Precise
– Available
– Convenient
– Low risk
– Inexpensive
– Reproducible
• Should be independent of the reference standard
© American Academy of Neurology
Reference Standard
• Gold standard or the ‘truth’
• The best available procedure, method or criteria used
to establish the presence or absence of the condition
of interest
• Should be independent of the index test
• Why not just use the reference standard?
– Unavailable (e.g. autopsy)
– Risky (e.g. invasive procedure)
– Expensive (e.g. new technology)
© American Academy of Neurology
Diagnostic Test Metrics
• How do we quantify / measure diagnostic test accuracy?
• Magnitude of the effect
– Sensitivity and Specificity
– Positive & Negative Predictive Values
– Positive & Negative Likelihood Ratios
• Precision
– Confidence Intervals
• Reproducibility
• Most can be derived from our old friend … the 2x2 table
© American Academy of Neurology
2 x 2 Table
                          Reference Standard (“the truth”)
                          Disease        No Disease
Index Test    Positive    TP             FP
              Negative    FN             TN
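As a concrete illustration of how the cell counts translate into the measures defined on the following slides, here is a short Python sketch. The counts are hypothetical, not data from the module.

```python
# Hypothetical 2 x 2 counts: index test vs reference standard
TP, FP, FN, TN = 90, 40, 10, 860

sensitivity = TP / (TP + FN)   # "positivity in disease"
specificity = TN / (TN + FP)   # "negativity in no disease"

print(f"Sensitivity = {sensitivity:.2f}")  # 0.90
print(f"Specificity = {specificity:.2f}")  # 0.96
```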
© American Academy of Neurology
Sensitivity
Sensitivity = “positivity in disease” = TP / (TP + FN)
A negative test that has a high sensitivity (i.e., almost no false
negatives) helps rule out the disease.
[2 x 2 table repeated for reference]
© American Academy of Neurology
Sensitivity: Examples
• CSF oligoclonal banding for MS: 85-90%
• Head CT for detection of acute SAH: 90-95%
• Jolt accentuation of headache in acute
bacterial meningitis: 100%
© American Academy of Neurology
Specificity
Specificity = “negativity in no disease” = TN / (TN + FP)
A positive test that has a high specificity (i.e., almost no false
positives) helps rule in the disease.
[2 x 2 table repeated for reference]
© American Academy of Neurology
Specificity: Examples
• MRI for acute ischemic stroke <3h: 92%
• Acetylcholine receptor antibodies in MG: 99%
• MRI for acute hemorrhagic stroke: 99-100%
• Oculomasticatory myorhythmia in Whipple’s
disease: 100%
© American Academy of Neurology
Receiver Operating Characteristic
(ROC) Curves
• Plot of sensitivity vs (1-
specificity)
• ‘Trade off’ between
sensitivity and specificity
• ‘Trade off’ between true
positives and false positives
• 45° line - test with no
discriminative value
[ROC plot: the 45° diagonal represents a test with no diagnostic information; the upper-left corner represents optimal test accuracy]
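The trade-off the ROC curve depicts can be reproduced numerically by sweeping the cutoff across two overlapping score distributions and recording the true-positive and false-positive rates at each step. A small sketch with made-up index-test values (not data from any real study):

```python
# Hypothetical index-test values in patients with and without the disease
disease_scores    = [2.1, 2.8, 3.0, 3.5, 3.9, 4.2, 4.8, 5.1]
no_disease_scores = [0.8, 1.2, 1.5, 1.9, 2.2, 2.6, 3.1, 3.4]

# Sweep the cutoff from high to low; each cutoff yields one point on the ROC curve
for cutoff in sorted(set(disease_scores + no_disease_scores), reverse=True):
    tpr = sum(s >= cutoff for s in disease_scores) / len(disease_scores)        # sensitivity
    fpr = sum(s >= cutoff for s in no_disease_scores) / len(no_disease_scores)  # 1 - specificity
    print(f"cutoff {cutoff}: TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```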
© American Academy of Neurology
[Figure: overlapping distributions of test values in disease and no disease; a cutoff value separates true positives from false positives]
© American Academy of Neurology
[Figure: cutoff value shifted, reducing false positives at the cost of fewer true positives]
© American Academy of Neurology
[Figure: cutoff value shifted further, again trading fewer false positives for fewer true positives]
© American Academy of Neurology
Positive & Negative Predictive Value
PPV = proportion with disease amongst those with a positive index test
    = TP / (TP + FP)
NPV = proportion with no disease amongst those with a negative index test
    = TN / (TN + FN)
[2 x 2 table repeated for reference]
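Continuing the hypothetical counts from the earlier 2 x 2 sketch, predictive values are read along the rows of the table (by index-test result) rather than down the columns:

```python
# Same hypothetical 2 x 2 counts as before
TP, FP, FN, TN = 90, 40, 10, 860

ppv = TP / (TP + FP)   # probability of disease given a positive index test
npv = TN / (TN + FN)   # probability of no disease given a negative index test

print(f"PPV = {ppv:.2f}")  # 0.69
print(f"NPV = {npv:.2f}")  # 0.99
```

Unlike sensitivity and specificity, these values change with disease prevalence, which is the point of the Bayes’ theorem slides that follow.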
© American Academy of Neurology
Bayes’ Theorem and the Predictive Value of a
Positive Test
• The probability of a test demonstrating a true positive
depends not only on the sensitivity and specificity of a test,
but also on the prevalence of the disease in the population
being studied.
• The chance of a positive test being a true positive is markedly
higher in a population with a high prevalence of disease.
• In contrast, if a very sensitive and specific test is applied to a
population with a very low prevalence of disease, most
positive tests will actually be false positives.
© American Academy of Neurology
Bayes’ Theorem
Prevalence of Condition (%)    Positive Predictive Value of a Positive Test (%)*
75                             98
50                             95
20                             83
10                             68
5                              50
1                              16
0.1                            2
* 95% sensitivity and 95% specificity
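The table follows directly from Bayes’ theorem. The sketch below recomputes each row for a hypothetical test with 95% sensitivity and 95% specificity:

```python
def ppv(prevalence: float, sensitivity: float = 0.95, specificity: float = 0.95) -> float:
    """Positive predictive value from prevalence, sensitivity, and specificity."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

for prevalence in [0.75, 0.50, 0.20, 0.10, 0.05, 0.01, 0.001]:
    print(f"prevalence {prevalence:6.1%}  ->  PPV {ppv(prevalence):.0%}")
```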
© American Academy of Neurology
Bayes’ Theorem
• In the example where the disease prevalence is 1% and the
test has 95% sensitivity and 95% specificity, the probability
that a positive test is a true positive is only 16.1%.
• This means that 83.9% of the positive results will actually be
false! In this setting, even a highly sensitive and specific test is of
limited value on its own.
• Using Positive Predictive Values derived in one setting will
likely not be valid in another setting with different disease
prevalence.
• This can be better addressed using Likelihood Ratios
© American Academy of Neurology
Likelihood Ratios
• The most clinically useful measures of diagnostic test accuracy
• LR+ = sensitivity / (1 − specificity): how much a positive result raises the odds of disease
• LR− = (1 − sensitivity) / specificity: how much a negative result lowers the odds of disease
• Unlike predictive values, likelihood ratios are not affected by disease prevalence
© American Academy of Neurology
Likelihood Ratio Interpretation
Likelihood Ratio       Change in Probability    Clinical Importance
> 10 or < 0.1          Large                    Often very high
5-10 or 0.1-0.2        Moderate                 Moderately high
2-5 or 0.2-0.5         Small                    Sometimes
1-2 or 0.5-1.0         Very small               Rare
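Likelihood ratios are computed from sensitivity and specificity alone, which is why they do not depend on prevalence. A minimal sketch with hypothetical test characteristics:

```python
# Hypothetical test characteristics
sensitivity, specificity = 0.90, 0.95

lr_positive = sensitivity / (1 - specificity)   # LR+ : effect of a positive result on the odds
lr_negative = (1 - sensitivity) / specificity   # LR- : effect of a negative result on the odds

print(f"LR+ = {lr_positive:.1f}")   # 18.0 -> large change in probability (> 10)
print(f"LR- = {lr_negative:.2f}")   # 0.11 -> moderate change in probability (0.1-0.2)
```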
© American Academy of Neurology
Estimate a Post-Test Probability
© American Academy of Neurology
Sources of Bias
• Spectrum bias
• Verification bias
• Lack of independence between index test and reference standard
– Incorporation bias
– Lack of blinding
© American Academy of Neurology
Precision
• Random Error
– Insufficiently precise estimates of test accuracy
• Random error may be quantified statistically
with confidence intervals
– 95% is standard
– Smaller interval (more precision) with larger
sample size
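One simple way to quantify this random error is a normal-approximation (Wald) confidence interval around an estimated sensitivity; exact or Wilson intervals are often preferred for small samples. The counts below are hypothetical:

```python
import math

TP, FN = 90, 10                      # hypothetical counts among diseased subjects
sensitivity = TP / (TP + FN)
n = TP + FN

standard_error = math.sqrt(sensitivity * (1 - sensitivity) / n)
lower = sensitivity - 1.96 * standard_error
upper = sensitivity + 1.96 * standard_error

print(f"Sensitivity {sensitivity:.2f} (95% CI {lower:.2f} to {upper:.2f})")  # 0.90 (0.84 to 0.96)
```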
© American Academy of Neurology
© American Academy of Neurology
Agreement
• Many tests require observer interpretation
• Clinical utility and generalizability are affected
by the inter-observer agreement
– Agreement above chance
– Measured by kappa (κ) statistic
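For two raters classifying the same tests as positive or negative, Cohen’s kappa can be computed from a 2 x 2 agreement table. The counts below are hypothetical:

```python
# Agreement table: rows = rater 1 (pos, neg), columns = rater 2 (pos, neg)
both_pos, r1pos_r2neg = 40, 10
r1neg_r2pos, both_neg = 5, 45
n = both_pos + r1pos_r2neg + r1neg_r2pos + both_neg

observed = (both_pos + both_neg) / n                         # observed agreement
chance_pos = ((both_pos + r1pos_r2neg) / n) * ((both_pos + r1neg_r2pos) / n)
chance_neg = ((r1neg_r2pos + both_neg) / n) * ((r1pos_r2neg + both_neg) / n)
expected = chance_pos + chance_neg                           # agreement expected by chance

kappa = (observed - expected) / (1 - expected)
print(f"kappa = {kappa:.2f}")  # 0.70 -> substantial agreement above chance
```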
© American Academy of Neurology
STARD
• STAndards for Reporting of Diagnostic accuracy
studies
• Consensus document summarizing reporting
requirements for diagnostic accuracy studies
• 25-item checklist and flow-diagram
• http://www.stard-statement.org/
© American Academy of Neurology
STARD Checklist
© American Academy of Neurology
STARD Checklist
© American Academy of Neurology
Summary
• Recognize that diagnosis is usually achieved using
hypothetical-deductive methods
• Formulate an appropriate diagnostic question when
considering use of a test
• The clinical importance of a test result is determined by
both the pretest probability of the disease and test accuracy
• Diagnostic test accuracy is best expressed using LR and 95%
CI
• STARD criteria can assist in appraisal of the methods used
to evaluate a diagnostic test
© American Academy of Neurology
References
• http://www.stard-statement.org/
• Fagan TJ. Nomogram for Bayes’s theorem. N Engl J Med 1975;293:257.
• Ransohoff DF, Feinstein AR. Problems of spectrum and bias in evaluating
efficacy of diagnostic tests. N Engl J Med 1978;299:926-930.
• Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in
diagnostic test research: getting better but still not good. JAMA
1995;274:645-651.

Editor's Notes

  • #3: The primary focus of this module is on concepts relevant to diagnosis It is necessary, however, to distinguish the concepts of diagnosis and screening and so we begin with a few words about each There are different ways to conceptualize the differences between diagnosis and screening On one account, screening is the process whereby an attempt is made to bring forward the time of diagnosis from a time when symptoms are present, to a time before symptoms emerge By another account, the goal of screening is to cast a broad net and to identify a group of people who are at a high risk for developing a particular disease (e.g. a group of people who possess a risk factor for the disease of interest) Since a screening testing is typically applied to a broader population than is a diagnostic test, the a priori probability of disease is generally lower when applying a screening test than when applying a diagnostic test Similarly, since the goal of a screening test is often to identify a group of people at high risk for disease, there is a preference to minimize false negatives even at the expense of false positives; what this means is that the threshold for defining a test as abnormal will typically be set at a lower point than it would be for a diagnostic test Another way to conceptualize this is to think of a screening test as being used to “rule out” disease (i.e. high sensitivity) whereas the goal of a diagnostic test is to “rule in” disease (i.e. high specificity)
  • #4: The primary focus of this module is on concepts relevant to diagnosis It is necessary, however, to distinguish the concepts of diagnosis and screening and so we begin with a few words about each There are different ways to conceptualize the differences between diagnosis and screening On one account, screening is the process whereby an attempt is made to bring forward the time of diagnosis from a time when symptoms are present, to a time before symptoms emerge By another account, the goal of screening is to cast a broad net and to identify a group of people who are at a high risk for developing a particular disease (e.g. a group of people who possess a risk factor for the disease of interest) Since a screening testing is typically applied to a broader population than is a diagnostic test, the a priori probability of disease is generally lower when applying a screening test than when applying a diagnostic test Similarly, since the goal of a screening test is often to identify a group of people at high risk for disease, there is a preference to minimize false negatives even at the expense of false positives; what this means is that the threshold for defining a test as abnormal will typically be set at a lower point than it would be for a diagnostic test Another way to conceptualize this is to think of a screening test as being used to “rule out” disease (i.e. high sensitivity) whereas the goal of a diagnostic test is to “rule in” disease (i.e. high specificity) Examples of screening tests in neurology – identification of asymptomatic cerebral aneurysms in family members of patients who have had a subarachnoid hemorrhage
  • #5: There are several methods of achieving a clinical diagnosis but three methods are the most common The “method of exhaustion” is what we are taught in medical school. All possibilities are considered, without regard for whether certain diagnoses are more likely than others. This is obviously inefficient and impractical. Pattern recognition plays an important role. Experienced clinicians can recognize typical patterns of common or rare but distinct conditions. However, this method tends to break down in complex or atypical cases. Most clinicians use hypothetical-deductive reasoning. This complex process, which is performed almost subconsciously by expert clinicians, involves assessment of the relative probability of different competing diagnoses based on available data Therefore, a diagnosis is typically reached after an iterative process of gathering data, whether by history and examination or by performing other diagnostic tests (laboratory studies, neuroimaging, etc), and continuous reassessment of the probability of competing diagnoses.
  • #6: Estimate a pre-test probability. Decide whether a diagnostic test is required: has the ‘treatment threshold’ been crossed? Has the ‘test threshold’ been crossed? Implement a diagnostic test (the ‘intervention’). Determine the post-test probability.
  • #7: Another way to conceptualize the diagnostic process is to think about the diagnostic test as an intervention. The intervention influences the probability of disease as well as moves from the pre-test phase to the post-test phase. Here we illustrate this idea using the example of brain MRI for the diagnosis of CJD. Note how we begin with a clinical problem of rapidly progressive dementia; we suspect CJD and assign a pre-test probability; we then apply the test (brain MRI) and calculate a post-test probability. If the post-test probability is high, we conclude that CJD is present; if the post-test probability is low we revisit the differential diagnosis or consider applying a second diagnostic test.
  • #8: The pre-test probability is the estimated likelihood of a specific diagnosis prior to application of a diagnostic test. Clinicians are not used to quantifying this estimate. How can we put a number on it? Experience helps. For example, a “likely” diagnosis might be associated with an 80% probability whereas an “unlikely” consideration might warrant a 10 or 20% estimate. Other possible sources of quantitative pre-test probability data include studies of disease prevalence and diagnostic test accuracy. Example: Migraine is much more common than subarachnoid haemorrhage (SAH) as a cause of headache in the general population. However, when a patient is seen in the emergency room with sudden onset of a severe headache that is maximal in intensity at onset, SAH is a more likely diagnosis. This is because SAH is a more common cause of headache amongst the restricted population of patients presenting to the ER with a thunderclap headache. Example: Lyme disease as a cause of Bell’s palsy. Lyme disease may be a much more likely cause for a peripheral facial neuropathy in a patient seen in Connecticut than in a patient seen in Georgia.
  • #9: These three cases serve to illustrate how simple clinical information allows a clinician to assign a probability estimate for the presence of a specific diagnosis
  • #10: Sometimes the diagnosis is so clear that one can move directly from the estimate of pre-test probability to treatment considerations I.e. if the provisional diagnosis is so likely based simply on the pre-test probability, then there may be little or no need for a diagnostic test If the pre-test probability remains below the ‘treatment threshold’ (i.e., you are not certain enough about the diagnosis to initiate treatment), then a diagnostic test is needed Uncertainty may remain for several reasons, including an atypical clinical presentation, insufficient data to fulfill diagnostic criteria, other feasible competing diagnoses, or the perceived risk associated with an incorrect diagnosis
  • #11: Sometimes the pre-test probability of a condition may be so low that the suspected disorder isn’t really a serious consideration If not, then no diagnostic testing is required If so, then the pre-test probability is above the “test threshold” and further diagnostic testing is required
  • #12: Begin with the pre-test probability on the left side Diagnostic tests are associated with a likelihood ratio (defined before) – a measure of how much they influence or leverage the post-test probability
  • #13: A line drawn through these two points and extended to the right column defines the post-test probability In the example, the test in question converted a 70% pre-test probability to a 98% post-test probability, effectively “ruling-in” the diagnosis and allowing transition from diagnosis to management. Note that the middle column (the “leverage” of the diagnostic test) and the distribution of the pre- and post-test probabilities are logarithmic, therefore, effective diagnostic tests have the greatest influence on situations where the diagnosis is uncertain (for example, pre-test probabilities of 30-70%), which is exactly the clinical situation in which accurate tests are necessary.
  • #14: Once a diagnostic test is judged to be valid, its use and interpretation results in several questions: How does the post-test probability influence clinical decision-making? Is the test precise enough (confidence intervals)? Can you be confident about the interpretation of the test (agreement)? Do you need another test? Can you move on to treatment?
  • #15: The ‘PICO’ model is a useful construct for conceptualizing the process of developing / evaluating a diagnostic test For diagnostic testing, the PICO structure is as follows: P = Patient description Usually a description of a clinical syndrome Include applicable relevant factors, e.g., age group, co-morbidities, etc. I = Intervention Diagnostic test of interest or “index test” This is the test that you are interested in using to achieve a better estimate of the likelihood of the diagnosis of interest in your patient C = Comparison intervention Reference standard or “gold standard” Generally accepted procedure or criteria for the outcome of interest This is the benchmark for evaluating new diagnostic tests O = Outcome Target diagnosis of interest
  • #16: This example utilizes the PICO model to develop a focused, answerable clinical question about achieving a specific diagnosis in a familiar neurological setting: a patient with a rapidly progressive dementing syndrome P: Patient with rapidly progressive dementia I: “Diagnostic Test” or “Index Test” BG signal abnormality on brain MRI C: “Reference Standard” or “Gold Standard”: World Health Organization diagnostic criteria for CJD O: Diagnosis of CJD
  • #17: Simply read this slide to ask the clinical question
  • #18: In selecting an index test to evaluate, it is helpful to bear in mind the characteristics that this test should ideally possess It should be Accurate (compared with a reference standard) Precise (small degree of random error associated with estimates of test accuracy) Available (e.g. not only available at tertiary referral centers) Convenient (e.g. not overly time-consuming) Low Risk (e.g. does not require radiation exposure) Inexpensive Reproducible (i.e. high degree of agreement within the same observer from test-to-test and between observers)
  • #19: Also known as the “gold standard” or (hopefully) ‘truth’ Generally accepted and validated procedure or criteria used to establish diagnosis The test you are considering using may be the reference standard However, the reference standard may not be available or feasible (e.g. autopsy), or it may be risky (interventional procedure) or expensive (new technology) If your test is not the reference standard, look for good quality evidence that establishes its accuracy against the reference standard
  • #20: How can we define and quantify accuracy? There are several accuracy measures, each with advantages and disadvantages. Magnitude of effect Sensitivity and specificity Predictive values (positive and negative) Likelihood ratios Precision Confidence intervals Reproducibility (agreement) For dichotomous outcomes, most of these measures can be derived from a simple 2x2 table.
  • #21: This 2 x 2 table compares the performance of the index test against the reference or gold standard. TP = true positive (positive index test in the face of disease). TN = true negative (negative test in the face of no disease). FN = false negative (negative test in the face of disease). FP = false positive (positive test in the face of no disease).
  • #22: Sensitivity describes ‘positivity in disease’ Sensitivity – among people who have the disease, the proportion who test positive Sensitivity is defined as (true positive) / (true positive + false negative) A test with high sensitivity is useful for ruling out disease This may seem counter-intuitive, but think of it this way A test with high sensitivity has a low false negative rate If the false negative rate is low, then a negative test is much more likely to be a true negative Hence, a negative result from a test with high sensitivity is useful for excluding or ruling out a disease
  • #24: Specificity describes ‘negativity in no disease’ Specificity – among people who do not have the disease, the proportion who test negative Specificity is defined as (true negative) / (true negative + false positive) A test with high specificity is useful for ruling in disease This may seem counter-intuitive, but think of it this way A test with high specificity has a low false positive rate If the false positive rate is low, then a positive test is much more likely to be a true positive Hence, a positive result from a test with high specificity is useful for confirming or ruling in a disease
  • #26: A receiver operating characteristic (ROC) curve graphically (visually) illustrates the trade off between sensitivity and specificity It is constructed by plotting sensitivity on the y-axis and 1-specificity on the x-axis Sensitivity represents the true positive rate 1-specificity represents the false positive rate (i.e. 1-the true negative rate) The ROC curve is constructed by varying the threshold (cut-point) used to define the test as abnormal; sensitivity and specificity are calculated for each cut point and then plotted to generate the ROC curve The 45° line represents the test with no discriminative value – i.e. the trade off between sensitivity and specificity is equivalent; therefore, diagnostic accuracy is not improved by varying the cut-point The ideal diagnostic test is located towards the upper left-hand corner of the ROC curve – i.e. specificity remains at 100% as the cut-point is varied until sensitivity reaches 100%
  • #27: In this slide (and the two that follow) we try to illustrate the trade-off between sensitivity and specificity in a different way The figure on the left hand side of the slide shows two overlapping distributions – the red curve represents the distribution of test results from a healthy population and the blue curve represents the distribution of test results from a disease population Note that the distributions overlap – that is to say, the values of the test result in question are overlapping between the normal and the disease population It is very likely that it will always be the case (as few tests perfectly discriminate between health and disease), but the degree of overlap will vary The vertical black line is an arbitrarily chosen cut-point (a value of the test result) which we plan to use to differentiate between health and disease People whose test result values lie to the left of the black line beneath the red curve are “true negatives” – i.e. they do not have disease and the test result is negative People whose rest result values lie to the right of the black line beneath the red curve are “false positives” – i.e. they do not have disease, but the test result is positive People whose test results fall to the right of the black line beneath the blue distribution are “true positives” The two figures to the right of the slide show the two distributions separately (i.e. not overlaid) to further clarify how a single cut point (the dashed black line) identifies a certain proportion of true positives and a certain proportion of false positives Remember that the proportion of true positives is also known as sensitivity and … The proportion of false positives is also known as (1-specificity)
  • #28: Here we have shifted the cut-point In so doing, we have reduced the number of false positives (the red shaded area), but at the expense of also reducing the number of true positives (the blue shaded area)
  • #29: Here we have shifted the cut-point again In so doing, we have again reduced the number of false positives (the red shaded area), but at the expense of also further reducing the number of true positives (the blue shaded area) It should be clear therefore, that shifting the cut-point for differentiating “normal’ from ‘abnormal’ produces a trade-off between true positives (sensitivity) and false positives (1-specificity) This is precisely what is graphically illustrated by the ROC curve
  • #30: Predictive value data are often reported because they are intuitive They are calculated from the test point of view, in other words: Positive predictive value (PPV) asks “What proportion of subjects with a positive test actually have disease?” Negative predictive value (NPV) asks “What proportion of subjects with a negative test do not have disease?” The problem with the PPV/NPV measures is that they are affected by disease prevalence and may not truly reflect test accuracy in your clinical environment. In general, the lower the prevalence of the disease, the lower the positive predictive value. Despite their intuitive nature (do the test, then consider what the results mean), there is no good reason to report or use these values in practice because a value derived in one setting will likely not be valid in another setting with different disease prevalence.
  • #36: The likelihood ratio (LR) is the most clinically useful measure of diagnostic test accuracy. The LR associated with a test indicates by how much a given test result will raise or lower the pretest probability of the target disorder. An LR of 1 means that the post-test probability is the same as the pretest probability. LR values >1.0 increase the probability that the target disorder is present, and the higher the likelihood ratio, the greater is this increase. Conversely, LR values <1.0 decrease the probability of the target disorder, and the smaller the likelihood ratio, the greater is the decrease in probability and the smaller is its final value. Advantages of LR include that they can be used directly in clinical reasoning to establish post-test probability and they are not affected by disease prevalence.
  • #37: The interpretation of a likelihood ratio depends on whether one is considering a “positive” test result or a “negative” test result. The likelihood ratio of a positive test results in a value greater than or equal to 1, with a larger number indicating greater likelihood that the disease is present. Values greater than 5 are likely to result in a meaningful change in the post-test probability of the disease. In contrast, the likelihood ratio of a negative test results in a value less than or equal to 1, with a smaller number indicating greater likelihood that the disease is absent. Values less than 0.2 are likely to result in a meaningful change in the post-test probability of the disease.
  • #38: We’ve seen this nomogram before – it helps to translate the pre-test probability into the post-test probability of disease once we calculated the likelihood ratio for the diagnostic test
  • #39: Spectrum Bias The population in which the test is being evaluated should be broadly representative of the population in which the test will be used Example – imagine we’re interested in the utility of SFEMG for the diagnosis of myasthenia gravis (MG), but we only evaluate the accuracy of this test in patients who are known to have elevated titers of antibodies directed against the acetylcholine receptor There is a high likelihood that the sensitivity and specificity of the test under consideration will vary depending on the spectrum of the patient population in which it is tested/applied Relevant to this question is the design of a study to evaluate the diagnostic accuracy of a test. Broadly speaking, there are two study designs: (a) case-control and (b) cohort study designs. In the former, a group of patients with the disorder of interest as well as a group of subjects known not to have the disease of interest (e.g. healthy controls) are selected. In the latter design, a consecutive series of subjects referred for evaluation for the disorder of interest, are revaluated using the index test It is well established that case-control study designs inflate estimates of both sensitivity and specificity Verification Bias Also known as ‘work-up’ bias The mistake here is to perform the reference standard only in a select group of subjects (e.g. to proceed to contrast angiography only in patients with a negative MR angiogram when evaluating for carotid dissection) Independence The index test and reference standard should be independent of each other Independence incorporates at least two difference concepts Incorporation Bias The mistake here is to have the reference standard include or incorporate the index test Under such circumstances, there will be artificial agreement between the index and reference tests, thereby inflating estimates of sensitivity and specificity Blinding The evaluator who performs the index test and the evaluator who applies the reference standard should each be blinded to the outcome/results of the other test Failure to achieve independence in this sense will also produce artificially high agreement between the index test and the reference standard and thereby inflate estimates of sensitivity and specificity
  • #40: Random Error. As with any study, studies of diagnostic test accuracy should include a sufficient number of subjects to permit precise estimation of the test’s sensitivity and specificity. It is fairly unusual for studies of the accuracy of a diagnostic test to report the confidence intervals around the point estimates of sensitivity and specificity. An informal survey of the literature suggests that many (if not most) studies of the accuracy of diagnostic tests include too few participants to produce precise estimates of test sensitivity and specificity. Random error may be quantified statistically with confidence intervals (CI): a range of values within which one can be confident that the “true value” is estimated to lie. The 95% CI is standard but arbitrary; it defines the range that includes the true value 95% of the time. The interval is smaller (more precision) with a larger sample size.
  • #41: This nomogram demonstrates graphically the result of a clinical scenario in which the pre-test probability of a diagnosis was 50% and the results of diagnostic testing The blue lines represent the results of a positive test The middle (thicker) blue line represents the magnitude of effect (LR=10), resulting in a post-test probability estimate of about 92% The other (thinner) blue lines represent the 95% CI boundaries, which are associated with LR of 2.5 and 50, and therefore post-test probabilities of about 75-98% The red lines represent the results of a negative test The middle (thicker) red line represents the magnitude of effect (LR=0.2), resulting in a post-test probability estimate of about 19% The other (thinner) red lines represent the 95% CI boundaries, which are associated with LR of 0.1 and 0.5, and therefore post-test probabilities of about 10-35%
  • #42: Many tests require observer interpretation Clinical utility and generalizability are affected by the inter-observer agreement If there is poor agreement about a test result between different observers or in different situations, the test may not be useful Agreement is measured with the kappa (κ) statistic kappa expresses the extent of the possible agreement over and above chance If the raters agree on every judgment, the total possible agreement is always 100% Expressed in a range from -1.0 (perfect disagreement) to 1.0 (perfect agreement) Here are some useful numbers to aid in interpretation of the kappa statistic < 0 - no agreement 0.0 – 0.20 - slight agreement 0.21 – 0.40 - fair agreement 0.41 – 0.60 - moderate agreement 0.61 – 0.80 - substantial agreement 0.81 – 1.00 - almost perfect agreement
  • #43: STAndards for the Reporting of Diagnostic accuracy studies This consensus document summarizes reporting requirements for diagnostic accuracy studies 25-item checklist and flow-diagram Provides the basis for reporting and interpreting questions of diagnostic test validity, accuracy, precision, and inter-observer agreement http://www.stard-statement.org/
  • #44: This slide shows the first half of the STARD checklist Categories include Methods for defining the patient spectrum, evaluating the diagnostic test and reference standard, and performing the statistical analysis
  • #45: This slide shows the second half of the STARD checklist Categories include Results, broken down by patient description, disease characteristics, diagnostic accuracy, precision, and agreement reporting
  • #46: Recognize that diagnosis is usually achieved using hypothetical-deductive methods Formulate an appropriate diagnostic question when considering use of a test The clinical importance of a test is determined by BOTH the pretest probability of the disease and the test accuracy Diagnostic test accuracy is best expressed using LR and 95% CI STARD criteria can assist in appraisal of the methods used to evaluate a diagnostic test