Transparency and consistency
        Jonas Ranstam PhD
Scientific research
A systematic investigation ... designed to develop or contribute to
generalizable knowledge [1].

Generalizable: Having predictive and reliable results.

When sampling errors don't exist or are irrelevant, qualitative
research methods (e.g. case reporting) can be used.

If sampling errors do exist, the unavoidable sampling uncertainty
must be quantified (quantitative research) and presented, usually
in terms of p-values and confidence intervals.


[1] The US National Science Foundation
Statistics
Medical researchers rely as never before on statistics for
generating and testing hypotheses and for estimating risks
and benefits of old and new therapies.

Journals can facilitate the writing and reading of research
reports by implementing clear guidelines for manuscript
preparation.
Milestones in scientific publication
1665 – the first scientific journals
1858 – the IMRAD structure
1957 – the abstract
1978 – the Vancouver convention (ICMJE)
1987 – the structured abstract
1996 – the CONSORT guidelines
2007 – the STROBE guidelines
Oac guidelines
Ten recommendations
 1. Purpose
 2. Data source
 3. Observations
 4. Descriptions
 5. Methods
 6. Assumptions
 7. Significance
 8. Confidence
 9. Multiplicity
10. Claims
1. Purpose
State the research question and the purpose of the study. Is
the ambition to describe an observation, to generate
hypotheses or to test a pre-specified hypothesis?

Bad

We have shown that the success rate differs between two
common techniques for autologous chondrocyte implantation.

Good

We designed an experiment to test the hypothesis of identical
success rates of two common techniques for autologous
chondrocyte implantation.
2. Data source
Describe the source of subjects, cadavers, animals, tissues,
cell line, etc. and how many of these units have been included
in the study.
Bad

We collected 36 pieces of human cartilage.
Good

Three pieces of cartilage from each of twelve physically active
men between 25 and 75 years of age, previously included as
healthy controls in a clinical trial (ref.), were collected for this
study.
3. Observations
When observations can be presented individually, either
numerically or graphically, this should be preferred. With fewer
than 4 observations it should be the rule.
4. Descriptions
When presenting data in aggregated form, always present the
number of included observations as well as their average and
dispersion. If repeated measurements or replicates are
included, present both the number of independent samples and
the total number of observations.

Bad

The mean change in total knee cartilage volume was 0.62 ml.

Good

The mean change in total knee cartilage volume was 0.62 ± 1.3 ml
(n = 24).
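
The distinction between independent samples and total observations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subject identifiers and values are invented, the helper name `describe` is hypothetical, and the dispersion is taken to be the standard deviation between subjects after averaging each subject's replicates.

```python
import statistics

def describe(per_subject_values):
    """Summarise replicate measurements grouped by independent subject.

    Replicates are averaged within each subject first, so the mean and SD
    describe the variation between independent subjects.
    """
    subject_means = [statistics.mean(v) for v in per_subject_values.values()]
    return {
        "n_subjects": len(subject_means),
        "n_observations": sum(len(v) for v in per_subject_values.values()),
        "mean": statistics.mean(subject_means),
        "sd": statistics.stdev(subject_means),
    }

# Invented volume changes (ml), three replicates for each of four subjects.
data = {
    "s1": [0.5, 0.7, 0.6],
    "s2": [1.9, 2.1, 2.0],
    "s3": [-0.8, -0.6, -0.7],
    "s4": [0.4, 0.6, 0.5],
}
summary = describe(data)
print(f"{summary['mean']:.2f} ± {summary['sd']:.2f} ml "
      f"(n = {summary['n_subjects']} subjects, "
      f"{summary['n_observations']} observations)")
```

Reporting both counts lets the reader see that twelve observations here carry the information of only four independent units.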
5. Methods
In a statistics section, describe all the statistical methods used.
Use the original names of the methods; these are not always the
same as the names used in software packages.

Bad

We used the independent groups t-test in the group comparison.

Good

We used Satterthwaite's t-test in the group comparison.
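
Naming the method matters because the name fixes the calculation. As a minimal sketch, assuming that "Satterthwaite's t-test" refers to the unequal-variance two-sample t-test (often called Welch's t-test) with Satterthwaite's approximation for the degrees of freedom, the test statistic can be computed as follows. The function name `satterthwaite_t` and the data are invented for illustration.

```python
import math
import statistics

def satterthwaite_t(a, b):
    """Return (t, df) for the unequal-variance two-sample t-test."""
    # Squared standard error of each group mean.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Satterthwaite's approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

group_a = [1, 2, 3, 4, 5]    # invented data
group_b = [2, 4, 6, 8, 10]   # invented data
t_stat, df = satterthwaite_t(group_a, group_b)
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```

The p-value would then be read from a t distribution with the (usually non-integer) degrees of freedom, a step left to a statistics package here.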
6. Assumptions
The validity of statistical results relies on certain assumptions
being fulfilled. Were they?

The man of science has learned to believe in justification, not by
faith, but by verification.

Thomas Huxley, 1866

Good

The ANOVA residuals were examined using a normal probability
plot, which indicated a Gaussian distribution.

The homogeneity of variance was tested using Levene's test.

The assumption of proportional hazards was investigated using
hypothesis tests of Schoenfeld residuals.
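
As an illustration of what such a check computes, the following sketch evaluates the Levene test statistic W using absolute deviations from the group means (the Brown-Forsythe variant uses medians instead). The function name `levene_w` and the data are invented, and the function assumes that the deviations are not all identical within every group, since the statistic is then undefined.

```python
import statistics

def levene_w(*groups):
    """Levene's W statistic, based on absolute deviations from group means."""
    z = [[abs(y - statistics.mean(g)) for y in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand_mean = sum(sum(g) for g in z) / n_total
    between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in z)
    within = sum((v - statistics.mean(g)) ** 2 for g in z for v in g)
    return (n_total - k) / (k - 1) * between / within

w = levene_w([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])  # invented data
print(f"W = {w:.3f}")
```

W is compared with an F distribution on k - 1 and N - k degrees of freedom; obtaining that p-value is left to a statistics package.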
7. Significance
A p-value describes the uncertainty in the generalization (the
outcome of a hypothesis test), and has no relevance for the
observed sample itself.

Distinguish between practical and statistical significance. Clarify
what hypotheses are tested.

Bad

There was no difference in mean systolic blood pressure
between treated patients (190 mmHg) and controls (135 mmHg)
(p = 0.06).

Good

In this study, treated patients had a higher mean systolic blood
pressure than controls (190 vs. 135 mmHg). The observation,
although not statistically significant (p = 0.06), raises concern for
future treatment.
8. Confidence
The uncertainty in the generalization of a finding is often better
presented using the two limits of a confidence interval, indicating
plausible values, than one probability of a false positive
conclusion.

Bad

The reproducibility was high (ICC = 0.91; p < 0.0001).

Good

The reproducibility was high (ICC = 0.91; 95% CI: 0.64 to 0.98).
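
How the two limits arise can be sketched for the simpler case of a mean. The example reuses the summary figures from recommendation 4 (0.62 ml, 1.3 ml, n = 24) and assumes that 1.3 ml is a standard deviation, which the slide does not state. It also uses a normal approximation; with n = 24, a t-based interval would be somewhat wider. The function name `normal_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def normal_ci(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

low, high = normal_ci(0.62, 1.3, 24)
print(f"0.62 ml (95% CI: {low:.2f} to {high:.2f})")
```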
9. Multiplicity
All departures from the conventional levels of 5% significance
and 95% confidence, like the ones achieved by using one-sided
tests, Bonferroni corrections, and simultaneous confidence
intervals, should be explained and motivated.
Bad

In this randomized trial, we have shown that patients born under
the astrological sign of Gemini benefit more from aspirin treatment
than others.
Good

When multiplicity issues were taken into account, we were
unable to find any interaction between astrological sign and
benefit from aspirin treatment.
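
The arithmetic behind the multiplicity problem can be sketched as follows, assuming one test per astrological sign (12 tests) and, for the family-wise error calculation, that the tests are independent. Both function names are hypothetical.

```python
def familywise_error(alpha, m):
    """Probability of at least one false positive among m independent tests."""
    return 1 - (1 - alpha) ** m

def bonferroni_threshold(alpha, m):
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / m

m = 12  # one subgroup test per astrological sign (assumption)
fwe = familywise_error(0.05, m)
threshold = bonferroni_threshold(0.05, m)
print(f"Chance of at least one false positive: {fwe:.2f}")
print(f"Bonferroni-corrected per-test level: {threshold:.4f}")
```

Under these assumptions, testing every sign at the 5% level gives almost an even chance of at least one spurious subgroup finding.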
10. Claims
The level of statistical rigor (precision and addressed uncertainty
issues) should be consistent with the author's purpose and
conclusions.
What is all this fuss about confidence
intervals and clinical significance?

Questions that can be answered using p-values

- Can I be sure that there is an effect?


Questions that can be answered using confidence intervals

- Can I be sure that there is an effect?

- Can I be sure that there isn't an effect?

- What effect is there?
P-values: statistical significance (p < 0.05 or n.s.)
Confidence intervals: statistical and clinical significance
[Figure: effect axis marked at 0 and at the clinically significant effect]
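
The questions a confidence interval can answer may be sketched as a comparison of its limits with zero and with the smallest clinically significant effect. This is an illustrative reading rather than a rule from the slides: the function name `interpret_interval` and the wording of the categories are invented, and a positive effect is assumed to mean benefit.

```python
def interpret_interval(low, high, clinical_effect):
    """Classify a confidence interval for an effect.

    clinical_effect is the smallest effect regarded as clinically
    significant (assumed positive).
    """
    if low > 0:
        if low >= clinical_effect:
            return "statistically and clinically significant"
        if high < clinical_effect:
            return "statistically but not clinically significant"
        return "statistically significant, clinical significance uncertain"
    if high < clinical_effect:
        return "no clinically significant benefit"
    return "inconclusive"

# Invented intervals, with a smallest clinically significant effect of 1.0.
for limits in [(1.2, 2.0), (0.1, 0.6), (0.3, 1.5), (-0.4, 0.5), (-0.5, 2.0)]:
    print(limits, "->", interpret_interval(*limits, clinical_effect=1.0))
```

A p-value alone can separate only the first three cases from the last two; the interval limits are needed to tell "no clinically significant benefit" from "inconclusive".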
Statements that should be avoided
- “Statistical difference”
- “Significant difference”
- “There was no difference”
- “ns” and “p > 0.05”
- “p < 0.03”
Thank you for your attention
