Biostatistics Workshop: Sample Size & Power

Sub-Saharan Africa CFAR meeting
July 18, 2016
Durban, South Africa
Memory loss and Dementia in HIV
◦ Does HIV infection accelerate onset of memory loss at advanced ages?
◦ What do we need to determine in planning this study?
◦ Study design?
◦ Study participants?
◦ Endpoints? Measures of memory loss?
◦ Covariates / possible confounders to collect?
◦ Statistical Analysis Plan (SAP)
◦ Who should be involved in the process of planning a study?

Hypothetical Study Question
◦ We want to recruit a sample of HIV+ & HIV- individuals between the ages of 55 and 65 and
test their memory
◦ Primary endpoint: Memory as a continuous measure where lower values indicate worse
memory
◦ Secondary endpoint: Self-assessed memory impairment (“Do you feel your memory today is
worse than three years ago?”)
◦ Confounders
◦ Age, medication use & duration, age at HIV onset?
◦ Statistical Analysis Plan?
◦ How many participants do you need to see a meaningful difference (if one exists)?
Sample Size Calculations
◦ Before beginning a study you want to determine how many subjects you will need to
enroll
◦ To see the desired / expected effect
◦ To have a high probability that that effect is statistically significant (assuming the effect exists)
◦ Power calculations can be used to:
◦ Determine the sample size needed
◦ Determine the power given a fixed or maximum sample size
◦ Determine the detectable effect size given a sample size and power

So, do we really need to do this?
◦ Yes!!!!
◦ If a sample size isn’t large enough,
◦ we may conclude a null result (even if there truly is an effect) due to a lack of
statistical power (type II error)
◦ If sample size is too large,
◦ we have wasted valuable resources (time, $, etc.)
Our decisions & mistakes
                                     Reality
Conclusion                           H0 is true               H0 is false
                                     (no difference)          (a difference)
Do not reject H0 (no difference)     Correct                  Type II Error
Reject H0 (a difference)             Type I Error             Correct

P(type I error) = P(rejecting H0 | H0 is true) = α
P(type II error) = P(not rejecting H0 | H0 is false) = β
P(rejecting H0 | H0 is false) = 1 − β = POWER

Type I vs. Type II error
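[Figure: two panels. Type I Error: "You're pregnant!" said to someone who is not (false positive). Type II Error: "You're not pregnant" said to someone who is (false negative).]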
Type I vs Type II error
◦ Type I error: Significance Level
◦ α = P(rejecting H0 | H0 is true)
◦ Incorrectly concluding that there is a difference when there truly is not a difference
(concluding a drug works, when it in fact does not)
◦ False Positive
◦ Typically set at 5% overall
◦ Type II error: β
◦ β = P(not rejecting H0 | H0 is false)
◦ Incorrectly concluding that there is no difference when there truly is one (missing a drug
that in fact works)
◦ False Negative
◦ Power: 1 − β
◦ P(rejecting H0 | H0 is false)
◦ Correctly concluding that an effect exists when it does (finding a drug works when in fact it
does)
◦ True Positive
◦ Expressed as a %, typical values 80% & 90%
Societal Risk
Institutional Risk

Sample Size Calculation
◦ What you need
◦ Estimate of expected effect size
◦ Estimate of expected variability
◦ Significance level
◦ Typically α = 0.05
◦ Take into account # of endpoints and tests
◦ Power – Probability of finding a significant effect given that effect exists

SS Calculation: Effect Size
◦ Depends on analysis to be performed
◦ Difference in means? OR? RR?
◦ Clinically Meaningful / Relevant Difference
◦ Realistic, but reasonable
◦ How to get an estimate
◦ Previous literature
◦ Pilot Study
◦ Clinically Meaningful
◦ Increments of Standard Error
SS Calculation: Variability
◦ Usually measured as Standard Deviation or proportion expected in each group
◦ Clinically Meaningful / Relevant Difference
◦ Realistic, but reasonable
◦ How to get an estimate
◦ Previous literature
◦ Pilot Study
◦ Tricks?
◦ (max – min) / 4
◦ Err on side of over-estimate
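A rough sketch of the range trick above, for when no published SD is available. The function name and the plausible range (10 to 30) are hypothetical, chosen only for illustration.

```r
# Crude SD estimate from a plausible range: (max - min) / 4.
# Rationale: for roughly normal data, about 95% of values span ~4 SDs.
sd_from_range <- function(min_value, max_value) {
  stopifnot(max_value >= min_value)
  (max_value - min_value) / 4
}

# Hypothetical plausible range for a memory score: 10 to 30
sd_guess <- sd_from_range(10, 30)
print(sd_guess)
```

If in doubt, rounding the guess up (erring toward a larger SD) gives a more conservative sample size.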

◦ Compare HIV+ and HIV- individuals on a continuous, normally-distributed memory
score
◦ 2-sample t-test
◦ We find previous literature using this measure with HIV- individuals and they reported,
for 55-65 year olds, a mean of 20 with a standard deviation of 5
◦ We want to see a difference of 1 SD (5 units) between the groups
◦ How many subjects do we need to recruit in each group to see a difference of 5 units?
R Programming
◦ Free Software program available to download
◦ www.r-project.org
◦ I will show you some very simple code for straight-forward sample size calculations
◦ Many more examples can be found just a google away!

Sample Size in R
[Slide shows R code with callouts: the R function; the parameters we need to add; delta = difference in means]
Need 17 HIV+ and 17 HIV- to find a difference of 5 units in memory score
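The R code shown on the slide is not reproduced in this transcript. A minimal sketch that should give the stated result, assuming the slide used `power.t.test` from base R's stats package with 80% power and a two-sided α = 0.05:

```r
# Two-sample t-test: leave n out so that R solves for it.
res <- power.t.test(delta = 5,        # difference in means we want to detect
                    sd = 5,           # SD from the previous literature
                    sig.level = 0.05, # type I error (alpha)
                    power = 0.80,     # 1 - beta
                    type = "two.sample",
                    alternative = "two.sided")

n_per_group <- ceiling(res$n)  # round up to whole participants
print(n_per_group)
```

`res$n` is fractional; rounding up keeps the power at or above the target.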

Sample Size
◦ So what happens if we enroll 17 + 17 people and the true difference is actually
less than 5 units?
◦ We consult the neuropsych people and they say that a difference in just 2
units would be considered clinically meaningful
◦ How many subjects do we need to recruit in each group to see a difference of
2 units?
More or less than 34?
Sample Size in R
What do we have to change?

Sample Size in R
How many do we need in each group to see a difference of 2 units in memory score?
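Only the effect size changes. A sketch under the same assumptions as before (base R `power.t.test`, 80% power, two-sided α = 0.05):

```r
# Same call, but the difference to detect shrinks from 5 units to 2 units.
res_small_delta <- power.t.test(delta = 2, sd = 5,
                                sig.level = 0.05, power = 0.80)

n_small_delta <- ceiling(res_small_delta$n)  # participants per group
print(n_small_delta)
```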
Sample Size
◦ We are gambling people, so we want to up our probability of finding
significance (if effect exists) so we will increase power to 90%
◦ How many subjects do we need to recruit in each group to see a difference of
2 units at 90% power?
More or less than 200?

Sample Size in R
How many do we need in each group to see a difference of 2 units in memory score
at 90% power?
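Here only the power argument changes. Again a sketch, assuming base R `power.t.test` and a two-sided α = 0.05:

```r
# Raise the target power from 80% to 90%; everything else stays the same.
res_high_power <- power.t.test(delta = 2, sd = 5,
                               sig.level = 0.05, power = 0.90)

n_high_power <- ceiling(res_high_power$n)  # participants per group
print(n_high_power)
```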

Sample Size
◦ A larger n is needed for each of the following:
◦ Smaller differences
◦ Larger standard deviations
◦ More power
◦ Stronger type I error control (smaller α)
◦ Narrower CI
Calculating power from n
◦ Sometimes you have a fixed n and want to calculate power to
find a particular effect size
◦ Sometimes you reach the end of your study, fail to reject H0 and
want to see if you had enough power to find significance for the
effect size you have
◦ “Post-hoc Power Calculation”

Power in R
Whatever you don’t indicate is what R calculates
Power in R
Do we reach 80% power with 99 in each group?
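A sketch of the reverse calculation, assuming the same base R function: supply n and leave power out, and R returns the power.

```r
# Fixed n per group; power is the argument left unspecified.
res_power <- power.t.test(n = 99, delta = 2, sd = 5, sig.level = 0.05)

achieved_power <- res_power$power
print(round(achieved_power, 4))
print(achieved_power >= 0.80)  # does 99 per group reach 80% power?
```

The fractional n required for exactly 80% power is slightly above 99, which is why the sample size is rounded up to 100.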

Power in R
What does 80% power really
mean?
Is there something magical
about 80% power?
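One way to make "80% power" concrete is a small simulation: if the true difference really is 2 units, about 80% of repeated studies with 100 per group should reach p < 0.05. This is an illustrative sketch only; the seed, the number of simulated studies, and the choice of an HIV+ mean of 18 (2 units below the HIV- mean of 20) are assumptions.

```r
set.seed(2016)        # arbitrary seed, for reproducibility only
n_sims <- 2000        # number of simulated studies
n_per_group <- 100

p_values <- replicate(n_sims, {
  hiv_neg <- rnorm(n_per_group, mean = 20, sd = 5)
  hiv_pos <- rnorm(n_per_group, mean = 18, sd = 5)  # true difference of 2 units
  t.test(hiv_pos, hiv_neg, var.equal = TRUE)$p.value
})

# Share of simulated studies that reject H0 at alpha = 0.05
empirical_power <- mean(p_values < 0.05)
print(empirical_power)
```

The remaining roughly 20% of simulated studies are type II errors: a real effect exists, but the study fails to find it.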
Quick Re-Cap
◦ Compare HIV+ and HIV- individuals on a continuous, normally-distributed memory
score
◦ 2-sample t-test
◦ We need to enroll 100 HIV+ and 100 HIV- individuals into study to see a difference in
means of 2 units at 80% power
◦ At 90% power we need to enroll 133 HIV+ and 133 HIV- individuals into the study

Hypothetical Study Question
◦ Compare HIV+ and HIV- individuals on a binary variable
◦ Chi-Square test, our effect measure is OR (case-control study)
◦ We find previous literature using this measure with HIV- individuals and 15% reported
having experienced worse memory than 3 years earlier.
◦ We think that HIV+ people will have twice the odds of reporting worse memory
◦ How many subjects do we need to recruit in each group to see an odds ratio = 2?

Risk (p) vs. Odds (o)
$$o = \frac{p}{1 - p}, \qquad p = \frac{o}{1 + o}$$

$$\text{odds ratio} = OR = \frac{o_1}{o_2} = \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)}$$

$$p_1 = \frac{OR \cdot p_2}{1 - p_2 + OR \cdot p_2}$$

We find previous literature using this measure with HIV-
individuals and 15% reported having experienced
worse memory than 3 years earlier.
In this case 15% is a proportion or ‘risk’ and we need to
calculate an OR
For simple R sample size calculations we need p1 & p2
We have p2 (15%) & OR, need to estimate p1
With $o = p / (1 - p)$ and $p = o / (1 + o)$, and writing $p_1 = p_{HIV+}$, $p_2 = p_{HIV-}$:

$$OR = \frac{o_{HIV+}}{o_{HIV-}} = \frac{p_{HIV+} / (1 - p_{HIV+})}{p_{HIV-} / (1 - p_{HIV-})}$$

$$p_{HIV+} = \frac{OR \cdot p_{HIV-}}{1 - p_{HIV-} + OR \cdot p_{HIV-}}$$

With $p_{HIV-} = 0.15$ and $OR = 2.0$:

$$p_{HIV+} = \frac{2.0 \times 0.15}{1 - 0.15 + 2.0 \times 0.15} = 0.26$$

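The conversion from a reference proportion and an odds ratio to the second proportion can be sketched in R as follows. The helper names are hypothetical, not from the slides.

```r
# Risk <-> odds conversions
odds_from_risk <- function(p) p / (1 - p)
risk_from_odds <- function(o) o / (1 + o)

# Proportion in the comparison group implied by a reference
# proportion and an odds ratio: scale the reference odds by OR,
# then convert back to a proportion.
risk_from_or <- function(p_ref, or) {
  risk_from_odds(or * odds_from_risk(p_ref))
}

p_hiv_neg <- 0.15
p_hiv_pos <- risk_from_or(p_hiv_neg, or = 2)
print(round(p_hiv_pos, 2))
```

Note that doubling the odds does not double the proportion: 15% becomes about 26%, not 30%.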
Sample Size in R
[Slide shows R code with callouts: the R function; the parameters we need to add]
To see an OR = 2 at 80% power, with a proportion in the HIV-
group = 15%, we will need 211 in each group
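The slide's code is not reproduced in this transcript. A minimal sketch, assuming it used base R's `power.prop.test` with the two proportions 0.15 and 0.26 (the latter rounded, as in the worked calculation) and a two-sided α = 0.05:

```r
# Two-group comparison of proportions: leave n out so R solves for it.
res_prop <- power.prop.test(p1 = 0.15,       # HIV- proportion
                            p2 = 0.26,       # HIV+ proportion implied by OR = 2
                            sig.level = 0.05,
                            power = 0.80)

n_prop <- ceiling(res_prop$n)  # participants per group
print(n_prop)
```

Using the unrounded proportion (about 0.2609) gives a slightly smaller n, so the rounded value is mildly conservative here.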
◦ Compare HIV+ and HIV- individuals on a binary variable
◦ Chi-Square test, our effect measure is OR (case-control study)
◦ We decide we want to study 65-75 year olds. In that population 35% of HIV- individuals
report experiencing worse memory than 3 years earlier.
◦ We still think that HIV+ people will have twice the odds of reporting worse memory
◦ How many subjects do we need to recruit in each group to see an odds ratio = 2?

Sample Size in R
With $p_{HIV-} = 0.35$ and $OR = 2.0$:

$$p_{HIV+} = \frac{OR \cdot p_{HIV-}}{1 - p_{HIV-} + OR \cdot p_{HIV-}} = \frac{2.0 \times 0.35}{1 - 0.35 + 2.0 \times 0.35} = 0.52$$

Sample Size in R
To see an OR = 2 at 80% power, with a proportion in the HIV-
group = 35%, we will need 133 in each group
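A sketch of the corresponding call, assuming base R's `power.prop.test`, a two-sided α = 0.05, and the rounded proportion 0.52 for the HIV+ group:

```r
# Same odds ratio (2), but a higher baseline proportion in the HIV- group.
res_prop_older <- power.prop.test(p1 = 0.35,   # HIV- proportion
                                  p2 = 0.52,   # HIV+ proportion implied by OR = 2
                                  sig.level = 0.05,
                                  power = 0.80)

n_prop_older <- ceiling(res_prop_older$n)  # participants per group
print(n_prop_older)
```

The same odds ratio needs fewer participants here because the absolute difference in proportions is larger (0.17 versus 0.11).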

Sample Size presentation
◦ Not uncommon to present multiple possibilities in a power/sample size section of a
grant.
◦ Vary effect size, power and variability
◦ Do NOT vary significance level!
HIV+ (p)   HIV- (p)   OR    Power   n per group
0.26       0.15       2.0   80%     211
0.26       0.15       2.0   90%     281
0.35       0.15       3.0   80%     73
0.35       0.15       3.0   90%     97
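A table like this can be generated by looping over scenarios. The sketch below assumes base R's `power.prop.test` with a two-sided α = 0.05 held fixed; the data frame layout and column names are illustrative.

```r
# One row per scenario: proportions and target power vary, alpha does not.
scenarios <- data.frame(
  p_hiv_pos = c(0.26, 0.26, 0.35, 0.35),
  p_hiv_neg = c(0.15, 0.15, 0.15, 0.15),
  power     = c(0.80, 0.90, 0.80, 0.90)
)

# Solve for n per group in each scenario and round up.
scenarios$n_per_group <- mapply(
  function(p1, p2, pw) {
    ceiling(power.prop.test(p1 = p1, p2 = p2,
                            sig.level = 0.05, power = pw)$n)
  },
  scenarios$p_hiv_pos, scenarios$p_hiv_neg, scenarios$power
)

print(scenarios)
```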
Sample Size: Notes
◦ Calculation (once you have the inputs) is relatively simple, but estimation of ES can be
difficult
◦ Important to be conservative but maintain reason when estimating parameters
◦ Small changes in some parameters may have a large effect on the power
◦ In the end, it’s often a balancing act
◦ Take into account the # of tests and endpoints you have.
◦ Adjust alpha (sig.level in R) to control for multiple comparisons
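As one illustration of adjusting alpha, the sketch below applies a Bonferroni correction. Bonferroni is an assumption here (the slides do not name a method), as is the count of two tests (one primary and one secondary endpoint).

```r
n_tests <- 2                       # hypothetical number of tests
alpha_adjusted <- 0.05 / n_tests   # Bonferroni: split alpha across the tests

# n per group for the continuous endpoint, without and with the adjustment
n_unadjusted <- ceiling(power.t.test(delta = 2, sd = 5,
                                     sig.level = 0.05,
                                     power = 0.80)$n)
n_adjusted <- ceiling(power.t.test(delta = 2, sd = 5,
                                   sig.level = alpha_adjusted,
                                   power = 0.80)$n)

print(c(unadjusted = n_unadjusted, adjusted = n_adjusted))
```

A smaller alpha always requires a larger n for the same power and effect size.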

◦ What if we wanted to follow participants and measure change in memory over time?
◦ Longitudinal study
◦ Visit them at baseline, year 1, year 2 and year 3
◦ At the end of 3 years we ask them “Do you feel you have worse memory than 3 years
ago?”
◦ Does the change in study design affect our sample size calculation?
Longitudinal Study
◦ How many people do we need to enroll in each clinic (at 80% power) to see an
OR=2.0?

Longitudinal Study
◦ For prospective studies
◦ Need to take into account ‘drop-outs’
◦ Say you enroll 211 people at baseline
◦ Can you realistically expect to see 211 people at 1 year follow-up?
◦ What about at 3 years?
◦ Sample Size calculation is for number needed at END of study
◦ So you need an additional estimate for expected “loss to follow-up”
Longitudinal Study
◦ How many people do we need to enroll in each clinic (at 80% power) to see an
OR=2.0?
[Timeline: baseline → 1 year visit → 2 year visit → 3 year visit; need n=211 at the 3 year visit]

Longitudinal Study
◦ Start simpler
◦ Let’s say it was a 1 year study
◦ We expect to lose 10%
[Timeline: baseline (enroll X) → 1 year visit (need n=211)]
◦ If we expect to lose 10%, that means that at 1 year we expect to have 90% of what we
started with

$$\frac{X}{211} = \frac{100}{90} \quad\Rightarrow\quad X = \frac{211 \times 100}{90} = 234.4$$

◦ Enroll 235 subjects per group
Longitudinal: another option
◦ Start simpler
◦ Let’s say it was a 1 year trial
◦ We expect to lose 10%
[Timeline: baseline (enroll X) → 1 year visit (need n=211)]
◦ If we expect to lose 10%, that means that at 1 year we expect to have 90% of what we
started with

$$\frac{211}{X} = 0.90 \quad\Rightarrow\quad X = \frac{211}{0.9} = 234.4$$

◦ Enroll 235 subjects per group
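The one-year inflation can be written as a one-line helper. The function name is hypothetical.

```r
# Number to enroll so that n_needed remain after losing a fraction loss_rate.
enroll_for_dropout <- function(n_needed, loss_rate) {
  stopifnot(loss_rate >= 0, loss_rate < 1)
  ceiling(n_needed / (1 - loss_rate))  # round up to whole participants
}

n_enroll <- enroll_for_dropout(n_needed = 211, loss_rate = 0.10)
print(n_enroll)
```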

Longitudinal Study
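◦ Now do that 3 times
[Timeline: baseline → 1 year visit → 2 year visit → 3 year visit, losing 10% between consecutive visits]
◦ Need n=211 at the 3 year visit
◦ At the 2 year visit: n = 211/0.9 = 235
◦ At the 1 year visit: n = 235/0.9 = 262
◦ At baseline: n = 262/0.9 = 292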
Longitudinal: in 1 step?
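[Timeline: baseline → 1 year visit → 2 year visit → 3 year visit, losing 10% between consecutive visits; need n=211 at the 3 year visit]

$$n = \frac{211}{0.9^3} = 289.4 \approx 290$$

◦ The one-step answer (290) is a little smaller than the step-by-step answer (292) because the
step-by-step version rounds up at every visit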
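A sketch comparing the two routes for the 3 year study, assuming a constant 10% loss per year. Variable names are illustrative.

```r
n_needed  <- 211   # participants required at the 3 year visit
retention <- 0.90  # we expect to keep 90% each year
years     <- 3

# Route 1: work backward one year at a time, rounding up at each step
n_stepwise <- n_needed
for (year in seq_len(years)) {
  n_stepwise <- ceiling(n_stepwise / retention)
}

# Route 2: apply all three years of retention at once, then round up
n_one_step <- ceiling(n_needed / retention^years)

print(c(stepwise = n_stepwise, one_step = n_one_step))
```

The two answers differ by 2 purely because of the intermediate rounding; either is defensible as the number to enroll per group.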

Sample Size: Common PitFalls
◦ Drop outs
◦ Secondary Endpoints
◦ Multiplicity
◦ Recognizing Futility
◦ Choosing the wrong endpoint
◦ Massaging the parameters to get 80% power will not help you in the end!!!

Sample Size: Summary
◦ Perform sample size calculations during the design phase of your research
◦ Ensure that you will have enough power to detect a difference if one exists
◦ Absence of evidence of an effect is not the same as evidence of absence of an effect
(power may be too low)
◦ Know when to consult a statistician!
To consult the statistician after an experiment is finished is often merely to
ask him to conduct a post-mortem examination. He can perhaps say
what the experiment died of.
R.A. Fisher (1890-1962)
Sample Size: Software
◦ GraphPad Prism
◦ researcher user friendly
◦ point and click
◦ Free online tools (genetics based)
◦ Shaun Purcell: http://pngu.mgh.harvard.edu/~purcell/gpc/
◦ Quanto: http://hydra.usc.edu/gxe/
◦ Harvard/MGH: http://hedwig.mgh.harvard.edu/sample_size/size.html
◦ Others out there… but beware!
◦ R & R Studio www.r-project.org
