Seven Myths of Randomisation
in Clinical Trials
Stephen Senn
(c) Stephen Senn 2011-2015
Why this talk
• I had begun to notice that there were a
number of published criticisms of
randomisation in the methodology of science
literature
• These seemed to be accepted as valid by
others
• I felt a refutation was called for
The Magnificent Seven
• Patients are treated simultaneously
• Balance is necessary for valid inference
• Observed covariates can be ignored
• Randomisation is not necessary for blinding
• Randomisation is inefficient
• Randomisation precludes balancing
• Large trials have better balance
Outline
• A game of chance
• The seven myths
• My philosophy of randomisation and analysis
Game of Chance
• Two dice are rolled
– Red die
– Black die
• You have to call correctly the odds of a total score of 10
• Three variants
– Game 1: you call the odds and the dice are rolled together
– Game 2: the red die is rolled first, you are shown the score
and then must call the odds
– Game 3: the red die is rolled first, you are not
shown the score and then must call the odds
Total Score when Rolling Two Dice
Variant 1: Three of the 36 equally likely results give a total of 10. The probability is 3/36 = 1/12.
Variant 2: If the red die score is 1, 2 or 3, the probability of a total of 10 is 0. If
the red die score is 4, 5 or 6, the probability of a total of 10 is 1/6.
Variant 3: The probability = (½ x 0) + (½ x 1/6) = 1/12
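The three variants can be checked by direct enumeration. The sketch below is illustrative only; the function names are hypothetical and it simply counts outcomes for two fair six-sided dice.

```python
from fractions import Fraction
from itertools import product

TARGET = 10
FACES = range(1, 7)


def p_total_unconditional(target=TARGET):
    """Games 1 and 3: nothing is known about either die before calling."""
    hits = sum(1 for red, black in product(FACES, FACES) if red + black == target)
    return Fraction(hits, 36)


def p_total_given_red(red, target=TARGET):
    """Game 2: the red score has been seen, so only the black die is uncertain."""
    hits = sum(1 for black in FACES if red + black == target)
    return Fraction(hits, 6)


game1 = p_total_unconditional()
game2 = {red: p_total_given_red(red) for red in FACES}
# Game 3: average the conditional probabilities over the unseen red die.
game3 = sum(Fraction(1, 6) * p for p in game2.values())

print("Game 1:", game1)
print("Game 2:", {red: str(p) for red, p in game2.items()})
print("Game 3:", game3)
```

Averaging the Game 2 probabilities over the red die returns the Game 1 answer, which is why Game 3 can be treated like Game 1.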
The Morals
• You can’t treat game 2 like game 1.
– You must condition on the information you receive in order to act
wisely
– You must use the actual data from the red die
• You can treat game 3 like game 1.
– You can use the distribution in probability that the red die has
• You can’t ignore an observed prognostic covariate in analysing
a clinical trial just because you randomised
– That would be to treat game 2 like game 1
• You can ignore an unobserved covariate precisely because you
did randomise
– Because you are entitled to treat game 3 like game 1
The Reality
Trialists continue to use their
randomization as an excuse for ignoring
prognostic information (myth 3), and they
continue to worry about the effect of
factors they have not measured (myth 2).
Neither practice is logical.
Myth 1: Patients are treated
simultaneously
If, having created groups matched with respect to those ‘known’
factors, one then goes on to decide which will be the
experimental and which the control group by some random
process—in the simplest case by tossing a fair coin—then one
can do no epistemic harm, though one also does no further
epistemic good. Worrall 2007, p463.
For example, one could arrange for the matching to be
performed by a panel of doctors representing a spectrum of
opinion on the likely value of the drugs and whose criteria of
selection have been made explicit. Urbach, 1985, p272
All this is pretty obvious
• The point is that it is obvious to us
• It is not obvious to them
– Critics of randomisation writing on clinical trials
• You need to tell them to abandon the
deep-freeze microwave theory of clinical trials
• You can’t thaw patients out just when it suits
you
Myth 2:
Balance is necessary for validity
• It is generally held to be self-evident that a
trial which is not balanced is not valid.
• Trials are examined at baseline to establish
their validity.
• In fact the matter is not so simple ...
A Tale of Two Tables
Trial 1 (by treatment)
Sex      Verum   Placebo   Total
Male     34      26        60
Female   15      25        40
Total    49      51        100
Trial 2 (by treatment)
Sex      Verum   Placebo   Total
Male     26      26        52
Female   15      15        30
Total    41      41        82
• Trial two balanced but trial one not
• Surely trial two must be more reliable
• Things are not so simple
A Tale of Two Tables
Trial 1 (by treatment)
Sex      Verum   Placebo   Total
Male     26+8    26        60
Female   15      15+10     40
Total    49      51        100
Trial 2 (by treatment)
Sex      Verum   Placebo   Total
Male     26      26        52
Female   15      15        30
Total    41      41        82
• Trial one contains trial two
• How can more information be worse than less?
• If statistical theory could not deal with Trial 1
there would be something wrong with it
Stratification
All we need to do is compare like with like.
If we compare males with males and females with females we
shall obtain two unbiased estimators of the treatment effects.
These can then be combined in some appropriate way. This
technique is called stratification.
A similar approach called analysis of covariance is available to deal
with continuous covariates such as height, age or a baseline
measurement.
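A minimal sketch of stratification, using the Trial 1 cell counts from the tables above. The outcome model (baseline, sex effect and treatment effect) is a hypothetical, noise-free assumption chosen only to make the arithmetic visible; the weighting n1*n0/(n1+n0) is one common choice of "appropriate way" to combine strata, not necessarily the one the author had in mind.

```python
# Cell counts from Trial 1 (sex by treatment).
counts = {
    ("male", "verum"): 34, ("male", "placebo"): 26,
    ("female", "verum"): 15, ("female", "placebo"): 25,
}

# Hypothetical noise-free outcome model (illustrative values only).
BASELINE, SEX_EFFECT, TREATMENT_EFFECT = 10.0, 5.0, 2.0


def outcome(sex, arm):
    return BASELINE + SEX_EFFECT * (sex == "male") + TREATMENT_EFFECT * (arm == "verum")


def arm_mean(arm, sexes=("male", "female")):
    """Mean outcome in one arm, restricted to the given sexes."""
    n = sum(counts[(s, arm)] for s in sexes)
    total = sum(counts[(s, arm)] * outcome(s, arm) for s in sexes)
    return total / n


# Crude comparison ignores sex and is distorted by the imbalance.
crude = arm_mean("verum") - arm_mean("placebo")

# Stratified comparison: like with like, then a weighted combination.
estimates, weights = [], []
for sex in ("male", "female"):
    n1, n0 = counts[(sex, "verum")], counts[(sex, "placebo")]
    estimates.append(arm_mean("verum", (sex,)) - arm_mean("placebo", (sex,)))
    weights.append(n1 * n0 / (n1 + n0))
stratified = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

print(f"crude = {crude:.3f}, stratified = {stratified:.3f}")
```

Under these assumptions the crude difference absorbs part of the sex effect, whereas the stratified estimate recovers the assumed treatment effect despite the imbalance.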
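What you learn in your first regression course
\[
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
=
\begin{pmatrix}
1 & X_{11} & \cdots & X_{k1} \\
1 & X_{12} & \cdots & X_{k2} \\
\vdots & \vdots & \ddots & \vdots \\
1 & X_{1n} & \cdots & X_{kn}
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}
+
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
\]
\[
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\]
\[
\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
\]
\[
E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta}, \qquad
V(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}.
\]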
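The Value of Balance
\[
\operatorname{var}(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}
= \sigma^2
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1k} \\
a_{12} & a_{22} & \cdots & a_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1k} & a_{2k} & \cdots & a_{kk}
\end{pmatrix},
\qquad a_{22} \ge 2/n
\]
Here a22 is the variance multiplier for the treatment effect.
The value of σ² depends on the model.
For a given model, the value of a22 depends on the design and
this only achieves its lower bound when covariates are balanced.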
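The variance multiplier can be computed directly for the two trials in "A Tale of Two Tables". The sketch below assumes a model with an intercept, a treatment indicator and a sex indicator (so the treatment element is the second diagonal entry of (X'X)^-1); the helper name is hypothetical, and the bound 2/n is read here as applying with n patients per arm, which is an interpretation rather than something the slide states.

```python
import numpy as np


def treatment_variance_multiplier(cells):
    """Diagonal element of (X'X)^-1 for treatment.

    cells: list of (n_patients, treated, male) tuples describing the design.
    Model columns: intercept, treatment indicator, sex indicator.
    """
    rows = []
    for n, treated, male in cells:
        rows.extend([[1.0, float(treated), float(male)]] * n)
    X = np.array(rows)
    return np.linalg.inv(X.T @ X)[1, 1]


# (n, treated, male) for each cell of the two tables.
trial1 = [(34, 1, 1), (26, 0, 1), (15, 1, 0), (25, 0, 0)]
trial2 = [(26, 1, 1), (26, 0, 1), (15, 1, 0), (15, 0, 0)]

m1 = treatment_variance_multiplier(trial1)
m2 = treatment_variance_multiplier(trial2)

print(f"Trial 1 (imbalanced, 100 patients): {m1:.5f}")
print(f"Trial 2 (balanced, 82 patients):    {m2:.5f}")
print(f"Bound for 50 per arm:               {2 / 50:.5f}")
```

Under this model the balanced Trial 2 attains 2/41 exactly, while the imbalanced but larger Trial 1 has a smaller multiplier than Trial 2 and sits only slightly above the 2/50 it could have reached with perfect balance.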
Myth 3
Observed covariates can be ignored
• This is wrong whether or not covariates are imbalanced
• Nobody would analyse a matched pairs design like a
completely randomised design
• However two classes of statisticians are implicitly signing up
to this
– Those who minimise
– Those who use the propensity score
The Problem with Minimisation
• Many public sector trials are minimised but
not strictly randomised
– That is to say a dynamic form of balancing is
employed
• Often the covariates used for balancing are
not fitted in the model
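For readers unfamiliar with minimisation, here is a minimal sketch of a deterministic, Taves-style scheme with random tie-breaking. It is an illustration under stated assumptions (two arms, equal factor weights, hypothetical function and variable names) and not the algorithm used by any particular trial.

```python
import random

ARMS = ("A", "B")


def minimise(patients, seed=0):
    """Allocate patients sequentially so as to minimise marginal imbalance.

    patients: list of dicts mapping factor name -> level, in order of entry.
    Returns the list of allocated arms.
    """
    rng = random.Random(seed)
    counts = {}  # (factor, level, arm) -> patients already allocated
    allocations = []
    for patient in patients:
        # Score of an arm: earlier patients in that arm who share each of
        # this patient's factor levels.
        scores = {
            arm: sum(counts.get((f, lvl, arm), 0) for f, lvl in patient.items())
            for arm in ARMS
        }
        best = min(scores.values())
        candidates = [arm for arm in ARMS if scores[arm] == best]
        arm = rng.choice(candidates)  # chance enters only when scores tie
        for f, lvl in patient.items():
            counts[(f, lvl, arm)] = counts.get((f, lvl, arm), 0) + 1
        allocations.append(arm)
    return allocations


same = {"sex": "male", "age": "over 65"}
print(minimise([same, same]))  # the second patient is forced into the other arm
```

The usage line shows why such a scheme is not strictly randomised: once the first patient is placed, the allocation of an identical second patient is fully determined by the covariates.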
Typical MRC Stuff
‘The central telephone randomisation system used a minimisation algorithm to
balance the treatment groups with respect to eligibility criteria and other major
prognostic factors.’ (p24)
‘All comparisons involved logrank analyses of the first occurrence of particular
events during the scheduled treatment period after randomisation among all those
allocated the vitamins versus all those allocated matching placebo capsules (ie,
they were “intention-to treat” analyses).’ (p24)
1. (2002) MRC/BHF Heart Protection Study of cholesterol lowering
with simvastatin in 20,536 high-risk individuals: a randomised placebo-
controlled trial. Lancet 360:7-22
Corollary – unobserved covariates can
be ignored if you have randomised
• The error is to assume that because you can’t use
randomisation as a justification for ignoring
information it is useless
• It is useful for what you don’t see
• Knowing that the two-dice game is fairly run is
important even though the average probability is not
relevant to game two
• Average probabilities are important for calibrating your
inferences
o Your conditional probabilities must be coherent with your
marginal ones
▪ See the relationship between the games
A Red Herring
• One sometimes hears that the fact that there are
indefinitely many covariates means that
randomisation is useless
• This is quite wrong
• It is based on a misunderstanding that variant 3 of
our game should not be analysed like variant 1
• I showed you that it should
You are not free to imagine anything
at all
• Imagine that you are in
control of all the thousands
and thousands of covariates
that patients will have
• You are now going to allocate
the covariates and their
effects to patients
o As in a simulation
• If you respect the actual
variation in human health that
there can be you will find that
the net total effect of these
covariates is bounded
\[ Y = \beta_0 + Z + \beta_1 X_1 + \cdots + \beta_k X_k + \cdots \]
Where Z is a treatment indicator and the
X are covariates. You are not free to
arbitrarily assume any values you like for
the Xs and the 𝛽𝑠 because the variance of
Y must be respected.
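The point about bounded total effect can be illustrated by simulation. The sketch below makes several assumptions not in the slides: independent standard Normal covariates, equal coefficients scaled so that the total covariate variance is fixed at 1, and two arms of 50 patients. All names are hypothetical.

```python
import numpy as np


def variance_of_net_imbalance(k, n_per_arm=50, total_variance=1.0,
                              n_trials=1000, seed=1):
    """Variance, over randomisations, of the between-arm difference in the
    net effect of k covariates whose combined variance is held fixed."""
    rng = np.random.default_rng(seed)
    beta = np.full(k, np.sqrt(total_variance / k))  # equal shares (assumed)
    diffs = np.empty(n_trials)
    for t in range(n_trials):
        X = rng.standard_normal((2 * n_per_arm, k))
        net = X @ beta  # net covariate effect for each patient
        # Random split into two arms of equal size.
        in_first_arm = rng.permutation(2 * n_per_arm) < n_per_arm
        diffs[t] = net[in_first_arm].mean() - net[~in_first_arm].mean()
    return diffs.var()


results = {k: variance_of_net_imbalance(k) for k in (1, 10, 200)}
theory = 2 * 1.0 / 50  # 2 * total_variance / n_per_arm

for k, v in results.items():
    print(f"k = {k:3d}: simulated variance = {v:.4f} (theory {theory:.4f})")
```

Under these assumptions the variability of the net imbalance depends on the total variance and the sample size, not on how many covariates that variance is divided among.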
The importance of ratios
• In fact from one point of view there is only one covariate that
matters
o potential outcome
▪ If you know this, all other covariates are irrelevant
• And just as this can vary between groups it can vary within
• The t-statistic is based on the ratio of differences between groups to
variation within groups
• Randomisation guarantees (to a good approximation) the
unconditional behaviour of this ratio and that is all that
matters for what you can’t see (game 3)
• An example follows
Hills and Armitage Enuresis Data
[Figure: scatter plot of dry nights, with "Dry nights placebo" as an axis label and a line of equality; points are marked by sequence (drug then placebo, placebo then drug).]
Cross-over trial in enuresis
Two treatment periods of
14 days each
1. Hills, M, Armitage, P. The two-period
cross-over clinical trial, British Journal of Clinical
Pharmacology 1979; 8: 7-20.
[Figure: two permutation distributions of the permuted treatment effect, one conditioning on patient (red) and one not (black), with the observed treatment effect marked by a blue diamond.]
The blue diamond shows the treatment effect, which is the same whether or
not we condition on patient as a factor.
It is identical because the trial is balanced by patient.
However, the permutation distribution is quite different,
and our inferences differ, according to whether we
condition (red) or not (black). Clearly,
balancing the randomisation by patient and not
conditioning the analysis by patient is wrong.
The two permutation* distributions summarised
Summary statistics for permuted difference
Statistic                          No blocking   Blocking
Number of observations             10000         10000
Mean                               0.00561       0.00330
Median                             0.0345        0.0345
Minimum                            -3.828        -2.379
Maximum                            3.621         2.517
Lower quartile                     -0.655        -0.517
Upper quartile                     0.655         0.517
P-value for observed difference    0.0340        0.0014
*Strictly speaking randomisation distributions
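The contrast between the two randomisation distributions can be reproduced in miniature. The sketch below uses hypothetical cross-over data (not the Hills and Armitage data) with large patient-to-patient differences, and compares an exact within-patient randomisation test with a Monte Carlo test that shuffles labels across all observations. Function names are illustrative.

```python
import random
from itertools import product

# Hypothetical cross-over data: 10 patients with very different levels.
patient_level = [10 * i for i in range(10)]
diffs = [1, 3, 2, 2, 1, 3, 2, 2, 3, 1]  # drug minus placebo, per patient
drug = [b + d for b, d in zip(patient_level, diffs)]
placebo = list(patient_level)
n = len(diffs)

# Totals are compared instead of means; dividing both sides by n changes nothing.
observed_total = sum(drug) - sum(placebo)


def p_within_patient():
    """Exact test: labels are swapped inside each patient (2**n sign flips)."""
    hits = 0
    for signs in product((1, -1), repeat=n):
        total = sum(s * d for s, d in zip(signs, diffs))
        hits += abs(total) >= abs(observed_total)
    return hits / 2 ** n


def p_completely_randomised(n_perm=5000, seed=0):
    """Monte Carlo test: labels are shuffled across all 2n observations."""
    rng = random.Random(seed)
    pooled = drug + placebo
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        total = sum(pooled[:n]) - sum(pooled[n:])
        hits += abs(total) >= abs(observed_total)
    return hits / n_perm


p_blocked = p_within_patient()
p_unblocked = p_completely_randomised()
print(f"within-patient p = {p_blocked:.4f}, completely randomised p = {p_unblocked:.4f}")
```

With these made-up numbers the observed effect is identical in both analyses, but only the analysis that respects the within-patient randomisation recognises how unusual it is.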
Two Parametric Approaches
Approach                      Estimate   s.e.    t      d.f.   t pr.     Permutation P-value
Not fitting patient effect    2.172      0.964   2.25   56     0.0282    0.034
Fitting patient effect        2.172      0.616   3.53   28     0.00147   0.0014
What happens if you balance but don’t condition?
Column 1: variance of estimated treatment effect over all randomisations*
Column 2: mean of variance of estimated treatment effect over all randomisations*
Approach                                                        Column 1   Column 2
Completely randomised, analysed as such                         0.987      0.996
Randomised within-patient, analysed as such                     0.534      0.529
Randomised within-patient, analysed as completely randomised    0.534      1.005
*Based on 10000 random permutations
Third row: that is to say, the values are permuted respecting the fact that they come from a
cross-over but analysed as if they came from a parallel group trial.
In terms of t-statistics
Column 1: observed variance of t-statistic over all randomisations*
Column 2: predicted theoretical variance
Approach                                                        Column 1   Column 2
Completely randomised, analysed as such                         1.027      1.037
Randomised within-patient, analysed as such                     1.085      1.077
Randomised within-patient, analysed as completely randomised    0.534      1.037@
*Based on 10000 random permutations
@ Using the common falsely assumed theory
The Shocking Truth
• The validity of conventional analysis of randomised
trials does not depend on covariate balance
• It is valid because they are not perfectly balanced
• If they were balanced the standard analysis would
be wrong
Myth 4
Randomisation is Not Necessary for Blinding
Fisher, in a letter to Jeffreys, explained the dangers of using a
haphazard method thus
… if I want to test the capacity of the human race for
telepathically perceiving a playing card, I might choose the
Queen of Diamonds, and get thousands of radio listeners to
send in guesses. I should then find that considerably more
than one in 52 guessed the card right... Experimentally this
sort of thing arises because we are in the habit of making
tacit hypotheses, e.g. ‘Good guesses are at random except for
a possible telepathic influence.’ But in reality it appears that
red cards are always guessed more frequently than
black. (Bennett, 1990, pp268-269)
…if the trial was, and remained, double-blind then
randomization could play no further role in this respect.
(Worrall, 2007, p454)
Avoiding Double Guessing
• If you don’t randomise you have to assume
that your strategy has not been guessed by
the investigator
• You are using ‘the argument from the
stupidity of others’
• Not publishing the block size in your protocol
is a classic example
Myth 5
Randomisation is Inefficient
• There is a sense in which this is no myth
• Randomisation is not fully efficient
• Theory shows that there is a loss of about one
patient per factor fitted compared to a
completely balanced design
– Such completely balanced designs are not usually
possible, however
• In any case, the loss is small
An Example
Linear Trend in Prognosis
The figures refer to the difference in position between B and A. Of course
alternation means that the Bs are on average one place beyond the As. The other
schemes are ‘unbiased’. Since alternation and the double sandwich are
deterministic they have no variance.
It is assumed that there are 2n patients in total and that n is an even number.
Myth 6
Randomisation precludes balancing
• Of course we know this is
not true
• We can build strata and
randomise within them
• ‘Balance what you can
and randomise what you
can’t’ was Fisher’s recipe
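Building strata and randomising within them can be sketched as permuted-block randomisation applied separately to each stratum. The code below is an illustration under assumptions (two arms, a fixed even block size, hypothetical names), not a production allocation system.

```python
import random
from collections import defaultdict


def stratified_block_randomise(strata, block_size=4, seed=0):
    """Allocate arms 'A'/'B' using permuted blocks within each stratum.

    strata: list of stratum labels, one per patient, in order of entry.
    """
    if block_size % 2:
        raise ValueError("block_size must be even")
    rng = random.Random(seed)
    pending = defaultdict(list)  # stratum -> unused allocations in current block
    allocations = []
    for stratum in strata:
        if not pending[stratum]:
            # Start a new balanced block for this stratum in random order.
            block = ["A", "B"] * (block_size // 2)
            rng.shuffle(block)
            pending[stratum] = block
        allocations.append(pending[stratum].pop())
    return allocations


entry_order = ["male", "female"] * 8
arms = stratified_block_randomise(entry_order)
for sex in ("male", "female"):
    in_stratum = [a for s, a in zip(entry_order, arms) if s == sex]
    print(sex, "A:", in_stratum.count("A"), "B:", in_stratum.count("B"))
```

Sex is balanced by design here, while everything else is left to chance, which is the sense of the recipe quoted above.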
Myth 7
Large trials are more balanced than small ones
Measure of balance                     Comparison large v small (on average)
Mean difference at baseline            Large trial is more balanced
Total difference at baseline           Small trial is more balanced
Standardised difference at baseline    Large and small trial equally balanced
• Large trials have narrower confidence intervals for the
treatment effect
• The advantage of increased mean balance in covariates has already
been consumed in the form of narrower limits
• There is no further insurance to be given by size
– Only increase in validity is because closer to asymptotic
limit that guarantees Normality
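The three rows of the table follow from elementary variance calculations. As a sketch, assume a baseline covariate X with variance σ² measured independently on n patients in each of two arms (the symbols and the equal-arm assumption are introduced here for illustration):

```latex
\[
\operatorname{Var}\bigl(\bar{X}_1 - \bar{X}_2\bigr) = \frac{2\sigma^2}{n},
\qquad
\operatorname{Var}\Bigl(\sum_{i=1}^{n} X_{1i} - \sum_{i=1}^{n} X_{2i}\Bigr) = 2n\sigma^2,
\qquad
\operatorname{Var}\Bigl(\frac{\bar{X}_1 - \bar{X}_2}{\sigma\sqrt{2/n}}\Bigr) = 1 .
\]
```

The mean difference shrinks as n grows, the total difference grows, and the standardised difference has the same distribution whatever the size of the trial.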
My Philosophy of Clinical Trials
• Your (reasonable) beliefs dictate the model
• You should try to measure what you think is important
• You should try to fit what you have measured
– Caveat : random regressors and the Gauss-Markov theorem
• If you can balance what is important so much the better
– But fitting is more important than balancing
• Randomisation deals with unmeasured covariates
– You can use the distribution in probability of unmeasured covariates
– For measured covariates you must use the actual observed distribution
• Claiming to do ‘conservative inference’ is just a convenient
way of hiding bad practice
– Who thinks that analysing a matched pairs t as a two sample t is acceptable?
What’s out and What’s in
Out
• Log-rank test
• T-test on change scores
• Chi-square tests on 2 x 2 tables
• Responder analysis and dichotomies
• Balancing as an excuse for not conditioning
In
• Proportional hazards
• Analysis of covariance fitting baseline
• Logistic regression fitting covariates
• Analysis of original values
• Modelling as a guide for designs
Unresolved Issue
• In principle you should never be worse off by
having more information
• The ordinary least squares approach has two
potential losses in fitting covariates
– Loss of orthogonality
– Losses of degrees of freedom
• This means that eventually we lose by fitting
more covariates
Resolution?
• The Gauss-Markov theorem does not apply to
stochastic regressors
• In theory we can do better by having random effect
models
• However there are severe practical difficulties
• Possible Bayesian resolution in theory
• A pragmatic compromise of a limited number of
prognostic factors may be reasonable
To sum up
• There are a lot of people out there who fail to
understand what randomisation can and
cannot do for you
• We need to tell them firmly and clearly what
they need to understand
Finally
I leave you with
this thought
Statisticians are always
tossing coins but do not
own many
43(c)Stephen Senn 2011-2015

More Related Content

PPT
Is ignorance bliss
PPTX
What should we expect from reproducibiliry
PPTX
Minimally important differences v2
PPTX
NNTs, responder analysis & overlap measures
PPTX
Seventy years of RCTs
PPT
Why I hate minimisation
PPTX
What is your question
PPTX
On being Bayesian
Is ignorance bliss
What should we expect from reproducibiliry
Minimally important differences v2
NNTs, responder analysis & overlap measures
Seventy years of RCTs
Why I hate minimisation
What is your question
On being Bayesian

What's hot (20)

PPTX
The revenge of RA Fisher
PPTX
In search of the lost loss function
PPTX
Minimally important differences
PDF
The Rothamsted school meets Lord's paradox
PPT
Yates and cochran
PPTX
Numbers needed to mislead
PPTX
The revenge of RA Fisher
PPT
First in man tokyo
PPTX
De Finetti meets Popper
PPTX
Thinking statistically v3
PPTX
The Seven Habits of Highly Effective Statisticians
PPTX
Approximate ANCOVA
PPTX
Clinical trials: quo vadis in the age of covid?
PPTX
Trends towards significance
PPTX
Clinical trials: three statistical traps for the unwary
PPTX
P value wars
PPTX
Understanding randomisation
PPTX
Real world modified
PPTX
The challenge of small data
PPTX
Personalised medicine a sceptical view
The revenge of RA Fisher
In search of the lost loss function
Minimally important differences
The Rothamsted school meets Lord's paradox
Yates and cochran
Numbers needed to mislead
The revenge of RA Fisher
First in man tokyo
De Finetti meets Popper
Thinking statistically v3
The Seven Habits of Highly Effective Statisticians
Approximate ANCOVA
Clinical trials: quo vadis in the age of covid?
Trends towards significance
Clinical trials: three statistical traps for the unwary
P value wars
Understanding randomisation
Real world modified
The challenge of small data
Personalised medicine a sceptical view
Ad

Similar to Seven myths of randomisation (20)

PPT
Randomization
PPTX
Randomized Controlled Trials (RCTs)
PPTX
What is your question
PPTX
EXPERIMENTAL EPIDEMIOLOGY
PDF
Randomisation techniques
PPTX
PPTX
Understanding clinical trial's statistics
PPT
Randomized Controlled Trials
PPTX
Randomization
PPT
Weinberg-study-design-full-set.ppt
PPTX
Whatever happened to design based inference
PPT
Randomized CLinical Trail
PPTX
2024_guidline rule_r201-study design.pptx
PPTX
r201-study design for research study.pptx
PPTX
Randomization
PPT
Randomised Controlled Trials
PPTX
Clinical trial design
PPTX
Levels of evidence and design of clinical trail
PPTX
Randomized Controlled Trials.pptx
Randomization
Randomized Controlled Trials (RCTs)
What is your question
EXPERIMENTAL EPIDEMIOLOGY
Randomisation techniques
Understanding clinical trial's statistics
Randomized Controlled Trials
Randomization
Weinberg-study-design-full-set.ppt
Whatever happened to design based inference
Randomized CLinical Trail
2024_guidline rule_r201-study design.pptx
r201-study design for research study.pptx
Randomization
Randomised Controlled Trials
Clinical trial design
Levels of evidence and design of clinical trail
Randomized Controlled Trials.pptx
Ad

More from Stephen Senn (9)

PPTX
Has modelling killed randomisation inference frankfurt
PPTX
Vaccine trials in the age of COVID-19
PPTX
To infinity and beyond v2
PPT
A century of t tests
PPTX
To infinity and beyond
PPTX
In Search of Lost Infinities: What is the “n” in big data?
PPT
The story of MTA/02
PPT
Confounding, politics, frustration and knavish tricks
PPTX
And thereby hangs a tail
Has modelling killed randomisation inference frankfurt
Vaccine trials in the age of COVID-19
To infinity and beyond v2
A century of t tests
To infinity and beyond
In Search of Lost Infinities: What is the “n” in big data?
The story of MTA/02
Confounding, politics, frustration and knavish tricks
And thereby hangs a tail

Recently uploaded (20)

PDF
OSCE SERIES ( Questions & Answers ) - Set 3.pdf
PPTX
IMAGING EQUIPMENiiiiìiiiiiTpptxeiuueueur
PPTX
HYPERSENSITIVITY REACTIONS - Pathophysiology Notes for Second Year Pharm D St...
PDF
OSCE SERIES ( Questions & Answers ) - Set 5.pdf
PDF
Calcified coronary lesions management tips and tricks
PPTX
Human Reproduction: Anatomy, Physiology & Clinical Insights.pptx
PPTX
Approach to chest pain, SOB, palpitation and prolonged fever
PDF
OSCE Series ( Questions & Answers ) - Set 6.pdf
PPTX
Vaccines and immunization including cold chain , Open vial policy.pptx
PPTX
Post Op complications in general surgery
PDF
The_EHRA_Book_of_Interventional Electrophysiology.pdf
DOCX
PEADIATRICS NOTES.docx lecture notes for medical students
PPTX
Introduction to Medical Microbiology for 400L Medical Students
PPTX
Wheat allergies and Disease in gastroenterology
PDF
Lecture 8- Cornea and Sclera .pdf 5tg year
PDF
04 dr. Rahajeng - dr.rahajeng-KOGI XIX 2025-ed1.pdf
PPTX
Manage HIV exposed child and a child with HIV infection.pptx
PDF
Copy of OB - Exam #2 Study Guide. pdf
PPTX
NRP and care of Newborn.pptx- APPT presentation about neonatal resuscitation ...
PDF
Comparison of Swim-Up and Microfluidic Sperm Sorting.pdf
OSCE SERIES ( Questions & Answers ) - Set 3.pdf
IMAGING EQUIPMENiiiiìiiiiiTpptxeiuueueur
HYPERSENSITIVITY REACTIONS - Pathophysiology Notes for Second Year Pharm D St...
OSCE SERIES ( Questions & Answers ) - Set 5.pdf
Calcified coronary lesions management tips and tricks
Human Reproduction: Anatomy, Physiology & Clinical Insights.pptx
Approach to chest pain, SOB, palpitation and prolonged fever
OSCE Series ( Questions & Answers ) - Set 6.pdf
Vaccines and immunization including cold chain , Open vial policy.pptx
Post Op complications in general surgery
The_EHRA_Book_of_Interventional Electrophysiology.pdf
PEADIATRICS NOTES.docx lecture notes for medical students
Introduction to Medical Microbiology for 400L Medical Students
Wheat allergies and Disease in gastroenterology
Lecture 8- Cornea and Sclera .pdf 5tg year
04 dr. Rahajeng - dr.rahajeng-KOGI XIX 2025-ed1.pdf
Manage HIV exposed child and a child with HIV infection.pptx
Copy of OB - Exam #2 Study Guide. pdf
NRP and care of Newborn.pptx- APPT presentation about neonatal resuscitation ...
Comparison of Swim-Up and Microfluidic Sperm Sorting.pdf

Seven myths of randomisation

  • 1. Seven Myths of Randomisation in Clinical Trials Stephen Senn 1(c)Stephen Senn 2011-2015
  • 2. Why this talk • I had begun to notice that there were a number of published criticisms of randomisation in the methodology of science literature of randomisation • These seemed to be accepted as valid by others • I felt a refutation was called for 2(c)Stephen Senn 2011-2015
  • 3. The Magnificent Seven • Patients are treated simultaneously • Balance is necessary for valid inference • Observed covariates can be ignored • Randomisation is not necessary for blinding • Randomisation is inefficient • Randomisation precludes balancing • Large trials have better balance 3(c)Stephen Senn 2011-2015
  • 4. Outline • A game of chance • The seven myths • My philosophy of randomisation and analysis 4(c)Stephen Senn 2011-2015
  • 5. Game of Chance • Two dice are rolled – Red die – Black die • You have to call correctly the odds of a total score of 10 • Three variants – Game 1 You call the odds and the dice are rolled together – Game 2 the red die is rolled first, you are shown the score and then must call the odds – Game 3 the Game 2 the red die is rolled first, you are not shown the score and then must call the odds 5(c)Stephen Senn 2011-2015
  • 6. Total Score when Rolling Two Dice Variant 1. Three of 36 equally likely results give a 10. The probability is 3/36=1/12. 6(c)Stephen Senn 2011-2015
  • 7. Variant 2: If the red die score is 1,2 or 3, probability of a total of10 is 0. If the red die score is 4,5 or 6 the probability of a total of10 is 1/6. Variant 3: The probability = (½ x 0) + (½ x 1/6) = 1/12 Total Score when Rolling Two Dice 7(c)Stephen Senn 2011-2015
  • 8. The Morals • You can’t treat game 2 like game 1. – You must condition on the information you receive in order to act wisely – You must use the actual data from the red die • You can treat game 3 like game 1. – You can use the distribution in probability that the red die has • You can’t ignore an observed prognostic covariate in analysing a clinical trial just because you randomised – That would be to treat game 2 like game 1 • You can ignore an unobserved covariate precisely because you did randomise – Because you are entitled to treat game 3 like game 1 8(c)Stephen Senn 2011-2015
  • 9. Trialists continue to use their randomization as an excuse for ignoring prognostic information (myth 3), and they continue to worry about the effect of factors they have not measured (myth 2). Neither practice is logical. The Reality 9(c)Stephen Senn 2011-2015
  • 10. Myth 1: Patients are treated simultaneously If, having created groups matched with respect to those ‘known’ factors, one then goes on to decide which will be the experimental and which the control group by some random process—in the simplest case by tossing a fair coin—then one can do no epistemic harm, though one also does no further epistemic good. Worrall 2007, p463. For example, one could arrange for the matching to be performed by a panel of doctors representing a spectrum of opinion on the likely value of the drugs and whose criteria of selection have been made explicit. Urbach, 1985, p272 10(c)Stephen Senn 2011-2015
  • 11. All this is pretty obvious • The point is that it is obvious to us • It is not obvious to them – Critics of randomisation writing on clinical trials • You need to tell them to abandon the deep- freeze microwave theory of clinical trials • You can’t thaw patients out just when it suits you 11(c)Stephen Senn 2011-2015
  • 12. Myth 2: Balance is necessary for validity • It is generally held as being self evident that a trial which is not balanced is not valid. • Trials are examined at baseline to establish their validity. • In fact the matter is not so simple........... 12(c)Stephen Senn 2011-2015
  • 13. A Tale of Two Tables Trial 1 Treatment Sex Verum Placebo Total Male 34 26 60 Female 15 25 40 Total 49 51 100 Trial 2 Treatment Sex Verum Placebo Male 26 26 52 Female 15 15 30 Total 41 41 82 • Trial two balanced but trial one not • Surely trial two must be more reliable • Things are not so simple 13(c)Stephen Senn 2011-2015
  • 14. A Tale of Two Tables Trial 1 Treatment Sex Verum Placebo Total Male 26+8 26 60 Female 15 15+10 40 Total 49 51 100 Trial 2 Treatment Sex Verum Placebo Male 26 26 52 Female 15 15 30 Total 41 41 82 • Trial two contains trial one • How can more information be worse than less • If statistical theory could not deal with Trial 1 there would be something wrong with it 14(c)Stephen Senn 2011-2015
  • 15. Stratification All we need to do is compare like with like. If we compare males with males and females with females we shall obtain two unbiased estimators of the treatment effects. These can then be combined in some appropriate way. This technique is called stratification. A similar approach called analysis of covariance is available to deal with continuous covariates such as height, age or a baseline measurement. 15(c)Stephen Senn 2011-2015
  • 16. What you learn in your first regression course 1 11 1 0 1 2 12 2 1 2 1 1 1 ... 1 ... k k n n kn k n Y X X Y X X Y X X                                                          X β ε Y = Xβ +ε L M M M O M M M Y  ˆ   -1 β = X X X Y   1 2ˆ ˆ( ) , ( ) .E V    β β β XX 16(c)Stephen Senn 2011-2015
  • 17. 1 2 11 12 1 12 22 2 1 22 ˆvar( ) ( ) 2/ k k kk X X a a a a a a a a n                   The value of 2 depends on the model. For a given model, the value of a22 depends on the design and this only achieves its lower bound when covariates are balanced. The Value of Balance Variance multiplier for the treatment effect 17(c)Stephen Senn 2011-2015
  • 18. Myth 3 Observed covariates can be ignored • This is wrong whether or not covariates are imbalanced • Nobody would analyse a matched pairs design like a completely randomised design • However two classes of statisticians are implicitly signing up to this – Those who minimise – Those who use the propensity score 18(c)Stephen Senn 2011-2015
  • 19. The Problem with Minimisation • Many public sector trials are minimised but not strictly randomised – That is to say a dynamic form of balancing is employed • Often the covariates used for balancing are not fitted in the model 19(c)Stephen Senn 2011-2015
  • 20. Typical MRC Stuff ‘The central telephone randomisation system used a minimisation algorithm to balance the treatment groups with respect to eligibility criteria and other major prognostic factors.’ (p24) ‘All comparisons involved logrank analyses of the first occurrence of particular events during the scheduled treatment period after randomisation among all those allocated the vitamins versus all those allocated matching placebo capsules (ie, they were “intention-to treat” analyses).’ (p24) 1. (2002) MRC/BHF Heart Protection Study of cholesterol lowering with simvastatin in 20,536 high-risk individuals: a randomised placebo- controlled trial. Lancet 360:7-22 20(c)Stephen Senn 2011-2015
  • 21. Corollary – unobserved covariates can be ignored if you have randomised • The error is to assume that because you can’t use randomisation as a justification for ignoring information it is useless • It is useful for what you don’t see • Knowing that the two-dice game is fairly run is important even though the average probability is not relevant to game two • Average probabilities are important for calibrating your inferences o Your conditional probabilities must be coherent with your marginal ones  See the relationship between the games (C) Stephen Senn 2014 21
  • 22. A Red Herring • One sometimes hears that the fact that there are indefinitely many covariates means that randomisation is useless • This is quite wrong • It is based on a misunderstanding that variant 3 of our game should not be analysed like variant 1 • I showed you that it should (c)Stephen Senn 2013 22
  • 23. You are not free to imagine anything at all • Imagine that you are in control of all the thousands and thousands of covariates that patients will have • You are now going to allocate the covariates and their effects to patients o As in a simulation • If you respect the actual variation in human health that there can be you will find that the net total effect of these covariates is bounded 𝑌 = 𝛽0 + 𝑍 + 𝛽1 𝑋1 + ⋯ 𝛽 𝑘 𝑋 𝑘 + ⋯ Where Z is a treatment indicator and the X are covariates. You are not free to arbitrarily assume any values you like for the Xs and the 𝛽𝑠 because the variance of Y must be respected. (c)Stephen Senn 2013 23
  • 24. The importance of ratios • In fact from one point of view there is only one covariate that matters o potential outcome  If you know this, all other covariates are irrelevant • And just as this can vary between groups in can vary within • The t-statistic is based on the ratio of differences between to variation within • Randomisation guarantees (to a good approximation) the unconditional behaviour of this ratio and that is all that matters for what you can’t see (game 3) • An example follows (c)Stephen Senn 2013 24
  • 25. Hills and Armitage Enuresis Data • Cross-over trial in enuresis • Two treatment periods of 14 days each • [Figure: plot of dry nights with an axis labelled ‘Dry nights placebo’, a line of equality, and points identified by sequence (drug then placebo; placebo then drug)] • 1. Hills, M, Armitage, P. The two-period cross-over clinical trial, British Journal of Clinical Pharmacology 1979; 8: 7-20.
  • 26. [Figure: distributions of the permuted treatment effect, conditioning on patient (red) and not conditioning (black), with the observed treatment effect marked by a blue diamond] • The blue diamond shows the treatment effect, whether or not we condition on patient as a factor. It is identical because the trial is balanced by patient. • However, the permutation distribution is quite different, and our inferences differ according to whether we condition (red) or not (black). • Clearly, balancing the randomisation by patient and not conditioning the analysis by patient is wrong.
  • 27. The two permutation* distributions summarised
    Summary statistics for permuted difference
                                      No blocking    Blocking
    Number of observations            10000          10000
    Mean                              0.00561        0.00330
    Median                            0.0345         0.0345
    Minimum                           -3.828         -2.379
    Maximum                           3.621          2.517
    Lower quartile                    -0.655         -0.517
    Upper quartile                    0.655          0.517
    P-value for observed difference   0.0340         0.0014
    *Strictly speaking, randomisation distributions
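The two randomisation distributions summarised above can be generated along the following lines. This is a sketch on synthetic data, not the Hills and Armitage data: the patient effects, noise levels and treatment effect are invented for illustration, and only the number of patients (29, consistent with 28 degrees of freedom in the paired analysis) is taken from the slides.

```python
import random
import statistics

rng = random.Random(2015)
N_PATIENTS = 29   # consistent with t(28) for the paired analysis
N_PERM = 2000     # fewer than the 10000 on the slide, to keep the sketch quick

# Synthetic cross-over data (assumption): large patient effect, small noise.
patient = [rng.gauss(7, 3) for _ in range(N_PATIENTS)]
drug = [u + 2 + rng.gauss(0, 1) for u in patient]
placebo = [u + rng.gauss(0, 1) for u in patient]
observed = statistics.mean(drug) - statistics.mean(placebo)

def blocked_perm():
    """Re-randomise within patient: swap the two labels with probability 1/2."""
    diffs = [(d - p) * rng.choice((1, -1)) for d, p in zip(drug, placebo)]
    return statistics.mean(diffs)

def unblocked_perm():
    """Re-randomise ignoring patient: shuffle all 58 values across the labels."""
    pooled = drug + placebo
    rng.shuffle(pooled)
    return (statistics.mean(pooled[:N_PATIENTS])
            - statistics.mean(pooled[N_PATIENTS:]))

def two_sided_p(dist):
    """Proportion of permuted effects at least as extreme as the observed one."""
    return sum(abs(x) >= abs(observed) for x in dist) / len(dist)

blocked = [blocked_perm() for _ in range(N_PERM)]
unblocked = [unblocked_perm() for _ in range(N_PERM)]
sd_blocked = statistics.pstdev(blocked)
sd_unblocked = statistics.pstdev(unblocked)

print("observed difference:", round(observed, 3))
print("SD blocked:", round(sd_blocked, 3), "P:", two_sided_p(blocked))
print("SD unblocked:", round(sd_unblocked, 3), "P:", two_sided_p(unblocked))
```

With a strong patient effect, the blocked distribution is much narrower than the unblocked one, which mirrors the pattern in the quartiles and P-values above.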
  • 28. Two Parametric Approaches
    Analysis                     Estimate   s.e.    t      d.f.   P-value (t)   P-value (permutation)
    Not fitting patient effect   2.172      0.964   2.25   56     0.0282        0.034
    Fitting patient effect       2.172      0.616   3.53   28     0.00147       0.0014
  • 29. What happens if you balance but don’t condition?
    Approach                                                        Variance of estimated treatment     Mean of variance of estimated treatment
                                                                    effect over all randomisations*     effect over all randomisations*
    Completely randomised, analysed as such                         0.987                               0.996
    Randomised within-patient, analysed as such                     0.534                               0.529
    Randomised within-patient, analysed as completely randomised    0.534                               1.005
    *Based on 10000 random permutations
    The last row means: permute values respecting the fact that they come from a cross-over, but analyse them as if they came from a parallel group trial.
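The mismatch in the last row of the table can be reproduced in outline. The sketch below uses synthetic paired data with no treatment effect (all numbers are assumptions), randomises within patient, and compares the actual variance of the estimate over randomisations with the average variance claimed by a two-sample analysis and by a paired analysis.

```python
import random
import statistics

rng = random.Random(29)
N = 29          # patients, each contributing two values
N_PERM = 2000

# Synthetic cross-over data under the null (assumption): shared patient effect.
patient = [rng.gauss(7, 3) for _ in range(N)]
pairs = [(u + rng.gauss(0, 1), u + rng.gauss(0, 1)) for u in patient]

def randomise_within_patient():
    """Assign each patient's two values to arms A and B in random order."""
    a, b = [], []
    for x, y in pairs:
        if rng.random() < 0.5:
            x, y = y, x
        a.append(x)
        b.append(y)
    return a, b

def two_sample_variance(a, b):
    """Variance of the difference in means claimed by a pooled two-sample analysis."""
    pooled = (((len(a) - 1) * statistics.variance(a)
               + (len(b) - 1) * statistics.variance(b))
              / (len(a) + len(b) - 2))
    return pooled * (1 / len(a) + 1 / len(b))

def paired_variance(a, b):
    """Variance of the mean difference claimed by a paired analysis."""
    d = [x - y for x, y in zip(a, b)]
    return statistics.variance(d) / len(d)

estimates, claimed_two_sample, claimed_paired = [], [], []
for _ in range(N_PERM):
    a, b = randomise_within_patient()
    estimates.append(statistics.mean(a) - statistics.mean(b))
    claimed_two_sample.append(two_sample_variance(a, b))
    claimed_paired.append(paired_variance(a, b))

true_var = statistics.pvariance(estimates)
mean_two_sample = statistics.mean(claimed_two_sample)
mean_paired = statistics.mean(claimed_paired)

print("variance of estimate over randomisations:", round(true_var, 4))
print("mean claimed variance, paired analysis:  ", round(mean_paired, 4))
print("mean claimed variance, two-sample:       ", round(mean_two_sample, 4))
```

In this sketch the paired analysis claims a variance close to the one actually realised, while the two-sample analysis claims a much larger one, so the analysis that ignores the balancing is miscalibrated.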
  • 30. In terms of t-statistics
    Approach                                                        Observed variance of t-statistic    Predicted theoretical
                                                                    over all randomisations*            variance
    Completely randomised, analysed as such                         1.027                               1.037
    Randomised within-patient, analysed as such                     1.085                               1.077
    Randomised within-patient, analysed as completely randomised    0.534                               1.037@
    *Based on 10000 random permutations
    @ Using the common falsely assumed theory
  • 31. The Shocking Truth • The validity of conventional analysis of randomised trials does not depend on covariate balance • It is valid because they are not perfectly balanced • If they were balanced the standard analysis would be wrong
  • 32. Myth 4 Randomisation is Not Necessary for Blinding • Fisher, in a letter to Jeffreys, explained the dangers of using a haphazard method thus: “… if I want to test the capacity of the human race for telepathically perceiving a playing card, I might choose the Queen of Diamonds, and get thousands of radio listeners to send in guesses. I should then find that considerably more than one in 52 guessed the card right... Experimentally this sort of thing arises because we are in the habit of making tacit hypotheses, e.g. ‘Good guesses are at random except for a possible telepathic influence.’ But in reality it appears that red cards are always guessed more frequently than black” (Bennett, 1990, pp. 268-269) • “…if the trial was, and remained, double-blind then randomization could play no further role in this respect.” (Worrall, 2007, p. 454)
  • 33. Avoiding Double Guessing • If you don’t randomise you have to assume that your strategy has not been guessed by the investigator • You are using ‘the argument from the stupidity of others’ • Not publishing the block size in your protocol is a classic example
  • 34. Myth 5 Randomisation is Inefficient • There is a sense in which this is no myth • Randomisation is not fully efficient • Theory shows that there is a loss of about one patient per factor fitted compared to a completely balanced design – Such completely balanced designs are not usually possible, however • In any case, the loss is small
  • 35. An Example: Linear Trend in Prognosis • The figures refer to the difference in position between B and A. • Of course, alternation means that the Bs are on average one place beyond the As. • The other schemes are ‘unbiased’. • Since alternation and the double sandwich are deterministic, they have no variance. • It is assumed that there are 2n patients in total and that n is an even number.
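The comparison of schemes can be illustrated for a small trial. The sketch below rests on two assumptions not confirmed by the slide text: that the ‘double sandwich’ means repeated ABBA blocks, and that randomisation means choosing n of the 2n positions for B with all such choices equally likely. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def mean_position_difference(allocation):
    """Mean position of the Bs minus mean position of the As (positions from 1)."""
    pos_a = [i for i, arm in enumerate(allocation, start=1) if arm == "A"]
    pos_b = [i for i, arm in enumerate(allocation, start=1) if arm == "B"]
    return Fraction(sum(pos_b), len(pos_b)) - Fraction(sum(pos_a), len(pos_a))

def alternation(total):
    return "AB" * (total // 2)

def double_sandwich(total):
    # Assumption: 'double sandwich' is taken to mean repeated ABBA blocks.
    return "ABBA" * (total // 4)

def randomisation_moments(total):
    """Exact mean and variance of the difference over all equal-sized allocations."""
    n = total // 2
    diffs = []
    for b_positions in combinations(range(1, total + 1), n):
        chosen = set(b_positions)
        allocation = "".join("B" if i in chosen else "A"
                             for i in range(1, total + 1))
        diffs.append(mean_position_difference(allocation))
    mean = sum(diffs, Fraction(0)) / len(diffs)
    variance = sum((d - mean) ** 2 for d in diffs) / len(diffs)
    return mean, variance

TOTAL = 8  # 2n patients with n = 4, an even number
alt_diff = mean_position_difference(alternation(TOTAL))
sandwich_diff = mean_position_difference(double_sandwich(TOTAL))
rand_mean, rand_var = randomisation_moments(TOTAL)

print("alternation:", alt_diff)
print("double sandwich:", sandwich_diff)
print("randomisation: mean", rand_mean, "variance", rand_var)
```

Under these assumptions alternation is biased by one place, the sandwich scheme is unbiased with no variance, and randomisation is unbiased but has a positive variance, which is the sense in which it is not fully efficient.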
  • 36. Myth 6 Randomisation precludes balancing • Of course we know this is not true • We can build strata and randomise within them • ‘Balance what you can and randomise what you can’t’ was Fisher’s recipe
  • 37. Myth 7 Large trials are more balanced than small ones
    Measure of balance                     Comparison, large v small (on average)
    Mean difference at baseline            Large trial is more balanced
    Total difference at baseline           Small trial is more balanced
    Standardised difference at baseline    Large and small trial equally balanced
    • Large trials have narrower confidence intervals for the treatment effect • The advantage of increased mean balance in covariates has already been consumed in the form of narrower limits • There is no further insurance to be given by size – The only increase in validity comes from being closer to the asymptotic limit that guarantees Normality
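The three rows of the table can be read off from a simple variance calculation. This is a sketch under assumptions not stated on the slide: $n$ patients per arm, a baseline covariate with independent values and common variance $\sigma^2$, and ‘total difference’ taken to mean the difference between arm totals.

```latex
% Assumptions: n patients per arm (A and B), independent covariate values
% with variance \sigma^2, simple randomisation.
\operatorname{Var}\bigl(\bar X_A - \bar X_B\bigr) = \frac{2\sigma^2}{n},
\qquad
\operatorname{Var}\Bigl(\sum_{i=1}^{n} X_{Ai} - \sum_{i=1}^{n} X_{Bi}\Bigr) = 2n\sigma^2,
\qquad
\operatorname{Var}\Biggl(\frac{\bar X_A - \bar X_B}{\sqrt{2\sigma^2/n}}\Biggr) = 1
```

The first quantity shrinks with $n$, the second grows with $n$, and the third does not depend on $n$ at all. Since inference works on the standardised scale, size buys no extra balance there.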
  • 38. My Philosophy of Clinical Trials • Your (reasonable) beliefs dictate the model • You should try to measure what you think is important • You should try to fit what you have measured – Caveat: random regressors and the Gauss-Markov theorem • If you can balance what is important so much the better – But fitting is more important than balancing • Randomisation deals with unmeasured covariates – You can use the distribution in probability of unmeasured covariates – For measured covariates you must use the actual observed distribution • Claiming to do ‘conservative inference’ is just a convenient way of hiding bad practice – Who thinks that analysing a matched pairs t as a two sample t is acceptable?
  • 39. What’s out and What’s in
    Out: • Log-rank test • T-test on change scores • Chi-square tests on 2 x 2 tables • Responder analysis and dichotomies • Balancing as an excuse for not conditioning
    In: • Proportional hazards • Analysis of covariance fitting baseline • Logistic regression fitting covariates • Analysis of original values • Modelling as a guide for designs
  • 40. Unresolved Issue • In principle you should never be worse off by having more information • The ordinary least squares approach has two potential losses in fitting covariates – Loss of orthogonality – Losses of degrees of freedom • This means that eventually we lose by fitting more covariates
  • 41. Resolution? • The Gauss-Markov theorem does not apply to stochastic regressors • In theory we can do better by having random effect models • However there are severe practical difficulties • Possible Bayesian resolution in theory • A pragmatic compromise of a limited number of prognostic factors may be reasonable
  • 42. To sum up • There are a lot of people out there who fail to understand what randomisation can and cannot do for you • We need to tell them firmly and clearly what they need to understand
  • 43. Finally I leave you with this thought • Statisticians are always tossing coins but do not own many