Clinical trial monitoring with
Bayesian hypothesis testing

        John D. Cook
       Valen E. Johnson


        August 6, 2008
Estimation and Testing




      Bayesians typically approach a clinical trial as an estimation
      problem, not a test.
      Possible explanation: poor operating characteristics . . .
      Unless you choose your alternative prior well.
Local prior operating characteristics




       Point null hypothesis versus a local alternative prior, i.e. one
       that assigns positive density to the null value.
       When simulating from the alternative, Bayes factor in favor of
       alternative grows like $e^{n}$.
       When simulating from the null, Bayes factor in favor of null
       grows like $n^{1/2}$.
       Hard to ever accept the null.
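
The two growth rates can be seen in closed form in a simple case. The sketch below assumes a model that is not on the slide: observations from N(θ, 1), null θ = 0, and a local alternative prior θ ∼ N(0, tau2). The function name and the choice tau2 = 1 are illustrative. Instead of simulating, it plugs in the typical sample mean under each hypothesis.

```python
import math

def log_bf10_normal(n, xbar, tau2=1.0):
    """log Bayes factor of H1: theta ~ N(0, tau2) against H0: theta = 0,
    for n observations from N(theta, 1) with sample mean xbar."""
    shrink = n * tau2 / (1.0 + n * tau2)
    return -0.5 * math.log(1.0 + n * tau2) + 0.5 * n * xbar ** 2 * shrink

for n in (10, 100, 1000, 10000):
    under_alt = log_bf10_normal(n, xbar=0.5)     # sample mean at a true theta of 0.5
    under_null = -log_bf10_normal(n, xbar=0.0)   # sample mean exactly at the null
    # under_alt grows roughly linearly in n; under_null grows like 0.5 * log(n)
    print(n, round(under_alt, 1), round(under_null, 2))
```

With the sample mean at the null, the log Bayes factor in favor of the null is exactly 0.5 log(1 + n tau2), which is the $n^{1/2}$ rate on the original scale.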
Inverse moment priors (iMOM)




        [Figure: iMOM prior density, horizontal axis 0.0 to 1.0]




        $\pi_1(\theta) \propto (\theta - \theta_0)^{-\nu-1} \exp\left[-\lambda(\theta - \theta_0)^{-2k}\right] \, [\theta > \theta_0]$
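
Setting the derivative of the log kernel to zero gives the mode θ0 + (2kλ/(ν+1))^(1/(2k)), which is one way to pick λ for a desired mode. The sketch below evaluates the unnormalized kernel and inverts that relation. The function names and the default values ν = 1, k = 1 are assumptions for illustration, not values from the slides.

```python
import math

def imom_kernel(theta, theta0, lam, nu=1.0, k=1.0):
    """Unnormalized one-sided iMOM density; zero at and below theta0."""
    u = theta - theta0
    if u <= 0.0:
        return 0.0
    log_inv = -2.0 * k * math.log(u)     # log of u^(-2k)
    if log_inv > 700.0:                  # exp would overflow; the kernel is effectively 0 here
        return 0.0
    return math.exp(-(nu + 1.0) * math.log(u) - lam * math.exp(log_inv))

def imom_mode(theta0, lam, nu=1.0, k=1.0):
    """Location of the kernel's maximum."""
    return theta0 + (2.0 * k * lam / (nu + 1.0)) ** (1.0 / (2.0 * k))

def lam_for_mode(theta0, mode, nu=1.0, k=1.0):
    """Value of lambda that places the mode at the requested point."""
    return (nu + 1.0) * (mode - theta0) ** (2.0 * k) / (2.0 * k)

lam = lam_for_mode(0.2, 0.3)             # null at 0.2, mode at 0.3
print(lam, imom_mode(0.2, lam))
```

The kernel vanishes at θ0 together with all of its derivatives, which is what separates it from a local prior.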
iMOM Convergence rates


  When simulating from alternative

      $$\operatorname*{plim}_{n\to\infty} \; n^{-1} \log \mathrm{BF}_n(1 \mid 0) = c > 0.$$

  (Well known result.)



  When simulating from null,

      $$\operatorname*{plim}_{n\to\infty} \; n^{-k/(k+1)} \log \mathrm{BF}_n(1 \mid 0) = c < 0.$$

  (New result.)
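
To make the null-case rate concrete, write BF_n(0|1) = 1/BF_n(1|0) for the Bayes factor in favor of the null. The comparison below is a worked restatement of the rates already given, up to constants, with the local-prior rate taken from the earlier slide.

```latex
\begin{align*}
\text{local prior:}\quad & \log \mathrm{BF}_n(0 \mid 1) \approx \tfrac{1}{2}\log n \\
\text{iMOM prior:}\quad  & \log \mathrm{BF}_n(0 \mid 1) \approx |c|\, n^{k/(k+1)} \\
k = 1:\quad & n^{k/(k+1)} = n^{1/2}, \qquad
k = 2:\quad n^{k/(k+1)} = n^{2/3}, \qquad
k \to \infty:\quad n^{k/(k+1)} \to n
\end{align*}
```

So evidence for a true null accumulates at a rate that is exponential in a power of n instead of polynomial in n, and larger k moves that rate toward the rate under the alternative.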
Thall-Simon method



      Historical standard: θS ∼ Beta(aS , bS ). Parameters aS and
      bS large.
      Experimental treatment: θE ∼ Beta(aE , bE ) a priori, aE and
      bE small.
      Stop for inferiority if P(θE < δ + θS | data) is large.
      Stop for superiority if P(θE > θS | data) is large.
      Operating characteristics degrade without δ > 0.
      Inconsistent in limit: both stopping rules could apply.
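
The two posterior probabilities in the stopping rules have no simple closed form, but they are easy to estimate by simulation. The following is a minimal Monte Carlo sketch using the conjugate Beta update for θE. The function name, the number of draws, and the seed are illustrative choices. The prior parameters in the example call are the ones from the comparison later in the deck, and the data are made up.

```python
import random

def thall_simon_probs(successes, failures, a_s, b_s, a_e, b_e, delta,
                      draws=20000, seed=0):
    """Monte Carlo estimates of P(thetaE < delta + thetaS | data) and
    P(thetaE > thetaS | data) under independent Beta distributions."""
    rng = random.Random(seed)
    inferior = superior = 0
    for _ in range(draws):
        theta_s = rng.betavariate(a_s, b_s)                         # historical standard
        theta_e = rng.betavariate(a_e + successes, b_e + failures)  # conjugate posterior
        inferior += theta_e < theta_s + delta
        superior += theta_e > theta_s
    return inferior / draws, superior / draws

# Hypothetical large data set with an observed response rate of 0.25,
# which lies between thetaS (about 0.2) and thetaS + delta (about 0.3).
p_inf, p_sup = thall_simon_probs(2500, 7500, 200, 800, 0.6, 1.4, delta=0.1)
print(p_inf, p_sup)
```

In that example both estimates come out close to 1, which illustrates the last bullet: when the true rate sits between θS and θS + δ, enough data makes both stopping criteria hold at once.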
Thall-Simon plot




   [Figure: Beta(60, 140) historical and Beta(12, 18) experimental distributions, horizontal axis 0.0 to 1.0]
Comparing Bayes factor with Thall-Simon



   Historical response 20%, alternative 30%. Fifty patients maximum.



   Bayes factor design:
       H0 : θ = 0.2
       H1 : iMOM prior with mode 0.3.
       Stop for inferiority if P(H0 | data) > 0.9.
       Stop for superiority if P(H1 | data) > 0.9.
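
A rough sketch of how this monitoring rule could be computed for binary outcomes follows. It assumes equal prior odds for H0 and H1, an iMOM prior truncated to (0.2, 1) with ν = 1 and k = 1, and simple midpoint integration. None of those choices are stated on the slide, and the function names are hypothetical.

```python
import math

THETA0 = 0.2                       # null response rate from the slide
NU, K = 1.0, 1.0                   # assumed iMOM shape parameters
LAM = (NU + 1.0) * (0.3 - THETA0) ** (2.0 * K) / (2.0 * K)   # puts the prior mode at 0.3

def imom_kernel(theta):
    """Unnormalized iMOM density on (THETA0, 1)."""
    u = theta - THETA0
    log_inv = -2.0 * K * math.log(u)
    if log_inv > 700.0:            # kernel is effectively 0 this close to THETA0
        return 0.0
    return math.exp(-(NU + 1.0) * math.log(u) - LAM * math.exp(log_inv))

def posterior_prob_null(x, n, grid=4000):
    """P(H0 | x responses among n patients), assuming prior odds of 1."""
    h = (1.0 - THETA0) / grid
    num = den = 0.0
    for i in range(grid):
        theta = THETA0 + (i + 0.5) * h                  # midpoint rule
        w = imom_kernel(theta)
        log_lr = (x * math.log(theta / THETA0)
                  + (n - x) * math.log((1.0 - theta) / (1.0 - THETA0)))
        num += w * math.exp(log_lr)
        den += w
    bf10 = num / den                                    # Bayes factor, H1 over H0
    return 1.0 / (1.0 + bf10)

def monitor(outcomes, cutoff=0.9, max_patients=50):
    """Apply the stopping rule after each patient (1 = response, 0 = none)."""
    x = 0
    for n, y in enumerate(outcomes[:max_patients], start=1):
        x += y
        p0 = posterior_prob_null(x, n)
        if p0 > cutoff:
            return "inferiority", n
        if 1.0 - p0 > cutoff:
            return "superiority", n
    return "no early stop", min(len(outcomes), max_patients)

print(monitor([0] * 50))
print(monitor([1] * 50))
```

Because the prior normalizing constant cancels in the ratio, only the unnormalized kernel is needed.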
Comparing Bayes factor with Thall-Simon, cont.



   Thall-Simon design:
       θS ∼ Beta(200,800)
       θE ∼ Beta(0.6, 1.4) a priori
       Stop for inferiority if P(θS > 0.1 + θE | data) > 0.976.
       Stop for superiority if P(θE > θS | data) > 0.99.
   Calibrated to match probability of stopping for wrong reason at
   null and alternative.
Stopping for inferiority

   [Figure: probability of concluding inferiority (vertical axis, 0.0 to 1.0) against true response probability (horizontal axis, 0.0 to 1.0) for the Thall-Simon and Bayes factor designs]
Stopping for superiority

   [Figure: probability of concluding superiority (vertical axis, 0.0 to 1.0) against true response probability (horizontal axis, 0.0 to 1.0) for the Thall-Simon and Bayes factor designs]
Thall-Wooten time-to-event method




      Analogous to Thall-Simon method for binary outcomes.
      t | θ ∼ exponential with mean θ, θ ∼ inverse gamma
      Stop for inferiority if P(θS + 0.1 > θE | data) large . . .
      Stop for superiority if P(θE > θS | data) large
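
With exponential event times and an inverse gamma prior the update is conjugate: after d observed events and total follow-up time T, an Inverse Gamma(a, b) prior becomes Inverse Gamma(a + d, b + T). Below is a minimal Monte Carlo sketch of the superiority probability. The function names are hypothetical. The standard-arm parameters in the example are illustrative values chosen to give a prior mean near 6 months, and the data are made up.

```python
import random

def inverse_gamma_draw(rng, shape, scale):
    # If G ~ Gamma(shape, 1) then scale / G ~ Inverse Gamma(shape, scale).
    return scale / rng.gammavariate(shape, 1.0)

def prob_experimental_longer(events, total_time, a_e, b_e, a_s, b_s,
                             margin=0.0, draws=20000, seed=0):
    """Monte Carlo estimate of P(thetaE > thetaS + margin | data)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        theta_e = inverse_gamma_draw(rng, a_e + events, b_e + total_time)  # posterior
        theta_s = inverse_gamma_draw(rng, a_s, b_s)                        # historical
        hits += theta_e > theta_s + margin
    return hits / draws

# Hypothetical interim data: 10 events over 200 patient-months of follow-up.
p_sup = prob_experimental_longer(10, 200.0, a_e=3, b_e=12, a_s=200, b_s=1200)
print(p_sup)
```

The inferiority probability P(θS + δ > θE | data) is one minus the value returned with `margin` set to δ.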
Comparing Bayes factor and Thall-Wooten method


   Standard treatment 6 months PFS, alternative 8 months,
   maximum 50 patients



   Bayes factor design:
       H0 : θ = 6
       H1 : iMOM prior with mode 8.
       Stop for inferiority if P(H0 | data) > 0.9.
       Stop for superiority if P(H1 | data) > 0.9.
Comparing Bayes factor and Thall-Wooten method, cont.



   Thall-Wooten design:
       θS ∼ Inverse Gamma (20,1200)
       θE ∼ Inverse Gamma(3, 12) a priori
       Stop for inferiority if P(θS + 2 > θE | data) > 0.976.
       Stop for superiority if P(θE > θS | data) > 0.93.
   Calibrated to match probability of stopping for wrong reason at
   null and alternative.
Stopping for inferiority

   [Figure: probability of early stopping for inferiority (vertical axis, 0.0 to 1.0) against true mean survival time (horizontal axis, 2 to 12) for the Thall-Wooten and Bayes factor designs]
Stopping for superiority

   [Figure: probability of early stopping for superiority (vertical axis, 0.0 to 1.0) against true mean survival time (horizontal axis, 2 to 12) for the Thall-Wooten and Bayes factor designs]
Comparison with Simon two-stage design



   Simon two-stage design to test null response rate 0.20 versus
   alternative rate 0.40.



   Reject the treatment 95% of the time under the null, 20% under the alternative.



   Maximum of 43 patients: 13 in first stage, 30 in second stage.
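
The operating characteristics of a two-stage design follow directly from binomial probabilities. The sketch below computes the probability of rejecting the treatment and the expected number of patients. The stage sizes 13 and 30 come from the slide. The rejection boundaries r1 = 3 and r = 12 are not stated there and are an assumption, so check them against a published design table before relying on them.

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)

def simon_two_stage(p, r1=3, n1=13, r=12, n=43):
    """Return (probability of rejecting the treatment, expected sample size)
    when the true response probability is p.

    Stage 1: stop and reject if responses among the first n1 patients <= r1.
    Stage 2: otherwise enroll n - n1 more and reject if total responses <= r.
    """
    n2 = n - n1
    early = sum(binom_pmf(x, n1, p) for x in range(r1 + 1))   # early termination
    late = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        max_x2 = min(r - x1, n2)          # most second-stage responses that still reject
        if max_x2 >= 0:
            late += binom_pmf(x1, n1, p) * sum(
                binom_pmf(x2, n2, p) for x2 in range(max_x2 + 1))
    return early + late, n1 + n2 * (1.0 - early)

for p in (0.2, 0.4):
    reject, patients = simon_two_stage(p)
    print(p, round(reject, 3), round(patients, 1))
```

Evaluating the same function over a grid of p values gives curves of rejection probability and patients used against the true response probability.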
Comparison with Simon two-stage design: rejection probability

   [Figure: probability of rejecting treatment (vertical axis, 0.0 to 1.0) against true response probability (horizontal axis, 0.0 to 1.0) for the Simon and Bayes factor designs]
Comparison with Simon two-stage design: patients used

   [Figure: number of patients (vertical axis, 0 to 40) against true response probability (horizontal axis, 0.0 to 1.0) for the Simon, Bayes factor, and Naive Simon designs]
References




      Valen E. Johnson, John D. Cook. Bayesian Design of
      Single-Arm Phase II Clinical Trials with Continuous
      Monitoring. Clinical Trials 2009; 6(3):217-26.
      Software: http://biostatistics.mdanderson.org
      http://www.JohnDCook.com