Statistical Considerations for Clinical Trials During COVID-19: Confirmatory Adaptive Platform Trial (CAPT) Design for COVID-19 Treatment (Part II)
Authors: Qing Liu, Ph.D., Quantitative and Regulatory Medical Science, LLC and Karl Peace, Ph.D., Jiann-Ping Hsu College of Public Health, Georgia Southern University
Foundation of Adaptive Designs
Proschan and Hunsberger (1995) introduced a two-stage adaptive design for sample size adjustment using conditional error functions. Research on the application of two-stage adaptive designs for regulatory use started at the FDA in 1997, when the first author of this article was the statistical reviewer of an investigational new drug (IND) submission by a pharmaceutical company. This led to the first regulatory agency article, Chi and Liu (1999), with a formal introduction of the concept of a conditional rejection rule. It was followed by a technical paper, Liu and Chi (2001), which provided a framework of design operating properties involving a minimum effect size and, consequently, developed a highly efficient sample size adjustment procedure with only a 21% mark-up of the maximum sample size compared to that of the fixed sample size design. Liu and Chi (2001) also developed methods for statistical inference (i.e., p-values, point estimates and confidence intervals). A detailed account of the early regulatory history of modern adaptive designs is given in Liu and Chi (2010).
Müller and Schäfer (2001) showed how to calculate conditional error probabilities as functions of the interim outcomes for group sequential designs. Earlier, Posch and Bauer (1999) and Wassmer (1998) showed a one-to-one correspondence between two-stage combination tests and conditional rejection rules.
To place adaptive designs on a solid probabilistic foundation, Liu, Proschan and Pledger (2002) provided a unified framework of two-stage adaptive designs with a general measure-theoretic formulation of conditional error functions and adaptation rules. A conditional error function is assumed to be measurable with respect to a sigma-field, and an adaptation rule is assumed to be measurable with respect to the same sigma-field and to have a countable range. Within this broad framework, the theory for hypothesis testing, point estimation and confidence intervals was developed; the theory was further strengthened by Brannath, Gutjahr and Bauer (2012). Thus, under the rigorous measure-theoretic formulation, unplanned design modifications are justified.
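The correspondence between combination tests and conditional rejection rules can be made concrete with a short sketch (our own illustration, not code from the cited papers): for a two-stage inverse-normal combination test with prespecified weights, the equivalent conditional error function has a closed form, and the two formulations reject for exactly the same interim outcomes.

```python
# Illustration (ours): equivalence of a two-stage inverse-normal
# combination test and its conditional error function, with prespecified
# weights w1, w2 satisfying w1^2 + w2^2 = 1.
import numpy as np
from scipy.stats import norm

alpha = 0.025
w1 = w2 = np.sqrt(0.5)
z_crit = norm.ppf(1 - alpha)

def combination_reject(z1, z2):
    """Reject H0 when the weighted combination of stage statistics exceeds z_crit."""
    return w1 * z1 + w2 * z2 > z_crit

def conditional_error(z1):
    """A(z1): the level at which the stage-2 p-value may be tested after
    observing the stage-1 statistic z1; E[A(Z1)] = alpha under H0."""
    return 1 - norm.cdf((z_crit - w1 * z1) / w2)

# The combination test rejects exactly when the stage-2 p-value falls
# below the conditional error level A(z1).
z1, z2 = 1.2, 1.9
p2 = 1 - norm.cdf(z2)
same_decision = combination_reject(z1, z2) == (p2 < conditional_error(z1))
```

Averaging A(Z1) over a standard normal Z1 recovers the overall level alpha, which is what justifies arbitrary stage-2 modifications.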
It was once thought necessary to specify a two-stage adaptive design for which stage 1 is clearly defined. Liu and Chi (2010) showed that there is no need to even specify a two-stage adaptive design; rather, the need for one, or the specifics of its design features, can be determined based on a blinded review of accumulating data. Their work was based on an actual clinical trial for a European regulatory submission that started as a traditional fixed sample size design and was later turned into an adaptive design. Following agreement by the Committee for Proprietary Medicinal Products (CPMP) on the unplanned design modification, the trial continued according to the modification, which led to unequivocal, interpretable results with successful approvals by various regulatory agencies.
Interestingly, blinded data review for making trial modifications is strongly supported by FDA guidance documents. The 2006 FDA Guidance on Establishment and Operation of Clinical Trial Data Monitoring Committees states that
“When a DMC is the only group reviewing unblinded interim data, trial organizers faced with compelling new information external to the trial may consider making changes in the ongoing trial without raising concerns that such changes might have been at least partly motivated by knowledge of the interim data and thereby endanger trial integrity. Sometimes accumulating data from within the trial (e.g., overall event rates) may suggest the need for modifications.”
The 2019 FDA guidance on adaptive designs states that
“Accumulating outcome data can provide a useful basis for trial adaptations. The analysis of outcome data without using treatment assignment is sometimes called pooled analysis.”
And
“In general, adequately prespecified adaptations based on non-comparative data have no effect or a limited effect on the Type I error probability. This makes them an attractive choice in many settings, particularly when uncertainty about event probabilities or endpoint variability is high.”
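A standard instance of an adaptation based on non-comparative data is blinded sample size re-estimation. The sketch below (our illustration with hypothetical numbers, not from any cited trial) re-estimates the common standard deviation from the pooled sample without using treatment assignment and recomputes the per-group sample size for the originally planned effect size.

```python
# Sketch (hypothetical numbers): blinded sample size re-estimation, an
# adaptation based on non-comparative (pooled) data.  The common standard
# deviation is re-estimated from the pooled sample without using treatment
# assignment, and the per-group sample size is recomputed for the
# originally planned effect size.
import math
from scipy.stats import norm

def per_group_n(delta, sigma, alpha=0.025, power=0.80):
    """Per-group n for a one-sided two-sample z-test of mean difference delta."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return math.ceil(2 * (sigma * z / delta) ** 2)

planned = per_group_n(delta=2.0, sigma=7.5)   # planning assumption
# Interim: the blinded pooled SD comes in higher than assumed.  (The pooled
# SD slightly overestimates the within-group SD, by delta**2 / 4 on the
# variance scale, and a bias adjustment is often applied in practice.)
revised = per_group_n(delta=2.0, sigma=9.0)
```

Because treatment labels are never used, this kind of adjustment has no effect or only a limited effect on the Type I error probability, as the guidance notes.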
Statistical Methods for CAPT Design
CAPT Design Schema
The following is a simple schema of a trial following the CAPT design, in which each sequential randomization involves only two treatment groups. An actual trial can start with two treatment arms (e.g., ACTT) or multiple treatment arms (e.g., CATCO) for the first randomization. During the trial, the second randomization can be adaptively specified upon the emergence of potential new treatments and accumulating blinded or unblinded data from the first randomization. As these trials can last until 2022, it is possible that a third or fourth randomization will be added.
Figure 1: CAPT Design with Two Treatment Groups for Each Randomization
Adaptive Group Sequential Design
An adaptive group sequential design will be used for each randomization. There are two ways to set one up. The traditional approach is to specify a group sequential design before starting the randomization, with group sequential boundaries that allow stopping the trial early for efficacy, futility, or safety concerns. Based on results of the interim analysis, various unplanned modifications, including sample size adjustment, changing endpoints, or adding subgroup analyses, can be specified via the conditional rejection test, for which the conditional error function is calculated by the method of Müller and Schäfer (2001).
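The Müller and Schäfer (2001) calculation is easiest to see in the simplest setting. The sketch below is our own simplification to a one-sided z-test with known variance, not a full group sequential design: it computes the conditional error of the original test at an unplanned interim look and checks by simulation that testing the redesigned second stage at that level preserves the Type I error.

```python
# Sketch (our simplification of the conditional rejection probability
# principle): a one-sided z-test with known variance.  After an unplanned
# interim look at n1 of n planned observations, the trial may be redesigned
# arbitrarily, as long as the new stage-2 data are tested at the conditional
# error level of the original design.
import numpy as np
from scipy.stats import norm

alpha, n, n1 = 0.025, 100, 40
z_crit = norm.ppf(1 - alpha)

def conditional_error(z1, n1, n):
    """P(original test rejects at n | interim statistic z1 at n1) under H0."""
    return 1 - norm.cdf((z_crit * np.sqrt(n) - z1 * np.sqrt(n1)) / np.sqrt(n - n1))

# Simulate under H0: the stage-2 sample size is modified based on the
# interim result, and the stage-2 data alone are tested at level A(z1).
rng = np.random.default_rng(1)
sims = 20_000
z1 = rng.standard_normal((sims, n1)).sum(axis=1) / np.sqrt(n1)
n2 = np.where(z1 > 0.5, 120, 30)                 # arbitrary data-driven rule
z2 = np.array([rng.standard_normal(k).sum() / np.sqrt(k) for k in n2])
reject = (1 - norm.cdf(z2)) < conditional_error(z1, n1, n)
rate = reject.mean()                             # stays close to alpha
```

The data-driven rule for n2 is deliberately arbitrary: the conditional error level depends only on the interim statistic, so any stage-2 redesign measurable in the interim data keeps the overall level at alpha.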
Following Liu and Chi (2010), a better approach would be to start the randomization following a fixed sample size design, and then determine, based upon a blinded review, the necessary details or changes, which include sample size adjustment and the specific features of a group sequential design. If a group sequential design is added, unplanned design modifications can still be made based on unblinded interim results following Müller and Schäfer (2001).
The advantage of the latter approach is that some unplanned modifications are shifted to blinded data review, where no statistical penalty is paid, and the unplanned modifications made at an unblinded interim analysis are simplified. This leads to a more efficient design with simplified statistical analysis. Of course, for randomizations pre-planned with group sequential designs, blinded data review can still be used to override the group sequential design prior to the first interim analysis.
Sample Size Adjustment
The most controversial aspect of adaptive designs is sample size adjustment (see Liu and Chi, 2010). Notably, the adaptive design proposed by Cui, Hung and Wang (1999) has received a plethora of criticisms for its inefficiency. The criticism culminated with Tsiatis and Mehta (2003), who showed that for every two-stage sample size adaptive design it is possible to construct a group sequential design with higher power. Cui, Hung and Wang (1999), like many authors of adaptive sample size procedures, use the observed effect size, an approach deeply rooted in the flawed calculation of conditional power at the current trend. To avoid these difficulties, Chi and Liu (1999) and Liu and Chi (2001) introduced the concept of using a minimum effect size and proposed a set of operating properties for sample size adjustment. As a result, they developed a highly efficient sample size adjustment procedure with only a 21% mark-up of the maximum sample size compared to that of the fixed sample size design. Proschan, Liu and Hunsberger (2003) showed that the variance of the observed effect size is often at least 30 times larger than that of an effect size estimate involving only the nuisance parameters.
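The contrast between the two inputs can be illustrated with conditional power for a one-sided z-test (a sketch under our own simplifying assumptions; the interim numbers and the minimum effect size are hypothetical). The observed trend z1/sqrt(n1) inherits all the noise of the interim estimate, which is the instability criticized above; a prespecified minimum effect size does not.

```python
# Sketch (ours, hypothetical interim numbers): conditional power of a
# one-sided z-test, evaluated at the observed trend versus a prespecified
# minimum per-observation effect size.
import numpy as np
from scipy.stats import norm

alpha = 0.025
z_crit = norm.ppf(1 - alpha)

def conditional_power(z1, n1, n, theta):
    """P(Z_n > z_crit | Z_{n1} = z1) when the per-observation drift is theta."""
    num = z_crit * np.sqrt(n) - z1 * np.sqrt(n1) - theta * (n - n1)
    return 1 - norm.cdf(num / np.sqrt(n - n1))

n1, n, z1 = 50, 100, 1.0
cp_trend = conditional_power(z1, n1, n, theta=z1 / np.sqrt(n1))  # observed trend
cp_min = conditional_power(z1, n1, n, theta=0.20)                # minimum effect
```

Plugging the noisy trend into this formula, and then into a sample size rule, propagates the interim noise directly into the final sample size, which is why trend-based rules tend to be inefficient.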
To optimize sample size calculation, Liu et al. (2012) developed an efficient two-stage sample size adaptive design. The design is generated from a pseudo group sequential design, which provides the first-stage sample size, the significance and nonbinding futility boundary values for the interim analysis, the minimum sample size for the second stage and, last but not least, the sample size adaptation rule. By construction, the two-stage sample size adaptive design inherits many operating properties of the pseudo group sequential design and is therefore very efficient. To further improve efficiency, they proposed a likelihood ratio test, as well as inference methods for sequential p-values, confidence intervals, and median and mean unbiased estimates.
This process of generating a two-stage sample size adaptive design is called the forward process. Liu et al. (2012) also created a reversal process for generating a derived pseudo group sequential design from a two-stage adaptive design. The forward process is then applied to the derived pseudo group sequential design to obtain a new generation two-stage adaptive design. Combining the forward and reversal processes may lead to optimization of a conforming two-stage adaptive design. Through numerical calculations with a motivating example, they show that a two-stage adaptive design generated by the forward process can have smaller expected information under an alternative hypothesis than a pseudo group sequential design, and that a new generation two-stage adaptive design can reduce the expected information levels over the previous generation under both the null hypothesis and an alternative hypothesis. Through an example, Liu et al. (2012) show that when there is a substantial over-run in patient enrollment, the adaptive design can be more efficient than a group sequential design. The final two-stage adaptive design is very efficient, with a low ratio f = 1.193 of the maximum information to the naïve information of the traditional design, which is below the lower end of the range of 1.2 recommended by Liu and Chi (2001).
For numerical illustration, we compare the efficient two-stage sample size adaptive design, or efficient adaptive design (EAD) herein, with the promising zone design proposed by Mehta and Pocock (2011), using their schizophrenia trial example. A mean score difference of 2 and a standard deviation of 7.5 are used to calculate the sample size of 442 with 80% power at the 1-sided significance level of 0.025 (Plan 1). The “promising zone” design (Plan 4) attempts to protect power for the clinically meaningful difference of 1.6. The interim sample size is 208, the minimum sample size is 442 and the maximum sample size is 884. Mehta and Pocock (2011) report the operating characteristics of Plans 1 and 4, which are given in Table 1. To match the operating characteristics of Plan 4, we also consider a simple design with a fixed sample size of 492 patients (Plan 5). Plan 5 is clearly superior to Plan 4 in terms of power and expected sample size. Note that at the effect size 2, the sample size of 492 under Plan 5 is just 1 patient larger than the expected sample size of 491, yet the power of 84.1% under Plan 5 also exceeds the power of 83% under Plan 4. As the maximum sample size under Plan 4 is 884, the ratio to Plan 5 is 884/492 = 1.7978. Thus, the “mark-up” rate of the maximum sample size is a staggering 79.8%, while the overall power is only 65%.
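The planning numbers in this comparison can be checked with the standard two-sample z-approximation (a sketch; the effect size, standard deviation, alpha and power are taken from the text above):

```python
# Check of the example's planning numbers via the two-sample z-approximation.
import math
from scipy.stats import norm

delta, sigma, alpha, target_power = 2.0, 7.5, 0.025, 0.80
z = norm.ppf(1 - alpha) + norm.ppf(target_power)
n_total = 2 * math.ceil(2 * (sigma * z / delta) ** 2)   # two groups of 221
# n_total reproduces the 442 of Plan 1.

# Power of the fixed design with 492 patients (246 per group, Plan 5).
n_pg = 492 // 2
power_492 = norm.cdf(delta / (sigma * math.sqrt(2 / n_pg)) - norm.ppf(1 - alpha))
# power_492 reproduces the 84.1% quoted for Plan 5.
```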
For the schizophrenia trial example considered in Mehta and Pocock (2011), the key operating characteristics of an efficient adaptive design (Plan 6) following Liu et al. (2012) are given in Table 2, along with those of the fixed sample size design (Plan 2) and the group sequential design (Plan 3). The efficient adaptive design (Plan 6) uses a maximum sample size of 826, which is smaller than the 884 of the “promising zone” design (Plan 4). Both the interim sample size of 208 and the minimum sample size of 442 are identical to those of Plan 4. The power of the efficient adaptive design matches those of the fixed sample size design (Plan 2) and the group sequential design (Plan 3), and the efficient adaptive design outperforms both in terms of expected sample size. The gain in the reduction of the expected sample size comes at the cost of increasing the maximum sample size from 690 for the fixed sample size design to 826, a mark-up of 19.7%. Table 2 also provides the probability of adapting to the maximum sample size, which varies from 58% to 47% depending on the effect size.
Following the numerical illustration, an efficient adaptive design may be useful when there is a substantial enrollment over-run, as demonstrated by the first ACTT randomization.
Continue to Part III
Part III link: https://lnkd.in/eSUStZh
Return to Part I
Part I link: https://lnkd.in/ejYYTZN
References
1. Proschan, M. A. and Hunsberger, S. A. (1995). Designed extension of studies based on conditional power. Biometrics 51, 1315–1324.
2. Chi, G. Y. H. and Liu, Q. (1999). The attractiveness of the concept of a prospectively designed two-stage clinical trial. Journal of Biopharmaceutical Statistics 9, 537–547.
3. Liu, Q. and Chi, G. Y. H. (2001). On sample size and inference for two-stage adaptive designs. Biometrics 57, 172–177.
4. Liu, Q. and Chi, G. Y. H. (2010). Understanding the FDA guidance on adaptive designs: Historical, legal, and statistical perspectives. Journal of Biopharmaceutical Statistics. Special issue on adaptive designs, 20, 1178-1219.
5. FDA (2006). Guidance on establishment and operation of clinical trial data monitoring committees. https://www.fda.gov/media/75398/download
6. FDA (2019). Adaptive design clinical trials for drugs and biologics: Guidance for industry. https://www.fda.gov/media/78495/download
7. Müller, H. and Schäfer, H. (2001). Adaptive group sequential designs for clinical trials: Combining the advantages of adaptive and of classical group sequential approaches. Biometrics 57, 886–891.
8. Posch, M. and Bauer P. (1999). Adaptive two stage designs and the conditional error function. Biometrical Journal 41, 689–696.
9. Wassmer, G. (1998). A comparison of two methods for adaptive interim analyses in clinical trials. Biometrics 54, 696–705.
10. Liu, Q., Proschan, M. A., and Pledger, G. W. (2002). A unified theory of two-stage adaptive designs. Journal of the American Statistical Association 97, 1034–1041.
11. Brannath, W., Gutjahr, G. and Bauer P. (2012). Probabilistic foundation of confirmatory adaptive designs. Journal of the American Statistical Association 107, 824–832.
12. Cui, L., Hung, H. M. J., and Wang, S. J. (1999). Modification of sample size in group sequential clinical trials. Biometrics 55, 853–857.
13. Tsiatis, A. A., and Mehta, C. (2003). On the inefficiency of the adaptive design for monitoring clinical trials. Biometrika 90, 367–378.
14. Proschan, M. A., Liu, Q., and Hunsberger, S. A. (2003). Practical midcourse sample size modification in clinical trials. Controlled Clinical Trials 24, 4–15.
15. Liu, Q., Li, G., Anderson, K. M., and Lim, P. (2012). On efficient two-stage adaptive designs for clinical trials with sample size adjustment. Journal of Biopharmaceutical Statistics. Special issue on adaptive designs, 22, 617–640.
16. Mehta, C. R. and Pocock, S. J. (2011). Adaptive increase in sample size when interim results are promising: A practical guide with examples. Statistics in Medicine 30, 3267-3284.
Copyright 2020 Media | QRMedSci, LLC.