A review and re-interpretation of a group-sequential approach to sample size re-estimation in two-stage trials.
Bottom Line: In this paper, we review the adaptive design methodology of Li et al. (Biostatistics 3:277-287) for two-stage trials with mid-trial sample size adjustment. We argue that it is closer in principle to a group sequential design, in spite of its obvious adaptive element. Several extensions are proposed that aim to make it an even more attractive and transparent alternative to a standard (fixed sample size) trial for funding bodies to consider.
Affiliation: MRC Biostatistics Unit, Cambridge, UK.
Mentions: Figure 3 highlights the operating characteristics of the original fixed design proposal (n = 129, α = 0.025, β = 0.2) and of adaptive designs 1 and 2 as a function of δ. They are calculated from the list of expressions given in Table S1 in the appendix (available online as Supporting Information). Figure 3 (top left) shows, for adaptive designs 1 and 2, how the probability of stopping for efficacy or futility changes as δ increases from 0 to 1. The two probabilities are equal when δ equals the mid-point of and . Under design 1, the probability that the total sample size exceeds the maximum of 140 per arm peaks at around 26% when δ equals the mid-point of and . The same value of δ maximises the probability that n2 = nmax under design 2. Figure 3 (top right) shows that the expected sample size of designs 1 and 2 is always less than that of the fixed design. The maximum expected sample size of design 2 is over 20 patients less than that of design 1. Figure 3 (bottom left) shows the overall unconditional power, P(Reject H0), of all three designs. Formula (12) in Table S1 gives this quantity, as well as a more standard formula for the power of the fixed design. The fixed design's overall power is greater than that of the adaptive designs for all reasonable values of δ. At the originally hypothesised value , the overall power is 80% by definition, whereas adaptive designs 1 and 2 achieve an overall power of only ≈ 71% and 69%, respectively. This shortcoming of the adaptive designs is returned to in Section 4. Figure 3 (bottom right) shows the ratio of each design's overall power to its expected sample size (which is of course constant for the fixed design). Comparisons of power between designs with different expected sample sizes can be misleading, so the power per unit of expected sample size provides a new and potentially useful standardised measure. 
Indeed, it has been recently employed by the second author to compare the relative merits of competing development strategies for phase II trials [14]. Despite being the least powerful of the three designs, design 2 is superior for all values of δ according to this measure. The measure also highlights how unnecessarily large the fixed design is when δ exceeds 0.5.
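The standardised measure is simple to compute once a design's overall power and expected sample size are available as functions of δ. A minimal sketch for the fixed design only (where the expected sample size is constant at 2n per arm), assuming normally distributed outcomes with known unit variance and a one-sided test at level α; the function names and the illustrative effect sizes are this sketch's own, not the paper's:

```python
from statistics import NormalDist  # standard normal CDF and quantiles

Z = NormalDist()

def fixed_power(delta, n_per_arm, alpha=0.025, sigma=1.0):
    """One-sided power of a fixed two-arm design: P(Reject H0) at true effect delta.

    Test statistic ~ N(delta * sqrt(n/2) / sigma, 1) under the alternative.
    """
    z_alpha = Z.inv_cdf(1 - alpha)
    return Z.cdf(delta * (n_per_arm / 2) ** 0.5 / sigma - z_alpha)

def power_per_patient(delta, n_per_arm, alpha=0.025):
    """Standardised measure: overall power divided by (expected) total sample size.

    For a fixed design the denominator is just the constant 2 * n_per_arm;
    for an adaptive design it would be replaced by E[N](delta).
    """
    return fixed_power(delta, n_per_arm, alpha) / (2 * n_per_arm)

if __name__ == "__main__":
    # Illustrative values only; n = 129 per arm matches the fixed design above.
    for delta in (0.2, 0.35, 0.5):
        print(f"delta={delta:.2f}  power={fixed_power(delta, 129):.3f}  "
              f"power/E[N]={power_per_patient(delta, 129):.5f}")
```

For an adaptive design the same ratio is formed by dividing the overall unconditional power (e.g. formula (12) in Table S1) by the design's expected sample size at each δ, which is how the curves in Figure 3 (bottom right) compare designs on a common per-patient footing.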