Designing Information Monitored Trials for Binary Outcomes
Source:vignettes/design_binary.Rmd
Planning The Study
Planning an information-monitored study is similar in many respects to planning a study with a fixed sample size. Investigators must decide on the target of statistical inference, also known as an estimand: for a binary outcome, the risk difference is usually of interest, although investigators may also be interested in a risk ratio (relative risk), or odds ratio. Once the estimand is chosen, decisions must be made about what constitutes a meaningful effect size on the scale of the estimand (e.g. a 10% absolute reduction in risk or a relative risk of 0.75). Finally, the characteristics of the testing procedure must be specified, including the desired Type I Error Rate, statistical power, and direction of alternatives of interest:
# Universal Study Design Parameters
minimum_difference <- 0.15 # Effect Size: An absolute risk difference of 0.15
alpha <- 0.05 # Type I Error Rate
power <- 0.9 # Statistical Power
test_sides <- 2 # Direction of Alternatives
The amount of data that must be collected depends on the information in the accruing data, which in turn depends on the patterns of association among variables, the variability of the outcomes of interest, and the degree of missingness. Such information is not always available when studies are being planned in practice.
Fixed sample size designs require investigators to make assumptions about the factors affecting precision: when those assumptions are incorrect, studies can be over- or under-powered. Rather than collecting data until a pre-specified sample size is reached, an information-monitored study continues data collection until the accrued data provide enough precision to identify a meaningful difference with appropriate power and control of Type I Error.
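Concretely, the information about a risk difference is the reciprocal of its squared standard error, so it can be estimated directly from accruing data. A minimal simulated sketch in base R (the data and the outcome probabilities 0.25 and 0.40 are purely illustrative assumptions, not impart functionality):

```r
# Sketch: estimated information = 1 / SE^2 of the risk difference.
# Simulated interim data; probabilities 0.25 and 0.40 are illustrative.
set.seed(2024)
n_interim <- 150
y_control <- rbinom(n_interim, size = 1, prob = 0.25)
y_treated <- rbinom(n_interim, size = 1, prob = 0.40)

p0 <- mean(y_control)
p1 <- mean(y_treated)
se_rd <- sqrt(p0 * (1 - p0) / n_interim + p1 * (1 - p1) / n_interim)
information_observed <- 1 / se_rd^2 # grows as more data accrue
```

As more participants accrue, `se_rd` shrinks and `information_observed` grows toward the target level.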
Determining the Target Information Level
The information (precision) required to achieve power (1 − β) to identify a treatment effect δ with an s-sided test with Type I error rate α at the final analysis is given by:

I = ((z_{1 − α/s} + z_{1 − β}) / δ)^2
# Determine information required to achieve desired power at fixed error rate
information_single_stage <-
impart::required_information_single_stage(
delta = minimum_difference,
alpha = alpha,
power = power
)
information_single_stage
#> [1] 466.9966
For example, detecting a risk difference of 0.15 with 90% power and a Type I Error rate of 0.05 using a 2-sided test requires an information level of 466.9966. Investigators can collect data until the precision (the reciprocal of the squared standard error) reaches this level, and their analysis will have the appropriate power and Type I Error control.
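As a check, the single-stage information target can be reproduced from standard normal quantiles using the formula above (base R only; no impart functions involved):

```r
# Reproduce the single-stage information target from normal quantiles:
# I = ((z_{1 - alpha/2} + z_{1 - beta}) / delta)^2 for a two-sided test
minimum_difference <- 0.15
alpha <- 0.05
power <- 0.9

((qnorm(1 - alpha / 2) + qnorm(power)) / minimum_difference)^2
#> [1] 466.9966
```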
Translating Information into Sample Size
Translating information levels into a sample size requires assumptions about nuisance parameters, such as the probability of the outcome in the control arm. The information_to_n_binary_1_to_1 function takes an information level and values of the nuisance parameters, and gives an approximate sample size. Note that this calculation only takes into account the information contained in the observed outcomes: if some outcomes are missing, or if the analysis makes use of information in baseline covariates and intermediate outcomes, this can change the sample size at which the target information level is reached.
# Assume control probability = 0.25
approximate_n_p0_0.25 <-
impart::information_to_n_binary_1_to_1(
information = information_single_stage,
pi_0 = 0.25,
delta = minimum_difference,
round_up = TRUE
)
approximate_n_p0_0.25
#> n_per_arm n_total
#> 1 200 400
# Compute Power: Fixed Sample Size
pwr::pwr.2p.test(
h = pwr::ES.h(p1 = 0.25 + minimum_difference, p2 = 0.25),
sig.level = alpha,
power = power
)
#>
#> Difference of proportion power calculation for binomial distribution (arcsine transformation)
#>
#> h = 0.3222409
#> n = 202.3787
#> sig.level = 0.05
#> power = 0.9
#> alternative = two.sided
#>
#> NOTE: same sample sizes
# Assume control probability = 0.42
approximate_n_p0_0.42 <-
impart::information_to_n_binary_1_to_1(
information = information_single_stage,
pi_0 = 0.42,
delta = minimum_difference,
round_up = TRUE
)
approximate_n_p0_0.42
#> n_per_arm n_total
#> 1 229 458
# Compute Power: Fixed Sample Size
pwr::pwr.2p.test(
h = pwr::ES.h(p1 = 0.42 + minimum_difference, p2 = 0.42),
sig.level = alpha,
power = power
)
#>
#> Difference of proportion power calculation for binomial distribution (arcsine transformation)
#>
#> h = 0.3011521
#> n = 231.7151
#> sig.level = 0.05
#> power = 0.9
#> alternative = two.sided
#>
#> NOTE: same sample sizes
Note that when the assumptions about the outcome probability in the control arm are met, the sample size requirements from the fixed sample size calculation (pwr::pwr.2p.test) and from the information-monitored calculation (information_to_n_binary_1_to_1) are similar. The slight differences are due to the approximations used (a normal approximation versus the arcsine transformation).
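Under the normal approximation, the translation from information to sample size can be sketched by hand: with 1:1 allocation, the variance of the estimated risk difference is (π0(1 − π0) + π1(1 − π1))/n per arm, so the per-arm sample size is the target information times that variance sum. A base R sketch (rounding up mirrors the round_up = TRUE behavior used above):

```r
# Normal-approximation translation of information to sample size:
# Var(risk difference) ~ (pi_0(1 - pi_0) + pi_1(1 - pi_1)) / n_per_arm,
# so n_per_arm ~ information x (pi_0(1 - pi_0) + pi_1(1 - pi_1)).
n_per_arm <- function(pi_0, pi_1, information) {
  ceiling(information * (pi_0 * (1 - pi_0) + pi_1 * (1 - pi_1)))
}

n_per_arm(pi_0 = 0.25, pi_1 = 0.40, information = 466.9966)
#> [1] 200
n_per_arm(pi_0 = 0.42, pi_1 = 0.57, information = 466.9966)
#> [1] 229
```

These match the per-arm sample sizes returned by information_to_n_binary_1_to_1 for the two control-arm probabilities above.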
The advantage of an information-adaptive design is the sample size adapts to the information in the accruing data: rather than making assumptions about the outcome probability in the control arm, which could result in an over- or under-powered trial, data collection proceeds until the target information level is met, which ensures adequate power and Type I Error control.
Sequential Analyses in Studies
If the true effect of interest is greater than the minimum meaningful effect δ, the study may be overpowered. Conversely, if the true effect is very small, or indicates that the benefits of participating in the study are not commensurate with its risks, it may be futile to continue data collection. In such cases, interim analyses of the data can be used to guide more ethical, cost-effective data collection.
Group-sequential designs allow investigators to control Type I Error rates when performing pre-specified interim assessments of the differences between groups. Studies can also be stopped early for futility if accruing data suggest that a treatment is ineffective or harmful. The number and timing of analyses must be pre-specified, as well as the rules for stopping for efficacy and futility. The stopping rules are specified using spending functions: alpha spending functions define efficacy stopping rules, and beta spending functions define futility stopping rules. For more information on group sequential designs, see the documentation for the RPACT package. This example will use O'Brien-Fleming stopping rules for efficacy and futility.
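The O'Brien-Fleming-type alpha spending can be evaluated directly using the Lan-DeMets spending-function form: for a two-sided test, the cumulative alpha spent by information fraction t is α(t) = 4(1 − Φ(z_{1 − α/4}/√t)). A base R sketch (this standard form is assumed to be what rpact's "asOF" design implements):

```r
# Lan-DeMets O'Brien-Fleming-type alpha spending, two-sided test:
# alpha(t) = 4 * (1 - pnorm(qnorm(1 - alpha/4) / sqrt(t)))
alpha <- 0.05
of_alpha_spent <- function(t) {
  4 * (1 - pnorm(qnorm(1 - alpha / 4) / sqrt(t)))
}

round(of_alpha_spent(c(0.50, 0.75, 1.00)), 6)
#> [1] 0.003051 0.019299 0.050000
```

These cumulative values match the "Cumulative alpha spending" line reported by rpact for the design used in this vignette.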
In contrast to a group sequential design, which performs analyses at pre-specified fractions of the final sample size, an information-monitored study performs analyses when the data collected provide enough precision to identify a treatment effect with the appropriate power and Type I Error control. Analyses are conducted when the precision reaches pre-specified fractions of this target level.
# Group Sequential Design Parameters
information_rates <-
c(0.50, 0.75, 1.00) # Analyses at 50%, 75%, and 100% of the Total Information
type_of_design <- "asOF" # O'Brien-Fleming Alpha Spending
type_beta_spending <- "bsOF" # O'Brien-Fleming Beta Spending
The getDesignGroupSequential function in the rpact library can be used to specify the appropriate study design: for example, a two-sided test of the null hypothesis δ = 0 against the alternative δ ≠ 0.
# Set up group sequential testing procedure
trial_design <-
rpact::getDesignGroupSequential(
alpha = alpha,
beta = 1 - power,
sided = 2,
informationRates = information_rates,
typeOfDesign = type_of_design,
typeBetaSpending = type_beta_spending,
bindingFutility = FALSE
)
For a one-sided test using RPACT, the default boundaries assume users are testing the null hypothesis δ ≤ 0 against the alternative δ > 0 (i.e. higher values indicate benefit):
# One sided test: Higher values = Better
trial_design_one_sided_upper <-
rpact::getDesignGroupSequential(
alpha = alpha,
beta = 1 - power,
sided = 1,
informationRates = information_rates,
typeOfDesign = type_of_design,
typeBetaSpending = type_beta_spending,
bindingFutility = FALSE
)
plot(
trial_design_one_sided_upper,
main = "One-Sided: Higher Z = Superior"
)
To test the null hypothesis δ ≥ 0 against the alternative δ < 0 (lower values indicate benefit), users must manually correct the resulting design boundaries. This is done by calling correct_one_sided_gsd() on the result, which corrects the futility and efficacy boundaries:
trial_design_one_sided_lower <-
correct_one_sided_gsd(
trial_design = trial_design_one_sided_upper,
higher_better = FALSE
)
plot(
trial_design_one_sided_lower,
main = "Corrected One-Sided: Lower Z = Superior"
)
Once this issue is resolved, users can specify the direction using the directionUpper argument in other RPACT functions:
- directionUpper = TRUE tests the null δ ≤ 0 against the alternative δ > 0 (higher values indicate benefit)
- directionUpper = FALSE tests the null δ ≥ 0 against the alternative δ < 0 (lower values indicate benefit)
# One sided test: Higher values = Better
trial_design_one_sided_higher <-
rpact::getDesignGroupSequential(
alpha = alpha,
beta = 1 - power,
sided = 1,
informationRates = information_rates,
typeOfDesign = type_of_design,
typeBetaSpending = type_beta_spending,
bindingFutility = FALSE,
directionUpper = TRUE
)
# One sided test: Lower values = Better
trial_design_one_sided_lower <-
rpact::getDesignGroupSequential(
alpha = alpha,
beta = 1 - power,
sided = 1,
informationRates = information_rates,
typeOfDesign = type_of_design,
typeBetaSpending = type_beta_spending,
bindingFutility = FALSE,
directionUpper = FALSE
)
Adjusting Information for Multiple Analyses
When doing sequential analyses in an information-monitored design, the target level of information must be adjusted:
# Inflate information level to account for multiple testing
information_adaptive <-
impart::required_information_sequential(
information_single_stage = information_single_stage,
trial_design = trial_design
)
information_adaptive
#> [1] 506.4762
The information required under the specified design is 506.4762: the single-stage information scaled up by the inflation factor reported in the design summary (1.0845394). This factor can be retrieved using rpact::getDesignCharacteristics(trial_design).
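The relationship can be checked by hand. In the sketch below, the inflation factor is quoted as a literal from the design summary; interactively it could also be read from the design characteristics:

```r
# Check: sequential target = single-stage target x inflation factor.
# 1.0845394 is quoted from the design summary; it can also be read from
# rpact::getDesignCharacteristics(trial_design)$inflationFactor.
information_single_stage <- 466.9966
inflation_factor <- 1.0845394
information_check <- information_single_stage * inflation_factor
information_check
#> [1] 506.4762

# Interim analyses occur when information crosses these thresholds:
c(0.50, 0.75, 1.00) * information_check
#> [1] 253.2381 379.8572 506.4762
```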
Including Covariate Information
Appropriately including information from covariates has the potential to increase the precision of analyses, meaning that target information levels can be reached at lower sample sizes, resulting in studies with shorter duration.
# Information Only From Final Outcomes
impart::information_to_n_binary_1_to_1(
information = information_rates*information_adaptive,
pi_0 = 0.25,
pi_1 = 0.25 + minimum_difference,
round_up = FALSE
)
#> information pi_0 pi_1 n_per_arm n_total
#> 1 253.2381 0.25 0.4 108.2593 216.5186
#> 2 379.8572 0.25 0.4 162.3889 324.7779
#> 3 506.4762 0.25 0.4 216.5186 433.0372
# 10% Relative Efficiency Increase from Covariates
relative_efficiency <- 1.1
# Information From Final Outcomes + Covariates
impart::information_to_n_binary_1_to_1(
information = information_rates*information_adaptive/relative_efficiency,
pi_0 = 0.25,
pi_1 = 0.25 + minimum_difference,
round_up = TRUE
)
#> information pi_0 pi_1 n_per_arm n_total
#> 1 230.2165 0.25 0.4 99 198
#> 2 345.3247 0.25 0.4 148 296
#> 3 460.4329 0.25 0.4 197 394
# 20% Relative Efficiency Increase from Covariates
relative_efficiency <- 1.2
# Information From Final Outcomes + Covariates
impart::information_to_n_binary_1_to_1(
information = information_rates*information_adaptive/relative_efficiency,
pi_0 = 0.25,
pi_1 = 0.25 + minimum_difference,
round_up = TRUE
)
#> information pi_0 pi_1 n_per_arm n_total
#> 1 211.0318 0.25 0.4 91 182
#> 2 316.5476 0.25 0.4 136 272
#> 3 422.0635 0.25 0.4 181 362
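The relative-efficiency adjustment above is just a rescaling: dividing the target information by the assumed efficiency gain divides the required sample size by the same factor. A base R sketch using the normal approximation for the risk-difference variance (final analysis row, pi_0 = 0.25, pi_1 = 0.40):

```r
# Sketch: covariate adjustment divides the outcome-only sample size by
# the assumed relative efficiency (final analysis; pi_0 = 0.25, pi_1 = 0.40).
information_target <- 506.4762
variance_sum <- 0.25 * 0.75 + 0.40 * 0.60 # pi_0(1 - pi_0) + pi_1(1 - pi_1)

ceiling(information_target * variance_sum)       # outcomes only
#> [1] 217
ceiling(information_target * variance_sum / 1.1) # 10% efficiency gain
#> [1] 197
ceiling(information_target * variance_sum / 1.2) # 20% efficiency gain
#> [1] 181
```

The per-arm sample sizes with covariate gains match the final rows of the information_to_n_binary_1_to_1 output above.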
The increase in precision from covariate adjustment is never known precisely during the planning of a study. Instead of relying on an assumption about the gain in precision from covariates, investigators can use information monitoring to adapt data collection to the accruing information from both covariates and outcomes.
Encapsulating Study Design
A monitored_design object encapsulates all of the information about an information-monitored study that should be fixed at the outset, including the number of analyses, the information fractions at which analyses are conducted, the target level of information, the null value of the estimand of interest, the maximum feasible sample size, and the pseudorandom number generator seed used for analyses.
# Initialize the monitored design
monitored_design <-
initialize_monitored_design(
trial_design = trial_design,
null_value = 0,
maximum_sample_size = 600,
information_target = information_adaptive,
orthogonalize = TRUE,
rng_seed_analysis = 54321
)
monitored_design
#> $original_design
#> $original_design$trial_design
#> *Design parameters and output of group sequential design*
#>
#> *User defined parameters*
#>
#> * *Type of design*: O'Brien & Fleming type alpha spending
#> * *Information rates*: 0.500, 0.750, 1.000
#> * *Significance level*: 0.0500
#> * *Type II error rate*: 0.1000
#> * *Test*: two-sided
#> * *Type of beta spending*: O'Brien & Fleming type beta spending
#>
#> *Derived from user defined parameters*
#>
#> * *Maximum number of stages*: 3
#>
#> *Default parameters*
#>
#> * *Stages*: 1, 2, 3
#> * *Two-sided power*: FALSE
#> * *Binding futility*: FALSE
#> * *Beta adjustment*: TRUE
#> * *Tolerance*: 1e-08
#>
#> *Output*
#>
#> * *Power*: 0.2825, 0.7164, 0.9000
#> * *Futility bounds (non-binding)*: 0.387, 1.282
#> * *Cumulative alpha spending*: 0.003051, 0.019299, 0.050000
#> * *Cumulative beta spending*: 0.02001, 0.05752, 0.10000
#> * *Critical values*: 2.963, 2.359, 2.014
#> * *Stage levels (one-sided)*: 0.001525, 0.009162, 0.022000
#>
#>
#> $original_design$maximum_sample_size
#> [1] 600
#>
#> $original_design$null_value
#> [1] 0
#>
#> $original_design$information_target
#> [1] 506.4762
#>
#> $original_design$orthogonalize
#> [1] TRUE
#>
#> $original_design$rng_seed_analysis
#> [1] 54321
#>
#> $original_design$information_fractions
#> [1] 0.50 0.75 1.00
#>
#> $original_design$information_thresholds
#> [1] 253.2381 379.8572 506.4762
#>
#>
#> attr(,"class")
#> [1] "monitored_design"