Compute Approximate Information from Sample Size: Continuous & Binary Outcomes

These functions provide an asymptotic approximation to the information (i.e. precision, inverse of the variance) provided by two samples under assumed values of nuisance parameters for continuous and binary outcomes. This includes a difference in means for continuous outcomes, a difference in proportions (i.e. risk difference) for a binary outcome, or a relative risk (i.e. risk ratio) for a binary outcome. These functions may be useful in pre-trial planning to determine when analyses may occur under different assumptions about the nuisance parameters involved.

Usage

asymptotic_information_difference_means(n_0, sigma_0, n_1, sigma_1)

asymptotic_information_difference_proportions(n_0, pi_0, n_1, pi_1)

asymptotic_information_risk_difference(...)

asymptotic_information_relative_risk(n_0, pi_0, n_1, pi_1)

Arguments

n_0: A numeric vector containing the sample size in the control arm.
sigma_0: Standard deviation of outcomes in the population of individuals receiving the control intervention.
n_1: A numeric vector containing the sample size in the treatment arm.
sigma_1: Standard deviation of outcomes in the population of individuals receiving the active intervention.
pi_0: Probability of event in the population of individuals receiving the control intervention.
pi_1: Probability of event in the population of individuals receiving the active intervention.
...: Arguments passed to asymptotic_information_difference_proportions.

Value

When all parameters are scalars, the result is a scalar giving the approximate information. When multiple values are specified for one or more parameters, a grid of the unique parameter combinations is constructed, and a data.frame is returned with the approximate information computed for each combination.
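The grid behaviour described above can be sketched with base R's `expand.grid()`. This is only an illustration of the idea, not the package's own code: the use of `expand.grid()` and the variance formula for the difference in means (with `sigma_0` and `sigma_1` treated as standard deviations) are assumptions chosen to match the output shown in the Examples section.

```r
# Build every combination of the supplied parameter values (assumed approach).
params <- expand.grid(
  n_0 = c(50, 75),
  sigma_0 = 5,
  n_1 = c(50, 75),
  sigma_1 = 5
)

# Information = 1 / Var(difference in means), assuming sigma_* are SDs.
params$information_asymptotic <-
  with(params, 1 / (sigma_0^2 / n_0 + sigma_1^2 / n_1))

params
```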

Details

The amount of information in a sample of size $N$ depends on nuisance parameters, such as the variance of continuous outcomes, the risk of binary and time-to-event outcomes, rates of missing data, and the correlation between covariates and the outcomes of interest.

In studies with a fixed sample size, this sample size is chosen based on assumptions about these nuisance parameters, which are incorporated into the effect size. The sample size is chosen to give power $(1 - \beta)$ while maintaining a type I error rate of $(\alpha)$ under some assumed effect size. Inaccurate estimates of nuisance parameters can lead to over-powered or under-powered studies.
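As a rough illustration of this sensitivity, the sketch below uses `power.t.test()` from base R's stats package. The difference in means, standard deviations, and error rates are assumed values chosen only for illustration; they do not come from any particular trial.

```r
# Plan a fixed-sample-size trial: difference in means of 2, assumed SD of 5.
planned <- power.t.test(delta = 2, sd = 5, sig.level = 0.05, power = 0.80)
n_per_arm <- ceiling(planned$n)

# Power actually achieved at that sample size if the true SD differs.
power_if_sd_6 <-
  power.t.test(n = n_per_arm, delta = 2, sd = 6, sig.level = 0.05)$power
power_if_sd_4 <-
  power.t.test(n = n_per_arm, delta = 2, sd = 4, sig.level = 0.05)$power

c(n_per_arm = n_per_arm, sd_6 = power_if_sd_6, sd_4 = power_if_sd_4)
```

If the true standard deviation is larger than assumed, the achieved power falls below the planned 80%; if it is smaller, the study enrolls more participants than it needed.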

In an information-monitored design, investigators choose an estimand of interest, such as the difference in means or proportions, that is free from nuisance parameters. A trial is designed to identify some minimum important difference $\delta_{min}$ in the estimand with power $(1 - \beta)$ while maintaining a type I error rate of $(\alpha)$. Data is collected until the precision of the estimate (i.e. the reciprocal of its variance) reaches a pre-specified threshold $\mathcal{I}$:

$$\mathcal{I} = \left(\frac{Z_{\alpha/s} + Z_{\beta}}{\delta_{min}}\right)^2 \approx \frac{1}{Var(\hat{\delta})} = \frac{1}{\left(SE(\hat{\delta})\right)^2}$$

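Here $s$ is the number of sides of the test (1 or 2), and $Z_{q}$ is the upper-$q$ quantile of the standard normal distribution. The sample size required to reach the information target $\mathcal{I}$ depends on the nuisance parameters mentioned above.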
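The information target can be computed directly from the formula above. The following is a minimal sketch with assumed design values; it takes $s$ to be the number of sides of the test and $Z_{q}$ to be the upper-$q$ quantile of the standard normal distribution, and the variable names are illustrative rather than part of the package.

```r
# Assumed design values for illustration.
alpha <- 0.05      # type I error rate
beta <- 0.20       # type II error rate (power = 0.80)
sides <- 2         # 's' in the formula: two-sided test
delta_min <- 0.1   # minimum important difference (e.g. a risk difference)

# Upper-q quantiles of the standard normal distribution.
z_alpha <- qnorm(1 - alpha / sides)
z_beta <- qnorm(1 - beta)

information_target <- ((z_alpha + z_beta) / delta_min)^2
information_target
```

With these inputs the target is roughly 785, which corresponds to a standard error of about 0.036 for the estimated difference.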

These functions allow a user to determine an approximate amount of information contained in a sample of size $N$ based on some assumptions about the nuisance parameters.
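For intuition, the standard large-sample variance formulas reproduce the values shown in the Examples section. The helpers below are hypothetical and are not the package's implementation; they assume that `sigma_0` and `sigma_1` are standard deviations and that the relative risk is analysed on the log scale.

```r
# Hypothetical helpers (not the package's own code): information = 1 / variance.
info_diff_means <- function(n_0, sigma_0, n_1, sigma_1) {
  1 / (sigma_0^2 / n_0 + sigma_1^2 / n_1)
}

info_risk_difference <- function(n_0, pi_0, n_1, pi_1) {
  1 / (pi_0 * (1 - pi_0) / n_0 + pi_1 * (1 - pi_1) / n_1)
}

# Assumes the estimand is log(pi_1 / pi_0), using the delta-method variance.
info_log_relative_risk <- function(n_0, pi_0, n_1, pi_1) {
  1 / ((1 - pi_0) / (n_0 * pi_0) + (1 - pi_1) / (n_1 * pi_1))
}

info_diff_means(50, 5, 50, 5)             # 1
info_risk_difference(20, 0.2, 20, 0.1)    # 80
info_log_relative_risk(20, 0.2, 20, 0.1)  # 1.538462

# Smallest equal per-arm sample size reaching an assumed information target.
target <- 785
n_grid <- 1:2000
reached <- info_risk_difference(n_grid, 0.2, n_grid, 0.1) >= target
n_required <- n_grid[which(reached)[1]]
n_required                                # 197
```

Repeating the last step under different assumed values of `pi_0` and `pi_1` shows how the sample size needed to hit a fixed information target moves with the nuisance parameters.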

References

Mehta CR, and Tsiatis AA. 2001. "Flexible Sample Size Considerations Using Information-Based Interim Monitoring." Drug Information Journal 35 (4): 1095–1112. https://doi.org/10.1177/009286150103500407.

Mehta CR, Gao P, Bhatt DL, Harrington RA, Skerjanec S, and Ware JH. 2009. "Optimizing Trial Design: Sequential, Adaptive, and Enrichment Strategies." Circulation 119 (4): 597–605. https://doi.org/10.1161/circulationaha.108.809707.

Examples

# When a single value is supplied for each parameter, a scalar is returned:
asymptotic_information_difference_means(
  n_0 = 50,
  sigma_0 = 5,
  n_1 = 50,
  sigma_1 = 5
)
#> [1] 1

asymptotic_information_difference_proportions(
  n_0 = 20,
  pi_0 = 0.2,
  n_1 = 20,
  pi_1 = 0.1
)
#> [1] 80

asymptotic_information_relative_risk(
  n_0 = 20,
  pi_0 = 0.2,
  n_1 = 20,
  pi_1 = 0.1
)
#> [1] 1.538462

# When multiple values are supplied for one or more parameters, a grid of
# parameter combinations is created, and a data.frame is returned.
asymptotic_information_difference_means(
  n_0 = c(50, 75),
  sigma_0 = 5,
  n_1 = c(50, 75),
  sigma_1 = 5
)
#>   n_0 sigma_0 n_1 sigma_1 information_asymptotic
#> 1  50       5  50       5                    1.0
#> 2  75       5  50       5                    1.2
#> 3  50       5  75       5                    1.2
#> 4  75       5  75       5                    1.5

asymptotic_information_difference_proportions(
  n_0 = c(20, 40),
  pi_0 = 0.2,
  n_1 = c(20, 40),
  pi_1 = 0.1
)
#>   n_0 pi_0 n_1 pi_1 information_asymptotic
#> 1  20  0.2  20  0.1               80.00000
#> 2  40  0.2  20  0.1              117.64706
#> 3  20  0.2  40  0.1               97.56098
#> 4  40  0.2  40  0.1              160.00000

asymptotic_information_relative_risk(
  n_0 = c(20, 40),
  pi_0 = 0.2,
  n_1 = c(20, 40),
  pi_1 = 0.1
)
#>   n_0 pi_0 n_1 pi_1 information_asymptotic
#> 1  20  0.2  20  0.1               1.538462
#> 2  40  0.2  20  0.1               1.818182
#> 3  20  0.2  40  0.1               2.352941
#> 4  40  0.2  40  0.1               3.076923