Compute Approximate Information from Sample Size: Mann-Whitney Estimand

This function provides an asymptotic approximation to the information (i.e. precision, inverse of the variance) provided by two samples under assumed values of nuisance parameters for an ordinal outcome, analyzed using the Mann-Whitney estimand. This can be obtained by specifying the value of the Mann-Whitney estimand or the probability mass functions of outcomes in each treatment arm. These functions may be useful in pre-trial planning to determine when analyses may occur under different assumptions about the nuisance parameters involved.

Usage

asymptotic_information_mann_whitney_fm(
  n_0,
  n_1,
  mw = NULL,
  pmf_1 = NULL,
  pmf_0 = NULL,
  adjust = TRUE
)

mw_from_pmfs(pmf_0, pmf_1, reverse_scale = FALSE)

Arguments

n_0: A numeric vector containing the sample size in the control arm.
n_1: A numeric vector containing the sample size in the treatment arm.
mw: A numeric vector containing the Mann-Whitney estimand.
pmf_1: A numeric vector or matrix of row vectors, each containing the probability mass function of outcomes in the population of individuals receiving the active intervention.
pmf_0: A numeric vector or matrix of row vectors, each containing the probability mass function of outcomes in the population of individuals receiving the control intervention.
adjust: A logical scalar, indicating whether an adjustment for ties should be performed. Note: this can only be computed when pmf_0 and pmf_1 are supplied.
reverse_scale: A logical scalar: should the scales be reversed when calculating the Mann-Whitney estimand? This may be useful when lower categories indicate a preferable outcome.
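
For orientation, the Mann-Whitney estimand for two ordinal PMFs on a common set of categories can be written as P(Y1 > Y0) + 0.5 * P(Y1 = Y0). The following is a minimal sketch of that calculation, not the package's mw_from_pmfs: the helper name mw_sketch is hypothetical, and the treatment of reverse_scale (reversing the category order of both PMFs) is an assumption.

```r
# Hypothetical helper (not the package's mw_from_pmfs): Mann-Whitney estimand
# P(Y1 > Y0) + 0.5 * P(Y1 = Y0) for two ordinal PMFs on the same categories.
mw_sketch <- function(pmf_0, pmf_1, reverse_scale = FALSE) {
  stopifnot(
    length(pmf_0) == length(pmf_1),
    isTRUE(all.equal(sum(pmf_0), 1)),
    isTRUE(all.equal(sum(pmf_1), 1))
  )
  if (reverse_scale) {
    # Assumption: reversing the scale means reversing the category order
    pmf_0 <- rev(pmf_0)
    pmf_1 <- rev(pmf_1)
  }
  # P(Y0 < k) for each category k
  p_0_below <- cumsum(pmf_0) - pmf_0
  p_greater <- sum(pmf_1 * p_0_below)  # P(Y1 > Y0)
  p_tie <- sum(pmf_0 * pmf_1)          # P(Y1 = Y0)
  p_greater + 0.5 * p_tie
}

mw_sketch(pmf_0 = c(0.2, 0.2, 0.6), pmf_1 = c(0.1, 0.1, 0.8))
```

For these PMFs the sketch gives 0.6, which agrees with the mw value of 0.600 shown for the same PMFs in the Examples output. Under the assumed meaning of reverse_scale, the reversed value is one minus the original.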

Value

When all parameters are scalars, the result is a scalar indicating the approximate information. When multiple values are specified, a grid of unique parameter combinations is constructed, the approximate information is computed for each combination, and the result is returned as a data.frame.
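
The row order of the data.frame shown in the Examples is consistent with a full factorial grid in which the first argument varies fastest, which is what base R's expand.grid() produces. Whether the package builds its grid this way is an assumption; the sketch below only illustrates the layout.

```r
# Full factorial grid of parameter values; the first column varies fastest.
# Assumption: this mirrors the layout of the returned data.frame, not
# necessarily how the package constructs it internally.
grid <- expand.grid(
  n_0 = c(100, 150),
  n_1 = c(100, 150),
  mw = 0.75
)
grid
```

The grid has one row per combination, here four rows, so supplying several long vectors at once can produce a large result.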

Details

The amount of information in a sample of size $N$ depends on nuisance parameters, such as the variance of continuous outcomes, the risk of binary and time-to-event outcomes, rates of missing data, and the correlation between covariates and the outcomes of interest.

In studies with a fixed sample size, this sample size is chosen based on assumptions about these nuisance parameters, which are incorporated into the effect size. The sample size is chosen to give power $(1 - \beta)$ while maintaining a type I error rate of $(\alpha)$ under some assumed effect size. Inaccurate estimates of nuisance parameters can lead to over-powered or under-powered studies.

In an information-monitored design, investigators choose an estimand of interest, such as the difference in means or proportions, that is free from nuisance parameters. A trial is designed to identify some minimum important difference $\delta_{min}$ in the estimand with power $(1 - \beta)$ while maintaining a type I error rate of $(\alpha)$. Data are collected until the precision of the estimate (i.e. the reciprocal of its variance) reaches a pre-specified threshold $\mathcal{I}$:

$$\mathcal{I} = \left(\frac{Z_{\alpha/s} + Z_{\beta}}{\delta_{min}}\right)^2 \approx \frac{1}{Var(\hat{\delta})} = \frac{1}{\left(SE(\hat{\delta})\right)^2}$$

The sample size required to reach the information target $\mathcal{I}$ depends on the nuisance parameters mentioned above.
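
As a rough illustration of the information target, the threshold $\mathcal{I}$ can be evaluated directly from standard normal quantiles. The helper name information_target_sketch and its arguments are hypothetical, and reading $Z_{\alpha/s}$ as the upper $\alpha/s$ quantile for a test with $s$ sides is an assumption.

```r
# Hypothetical helper: information needed to detect delta_min with the
# stated power and type I error rate.
# Assumption: Z_{alpha/s} is the upper alpha/s standard normal quantile,
# where s (here `sides`) is the number of sides of the test.
information_target_sketch <- function(delta_min, alpha = 0.05,
                                      power = 0.80, sides = 2) {
  z_alpha <- qnorm(1 - alpha / sides)
  z_beta <- qnorm(power)
  ((z_alpha + z_beta) / delta_min)^2
}

# Example: a minimum important difference of 0.1 on the Mann-Whitney scale
# (e.g. 0.6 versus the null value of 0.5)
information_target_sketch(delta_min = 0.1)
```

With these inputs the target is roughly 785. Halving $\delta_{min}$ quadruples it, since the target scales with $1/\delta_{min}^2$.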

These functions allow a user to determine an approximate amount of information contained in a sample of size $N$ based on some assumptions about the nuisance parameters.
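
One expression that reproduces the values printed in the Examples is a proportional-odds variance approximation for the Mann-Whitney estimate, which appears to correspond to the one in Fay and Malinovsky (2018), multiplied by a tie factor computed from expected pooled category counts. The sketch below uses hypothetical helper names and is not taken from the package source. The expression is on the variance scale, so under the definition above the precision would be its reciprocal; which of the two a given package version reports is worth checking.

```r
# Hypothetical helpers; assumptions, not the package's implementation.

# Tie factor from the expected category counts in the pooled sample
tie_factor_sketch <- function(n_0, n_1, pmf_0, pmf_1) {
  n <- n_0 + n_1
  counts <- n_0 * pmf_0 + n_1 * pmf_1
  1 - sum(counts^3 - counts) / (n^3 - n)
}

# Assumed proportional-odds variance approximation for the estimate
mw_variance_sketch <- function(n_0, n_1, mw, tie_factor = 1) {
  n <- n_0 + n_1
  tie_factor * mw * (1 - mw) / (n_0 * n_1) *
    (1 + (n - 2) / 2 * (mw / (1 + mw) + (1 - mw) / (2 - mw)))
}

# No tie adjustment, mw supplied directly
v_no_ties <- mw_variance_sketch(n_0 = 100, n_1 = 100, mw = 0.75)

# Tie adjustment, with mw = 0.6 implied by the two PMFs
t_adj <- tie_factor_sketch(100, 100, c(0.2, 0.2, 0.6), c(0.1, 0.1, 0.8))
v_ties <- mw_variance_sketch(100, 100, mw = 0.6, tie_factor = t_adj)

c(variance = v_no_ties, reciprocal = 1 / v_no_ties)
```

Here v_no_ties is about 0.0011855, t_adj about 0.6502663, and v_ties about 0.0010364, matching the corresponding printed values in the Examples.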

References

Fay, MP and Malinovsky, Y. 2018. "Confidence Intervals of the Mann-Whitney Parameter That Are Compatible with the Wilcoxon-Mann-Whitney Test." Statistics in Medicine 37 (27): 3991–4006. https://doi.org/10.1002/sim.7890.

Zhao, YD, Rahardja, D, and Qu, Y. 2007. "Sample Size Calculation for the Wilcoxon–Mann–Whitney Test Adjusting for Ties." Statistics in Medicine 27 (3): 462–68. https://doi.org/10.1002/sim.2912.

Benkeser, D, Díaz, I, Luedtke, A, Segal, J, Scharfstein, D, and Rosenblum, M. 2020. "Improving Precision and Power in Randomized Trials for COVID-19 Treatments Using Covariate Adjustment, for Binary, Ordinal, and Time-to-Event Outcomes." Biometrics 77 (4): 1467–81. https://doi.org/10.1111/biom.13377.

Examples

# When a single value is supplied for each parameter, a scalar is returned:
asymptotic_information_mann_whitney_fm(
  n_0 = 100,
  n_1 = 100,
  mw = 0.75,
  adjust = FALSE
)
#> [1] 0.001185536

# When multiple values are supplied for one or more parameters, a grid of
# parameter combinations is created, and a data.frame is returned.
asymptotic_information_mann_whitney_fm(
  n_0 = c(100, 150),
  n_1 = c(100, 150),
  mw = 0.75,
  adjust = FALSE
)
#>   n_0 n_1   mw t information_asymptotic
#> 1 100 100 0.75 1           0.0011855357
#> 2 150 100 0.75 1           0.0009867857
#> 3 100 150 0.75 1           0.0009867857
#> 4 150 150 0.75 1           0.0007888095


# Specifying PMFs, with tie adjustment
asymptotic_information_mann_whitney_fm(
  n_0 = 100,
  n_1 = 100,
  pmf_0 = c(0.2, 0.2, 0.6),
  pmf_1 = c(0.1, 0.1, 0.8),
  adjust = TRUE
)
#> [1] 0.001036432

# Specifying Multiple PMFs
asymptotic_information_mann_whitney_fm(
  n_0 = 100,
  n_1 = 100,
  pmf_0 =
    rbind(
      c(0.2, 0.2, 0.6),
      c(0.3, 0.1, 0.6)
    ),
  pmf_1 =
    rbind(
      c(0.1, 0.1, 0.8),
      c(0.05, 0.05, 0.9)
    ),
  adjust = TRUE
)
#>   n_0 n_1 pmf_0 pmf_1    mw         t pmf_0_1 pmf_0_2 pmf_0_3 pmf_1_1 pmf_1_2
#> 1 100 100     1     1 0.600 0.6502663     0.2     0.2     0.6    0.10    0.10
#> 2 100 100     2     1 0.610 0.6480162     0.3     0.1     0.6    0.10    0.10
#> 3 100 100     1     2 0.650 0.5742331     0.2     0.2     0.6    0.05    0.05
#> 4 100 100     2     2 0.655 0.5723581     0.3     0.1     0.6    0.05    0.05
#>   pmf_1_3 information_asymptotic
#> 1     0.8           0.0010364315
#> 2     0.8           0.0010218898
#> 3     0.9           0.0008578564
#> 4     0.9           0.0008481421