distribution: Test statistic distribution under the null

Test statistic distribution under the null

Description

Constructs a list which defines the test statistic reference distribution under the null hypothesis.

Usage

asymptotic()

simulated(method = "approximate", nsims = 1000L, ncores = 1L, ...)

Arguments

method

(Scalar string: "approximate")
The method used to derive the distribution of the test statistic under the null hypothesis. Must be one of "approximate" (default) or "exact". See 'Details' for additional information.

nsims

(Scalar integer: 1000L; ⁠[2, Inf)⁠)
The number of resamples for method = "approximate". Not used when method = "exact", unless the number of exact resamples would exceed approximately 1e6, in which case method = "approximate" is used as a fallback. In the power() context, nsims defines the number of simulated datasets under the null hypothesis; it should typically be set greater than or equal to the number of simulated datasets in the design row of the power analysis. See 'Details' for additional information.

ncores

(Scalar integer: 1L; ⁠[1, Inf)⁠)
The number of cores (number of worker processes) to use. Do not set greater than the value returned by parallel::detectCores().

...

Optional arguments for internal use.

Details

The default asymptotic test is performed when distribution = asymptotic().

When distribution = simulated(method = "exact") is used, the exact randomization test is defined by the following steps (a minimal sketch in R is shown after the list):

  • Independent two-sample tests

    1. Calculate the observed test statistic.

    2. Check whether the number of possible group-label assignments, choose(n1+n2, n1) (i.e. ncol(combn(x = n1+n2, m = n1))), is less than 1e6.

      1. If TRUE continue with the exact randomization test.

      2. If FALSE revert to the approximate randomization test.

    3. For each of the choose(n1+n2, n1) possible group-label assignments (the columns of combn(x = n1+n2, m = n1)):

      1. Assign corresponding group labels.

      2. Calculate the test statistic.

    4. Calculate the exact randomization test p-value as the mean of the logical vector resampled_test_stats >= observed_test_stat.

  • Dependent two-sample tests

    1. Calculate the observed test statistic.

    2. Check if npairs < 21 (maximum 2^20 resamples)

      1. If TRUE continue with the exact randomization test.

      2. If FALSE revert to the approximate randomization test.

    3. For each of the 2^npairs possible pair-label assignments:

      1. Assign corresponding pair labels.

      2. Calculate the test statistic.

    4. Calculate the exact randomization test p-value as the mean of the logical vector resampled_test_stats >= observed_test_stat.
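
A minimal base R sketch of the exact randomization test for the independent two-sample case, assuming a simple difference-in-means test statistic (the function name and statistic are illustrative only, not depower internals):

# Exact randomization test sketch: enumerate every group-label assignment.
exact_randomization_test <- function(x1, x2,
                                     stat = function(a, b) abs(mean(a) - mean(b))) {
  n1 <- length(x1)
  n2 <- length(x2)
  pooled <- c(x1, x2)

  # Step 1: observed test statistic
  observed <- stat(x1, x2)

  # Step 2: revert if the number of group-label assignments is too large
  if (choose(n1 + n2, n1) >= 1e6) {
    stop("Too many resamples; revert to the approximate randomization test.")
  }

  # Step 3: one column of 'idx' per group-label assignment
  idx <- utils::combn(n1 + n2, n1)
  resampled <- apply(idx, 2, function(i) stat(pooled[i], pooled[-i]))

  # Step 4: exact randomization p-value
  mean(resampled >= observed)
}

set.seed(1234)
exact_randomization_test(rnorm(8, mean = 1), rnorm(8))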

For distribution = simulated(method = "approximate"), the approximate randomization test is defined by the following steps (a minimal sketch in R is shown after the list):

  • Independent two-sample tests

    1. Calculate the observed test statistic.

    2. For nsims iterations:

      1. Randomly assign group labels.

      2. Calculate the test statistic.

    3. Insert the observed test statistic into the vector of resampled test statistics.

    4. Calculate the approximate randomization test p-value as the mean of the logical vector resampled_test_stats >= observed_test_stat.

  • Dependent two-sample tests

    1. Calculate the observed test statistic.

    2. For nsims iterations:

      1. Randomly assign pair labels.

      2. Calculate the test statistic.

    3. Insert the observed test statistic into the vector of resampled test statistics.

    4. Calculate the approximate randomization test p-value as the mean of the logical vector resampled_test_stats >= observed_test_stat.
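
A minimal base R sketch of the approximate randomization test for the independent two-sample case, again assuming a simple difference-in-means statistic (illustrative only; depower's internal implementation and test statistics may differ):

# Approximate randomization test sketch: randomly reassign group labels.
approximate_randomization_test <- function(x1, x2, nsims = 1000L,
                                           stat = function(a, b) abs(mean(a) - mean(b))) {
  n1 <- length(x1)
  pooled <- c(x1, x2)

  # Step 1: observed test statistic
  observed <- stat(x1, x2)

  # Step 2: nsims random group-label assignments
  resampled <- replicate(nsims, {
    i <- sample.int(length(pooled), n1)
    stat(pooled[i], pooled[-i])
  })

  # Step 3: include the observed statistic in the reference distribution
  resampled <- c(resampled, observed)

  # Step 4: approximate randomization p-value
  mean(resampled >= observed)
}

set.seed(1234)
approximate_randomization_test(rnorm(30, mean = 1), rnorm(30), nsims = 2000L)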

In the power analysis setting, power(), data for groups 1 and 2 can be simulated from their known distributions under the assumptions of the null hypothesis. Unlike the nonparametric randomization tests described above, in this setting approximate parametric tests are performed.

For example, power(wald_test_nb(distribution = simulated())) results in an approximate parametric Wald test defined by the following steps (a minimal sketch in R is shown after the list):

  1. For each relevant design row in data:

    1. For each of the nsims iterations specified by simulated(nsims = ...):

      1. Simulate new data for group 1 and group 2 under the null hypothesis.

      2. Calculate the Wald test statistic, \chi^2_{null}.

    2. Collect all \chi^2_{null} into a vector.

    3. For each of the nsims simulated datasets specified by sim_nb(nsims = ...):

      1. Calculate the Wald test statistic, \chi^2_{obs}.

      2. Calculate the p-value from the empirical null distribution of test statistics, \chi^2_{null} (the mean of the logical vector null_test_stats >= observed_test_stat).

    4. Collect all p-values into a vector.

    5. Calculate power as sum(p <= alpha) / nsims.

  2. Return all results from power().
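
A minimal sketch of the simulated-null power calculation described above, using a simple two-sample t statistic on normal data instead of depower's negative binomial Wald test (all names, distributions, and sample sizes here are illustrative assumptions):

set.seed(1234)
n1 <- 30; n2 <- 30
alpha <- 0.05
nsims_null <- 2000L  # analogous to simulated(nsims = ...)
nsims_alt <- 500L    # analogous to the number of simulated datasets from sim_nb()

t_stat <- function(x1, x2) abs(t.test(x1, x2)$statistic)

# Steps 1.1-1.2: empirical null distribution of the test statistic
null_stats <- replicate(nsims_null, t_stat(rnorm(n1), rnorm(n2)))

# Steps 1.3-1.4: p-value for each simulated dataset under the alternative
p <- replicate(nsims_alt, {
  obs <- t_stat(rnorm(n1, mean = 0.5), rnorm(n2))
  mean(null_stats >= obs)
})

# Step 1.5: power at level alpha
sum(p <= alpha) / nsims_alt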

Randomization tests use the positive-biased p-value estimate in the style of Davison and Hinkley (1997) (see also Phipson and Smyth (2010)):

\hat{p} = \frac{1 + \sum_{i=1}^B \mathbb{I} \{\chi^2_i \geq \chi^2_{obs}\}}{B + 1}.

The number of resamples defines the minimum observable p-value (e.g. nsims=1000L results in min(p-value)=1/1001). It's recommended to set \text{nsims} \gg \frac{1}{\alpha}.
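
A minimal sketch of this estimator, where B is the number of resampled test statistics (the helper name and inputs are illustrative):

positive_biased_p_value <- function(resampled_test_stats, observed_test_stat) {
  B <- length(resampled_test_stats)
  (1 + sum(resampled_test_stats >= observed_test_stat)) / (B + 1)
}

# With 1000 resamples the smallest attainable p-value is 1/1001.
positive_biased_p_value(resampled_test_stats = runif(1000), observed_test_stat = 2)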

Value

list

References

Davison, A.C. and Hinkley, D.V. (1997). Bootstrap Methods and Their Application. Cambridge University Press, Cambridge.

Phipson, B. and Smyth, G.K. (2010). Permutation p-values should never be zero: calculating exact p-values when permutations are randomly drawn. Statistical Applications in Genetics and Molecular Biology, 9(1), Article 39.

Examples

#----------------------------------------------------------------------------
# asymptotic() examples
#----------------------------------------------------------------------------
library(depower)

set.seed(1234)
data <- sim_nb(
  n1 = 60,
  n2 = 40,
  mean1 = 10,
  ratio = 1.5,
  dispersion1 = 2,
  dispersion2 = 8
)

data |>
  wald_test_nb(distribution = asymptotic())

#----------------------------------------------------------------------------
# simulated() examples
#----------------------------------------------------------------------------
data |>
  wald_test_nb(distribution = simulated(nsims = 200L))

