Introduction

The package includes functions for power analysis and sample size calculation for Welch and Hsu [@Hedderich2018] t tests including Monte-Carlo simulations of empirical power and type-I-error. In addition, power and sample size calculation for Wilcoxon rank sum and signed rank tests via Monte-Carlo simulations can be performed. Moreover, power and sample size required for the evaluation of a diagnostic test(-system) [@Flahault2005; @Dobbin2007] as well as for a single proportion [@Fleiss2003], comparing two negative binomial rates [@Zhu2014], ANCOVA [@Shieh2020], reference ranges [@Jennen2005], and multiple primary endpoints [@Sozu2015].

We first load the package.

library(MKpower)

Single Proportion

The computation of the required sample size for investigating a single proportion can be determined via the respective test or confidence interval [@Fleiss2003]. First, we consider the asymptotic test.

## with continuity correction
power.prop1.test(p1 = 0.4, power = 0.8)
## without continuity correction
power.prop1.test(p1 = 0.4, power = 0.8, cont.corr = FALSE)

Next, we compute the sample size via the respective asymptotic confidence interval.

## with continuity correction
ssize.propCI(prop = 0.4, width = 0.14)
## without continuity correction
ssize.propCI(prop = 0.4, width = 0.14, method = "wald")

Welch Two-Sample t-Test

For computing the sample size of the Welch t-test, we only consider the situation of equal group size (balanced design).

## identical results as power.t.test, since sd = sd1 = sd2 = 1
power.welch.t.test(n = 20, delta = 1)
power.welch.t.test(power = .90, delta = 1)
power.welch.t.test(power = .90, delta = 1, alternative = "one.sided")

## sd1 = 0.5, sd2 = 1
power.welch.t.test(delta = 1, sd1 = 0.5, sd2 = 1, power = 0.9)

Hsu Two-Sample t-Test

For computing the sample size of the Hsu t-test [@Hedderich2018], we only consider the situation of equal group size (balanced design).

## slightly more conservative than Welch t-test
power.hsu.t.test(n = 20, delta = 1)
power.hsu.t.test(power = .90, delta = 1)
power.hsu.t.test(power = .90, delta = 1, alternative = "one.sided")

## sd1 = 0.5, sd2 = 1
power.welch.t.test(delta = 0.5, sd1 = 0.5, sd2 = 1, power = 0.9)
power.hsu.t.test(delta = 0.5, sd1 = 0.5, sd2 = 1, power = 0.9)

ANCOVA

With function power.ancova one can compute power and sample size in ANCOVA designs [@Shieh2020]. The default matrix of contrasts used in the function is

## 3 groups
cbind(rep(1,2), -diag(2))
## 4 groups
cbind(rep(1,3), -diag(3))

We use the example provided in Table 9.7 of @Maxwell2004.

## Example from Maxwell and Delaney (2004) according to Shieh (2020)
power.ancova(mu=c(7.5366, 11.9849, 13.9785), var = 29.0898, power = 0.8)
power.ancova(n = rep(45/3, 3), mu=c(7.5366, 11.9849, 13.9785), var = 29.0898)
power.ancova(mu=c(7.5366, 11.9849, 13.9785), var = 29.0898, power = 0.9)
power.ancova(n = rep(57/3, 3), mu=c(7.5366, 11.9849, 13.9785), var = 29.0898)

Based on the reported adjusted group means and error variance, a (total) sample size of 45 is required to achieve a power of at least 80%. The calculated power is 82.2%. With a sample size of 57 the power will be at least 90%, where the calculated power is 91.2%.

Introducing additional covariates (random covariate model) will increase the required sample size.

power.ancova(mu=c(7.5366, 11.9849, 13.9785), var = 29.0898, power = 0.8,
             nr.covs = 2)
power.ancova(mu=c(7.5366, 11.9849, 13.9785), var = 29.0898, power = 0.8,
             nr.covs = 3)
power.ancova(mu=c(7.5366, 11.9849, 13.9785), var = 29.0898, power = 0.8,
             nr.covs = 5)
power.ancova(mu=c(7.5366, 11.9849, 13.9785), var = 29.0898, power = 0.8,
             nr.covs = 10)

We can also calculate the per group sample sizes for an imbalanced design.

power.ancova(mu=c(7.5366, 11.9849, 13.9785), var = 29.0898, power = 0.8, 
             group.ratio = c(1, 1.25, 1.5))
power.ancova(n = c(13, 16, 19), mu=c(7.5366, 11.9849, 13.9785), var = 29.0898,  
             group.ratio = c(1, 1.25, 1.5))
power.ancova(mu=c(7.5366, 11.9849, 13.9785), var = 29.0898, power = 0.8, 
             group.ratio = c(1, 0.8, 2/3))
power.ancova(n = c(17, 14, 12), mu=c(7.5366, 11.9849, 13.9785), var = 29.0898,  
             group.ratio = c(1, 0.8, 2/3))

We get a smaller total sample size with an imbalanced design (43 instead of 45).

Wilcoxon Rank Sum and Signed Rank Tests

For computing the sample size of these tests, we offer a function that performs Monte-Carlo simulations to determine their (empirical) power.

rx <- function(n) rnorm(n, mean = 0, sd = 1) 
ry <- function(n) rnorm(n, mean = 0.5, sd = 1) 
## two-sample
sim.ssize.wilcox.test(rx = rx, ry = ry, n.max = 100, iter = 1000)
sim.ssize.wilcox.test(rx = rx, ry = ry, n.min = 65, n.max = 70, 
                      step.size = 1, iter = 1000, BREAK = FALSE)
power.t.test(delta = 0.5, power = 0.8)

## one-sample
sim.ssize.wilcox.test(rx = ry, n.max = 100, iter = 1000, type = "one.sample")
sim.ssize.wilcox.test(rx = ry, type = "one.sample", n.min = 33, 
                      n.max = 38, step.size = 1, iter = 1000, BREAK = FALSE)
power.t.test(delta = 0.5, power = 0.8, type = "one.sample")

## two-sample
rx <- function(n) rgamma(n, scale = 2, shape = 1.5)
ry <- function(n) rgamma(n, scale = 4, shape = 1.5) # equal shape
## two-sample
sim.ssize.wilcox.test(rx = rx, ry = ry, n.max = 100, iter = 1000)
sim.ssize.wilcox.test(rx = rx, ry = ry, n.min = 25, n.max = 30, 
                      step.size = 1, iter = 1000, BREAK = FALSE)

For practical applications we recommend to use a higher number of iterations. For more detailed examples we refer to the help page of the function.

Two Negative Binomial Rates

When we consider two negative binomial rates [@Zhu2014], we can compute sample size or power applying function power.nb.test.

## examples from Table III in Zhu and Lakkis (2014)
power.nb.test(mu0 = 5.0, RR = 2.0, theta = 1/0.5, duration = 1, power = 0.8, approach = 1)
power.nb.test(mu0 = 5.0, RR = 2.0, theta = 1/0.5, duration = 1, power = 0.8, approach = 2)
power.nb.test(mu0 = 5.0, RR = 2.0, theta = 1/0.5, duration = 1, power = 0.8, approach = 3)

Diagnostic Test

Given an expected sensitivity and specificity we can compute sample size, power, delta or significance level of diagnostic test [@Flahault2005].

## see n2 on page 1202 of Chu and Cole (2007)
power.diagnostic.test(sens = 0.99, delta = 0.14, power = 0.95) # 40
power.diagnostic.test(sens = 0.99, delta = 0.13, power = 0.95) # 43
power.diagnostic.test(sens = 0.99, delta = 0.12, power = 0.95) # 47

The sample size planning for developing binary classifiers in case of high dimensional data, we can apply function ssize.pcc, which is based on the probability of correct classification (PCC) [@Dobbin2007].

## see Table 2 of Dobbin et al. (2008)
g <- 0.1
fc <- 1.6
ssize.pcc(gamma = g, stdFC = fc, nrFeatures = 22000)

Reference Range

We can apply function ssize.reference.range to determine the sample size required for a study planned to establish a reference range. The parametric approach assumes a normal distribution whereas the non-parametric approach only assumes a continuous distribution.

ssize.reference.range(delta = 0.01, ref.prob = 0.95, conf.prob = 0.9, 
                      method = "parametric", exact = TRUE)
ssize.reference.range(delta = 0.01, ref.prob = 0.95, conf.prob = 0.9, 
                      method = "parametric", exact = FALSE)
ssize.reference.range(delta = 0.01, ref.prob = 0.95, conf.prob = 0.9, 
                      method = "nonparametric", exact = TRUE)
ssize.reference.range(delta = 0.01, ref.prob = 0.95, conf.prob = 0.9, 
                      method = "nonparametric", exact = FALSE)

We can also calculate the sample size for one-sided reference ranges.

ssize.reference.range(delta = 0.015, ref.prob = 0.95, conf.prob = 0.9, 
                      method = "parametric", exact = TRUE, 
                      alternative = "one.sided")

Multiple Primary Endpoints (MPE)

We demonstrate how to calculate the sample size for a trial with two co-primary endpoints with known covariance.

Sigma <- matrix(c(1, 0.8, 0.8, 1), nrow = 2)
power.mpe.known.var(K = 2, delta = c(0.25, 0.4), Sigma = Sigma, 
                    sig.level = 0.025, power = 0.8)
## equivalent: specify SDs and correlation rho
power.mpe.known.var(K = 2, delta = c(0.25, 0.4), SD = c(1,1), rho = 0.8, 
                    sig.level = 0.025, power = 0.8)

Next, we show how to calculate the sample size for a trial with two co-primary endpoints with unknown covariance. Here, we follow three steps to determine the sample size.

## Step 1:
power.mpe.known.var(K = 2, delta = c(0.5, 0.4), Sigma = Sigma, 
                    sig.level = 0.025, power = 0.8)
## Step 2 + 3:
Sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
power.mpe.unknown.var(K = 2, delta = c(0.5, 0.4), Sigma = Sigma, 
                      sig.level = 0.025, power = 0.8, 
                      n.min = 105, n.max = 120)
## equivalent: specify SDs and correlation rho
power.mpe.unknown.var(K = 2, delta = c(0.5, 0.4), SD = c(1,1), rho = 0.5, 
                      sig.level = 0.025, power = 0.8, 
                      n.min = 105, n.max = 120)

We finally demonstrate how to calculate the sample size for a trial with two primary endpoints with known covariance, where the trial is designed to find a significant difference for at least one endpoint.

Sigma <- matrix(c(1, 0.3, 0.3, 1), nrow = 2)
power.mpe.atleast.one(K = 2, delta = c(0.2, 0.3), Sigma = Sigma,
                      power = 0.8, sig.level = 0.025)
## equivalent: specify SDs and correlation rho
power.mpe.atleast.one(K = 2, delta = c(0.2, 0.3), SD = c(1,1), rho = 0.3,  
                      power = 0.8, sig.level = 0.025)

Power and Type-I-Error Simulations

There are quite some discussions and various proposals, which test is the best under which circumstances when we want to compare the location (mean, median) of two groups [@Fagerland2009; @Fagerland2012; @Sezer2017]. In addition, it is questioned whether pre-testing of assumptions is appropriate/useful at least from a practical point of view [@Zimmerman2004; @Rasch2011; @Rochon2012; @Hoekstra2012].

We provide functions to simulate power and type-I-error of classical [@student1908], Welch [@Welch1947] and Hsu [@Hsu1938] t-tests as well as of Wilcoxon-Mann-Whitney tests [@Wilcoxon1945; @Mann1947].

## functions to simulate the data
## group 1
rx <- function(n) rnorm(n, mean = 0, sd = 1)
rx.H0 <- rx
## group 2
ry <- function(n) rnorm(n, mean = 3, sd = 3)
ry.H0 <- function(n) rnorm(n, mean = 0, sd = 3)
## theoretical results
power.welch.t.test(n = 10, delta = 3, sd1 = 1, sd2 = 3)
power.hsu.t.test(n = 10, delta = 3, sd1 = 1, sd2 = 3)
## simulation
res.t <- sim.power.t.test(nx = 10, rx = rx, rx.H0 = rx.H0,
                          ny = 10, ry = ry, ry.H0 = ry.H0)
res.t
res.w <- sim.power.wilcox.test(nx = 10, rx = rx, rx.H0 = rx.H0,
                               ny = 10, ry = ry, ry.H0 = ry.H0)
res.w

For further investigation of the results, we provide some diagnostic plots.

## t-tests
hist(res.t)
qqunif(res.t, alpha = 0.05)
volcano(res.t, hex = TRUE)
##  Wilcoxon-Mann-Whitney tests
hist(res.w)
qqunif(res.w, alpha = 0.05)

We can also generate a volcano plot for the Wilcoxon-Mann-Whitney test, but this would require setting argument conf.int to TRUE, which would lead to a much higher computation time, hence we omitted it here. Furthermore, it is also possible to compute an approximate version of the test by setting argument approximate to TRUE [@Hothorn2008] again by the cost of an increased computation time.

sessionInfo

sessionInfo()

References



stamats/MKpower documentation built on April 10, 2024, 3:34 p.m.