findn: Find the Sample Size for a trial based on repeated simulation...

View source: R/findn.R

findn R Documentation

Find the Sample Size for a trial based on repeated simulation using a model-based approach

Description

findn estimates the sample size to achieve a pre-defined power, when the power can only be evaluated using simulations. findn uses a model-based approach for this purpose.

Usage

findn(
  fun,
  targ,
  start,
  k = 25,
  init_evals = 100,
  r = 4,
  stop = c("evals", "power_ci", "abs_unc", "rel_unc"),
  max_evals = 2000,
  level = 0.05,
  power_ci_tol = 0.02,
  abs_unc_tol = 10,
  rel_unc_tol = 0.1,
  var_a = 1,
  var_b = 1,
  alpha = 0.05,
  alternative = c("two.sided", "one.sided"),
  min_x = 2,
  verbose = FALSE,
  ...
)

Arguments

fun

A function that estimates the power of a trial. The function has to take at least two arguments: n, the sample size and k, the number of iterations.

targ

The target power.

start

An initial guess for the sample size.

k

Number of trial simulations to use in fun to estimate the power.

init_evals

How many evaluations the first model is based on.

r

A multiplier for the range of the initial design points.

stop

The stopping criterion. One of "evals", "power_ci", "abs_unc", "rel_unc".

max_evals

The maximum number of evaluations of fun.

level

Significance level for the confidence intervals if stop is something other than "evals". Also used to determine the levels for the confidence intervals that are printed if verbose = TRUE.

power_ci_tol

Tolerance parameter if stop = "power_ci".

abs_unc_tol

Tolerance parameter if stop = "abs_unc".

rel_unc_tol

Tolerance parameter if stop = "rel_unc".

var_a

Variance of the prior distribution for the intercept.

var_b

Variance of the prior distribution for the slope.

alpha

The significance level of the underlying test. This is used to compute the mean of the prior distribution of the intercept.

alternative

Either "two.sided" or "one.sided". This is only used to determine the mean of the intercept prior.

min_x

The minimum sample size that fun can be evaluated for.

verbose

If TRUE, the current sample size estimate, the predicted power and its level percent confidence interval are printed after every iteration.

...

Further optional arguments.

Details

findn estimates the sample size for a target function that returns a simulated power value for a test or a trial. The target function must have at least two arguments: n, the sample size for which the trial is simulated, and k, which specifies how often the trial is simulated. Depending on how fun is written, n can be either the sample size per group or the total sample size. The function has to return an estimate of the power of the trial for the sample size n based on k Monte Carlo simulations.
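As an illustration of this contract, the following is a minimal sketch of a target function. The name ztest_power and its arguments delta, sd and alpha are hypothetical and not part of findn; here n is taken to be the sample size per group.

```r
# Hypothetical target function: simulated power of a one-sided, two-sample
# z-test with known standard deviation. n is the sample size PER GROUP.
ztest_power <- function(n, k, delta = 0.5, sd = 1, alpha = 0.025) {
  n <- ceiling(n)
  # Standard error of the difference between the two group means
  se <- sd * sqrt(2 / n)
  # Simulate the k observed mean differences directly
  diff_hat <- rnorm(k, mean = delta, sd = se)
  z <- diff_hat / se
  # Proportion of the k simulated trials that reject the null hypothesis
  mean(z > qnorm(1 - alpha))
}

set.seed(1)
p_hat <- ztest_power(n = 64, k = 1000)
p_hat
```

A function of this shape could be passed as fun; any extra arguments would have to keep their defaults or be supplied through whatever mechanism fun itself provides.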

findn uses an algorithm that assumes a probit model and computes Bayesian parameter estimates. The mean of the prior distribution of the intercept is computed from the significance level alpha of the underlying test and from alternative. The mean of the prior distribution of the slope is computed from start, the initial guess for the sample size. The variances of the prior distributions can be adjusted using the arguments var_a and var_b.
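The text does not spell out the linear predictor or the exact formulas for the prior means. One plausible reading, sketched below purely as an assumption, is a curve of the form pnorm(a + b * sqrt(n)): the intercept is chosen so that the power approaches the (one-tailed) significance level as n goes to zero, and the slope so that the curve passes through targ at n = start. The helper prior_means is hypothetical and may differ from what findn actually does.

```r
# ASSUMPTION (not confirmed by the documentation):
#   power(n) = pnorm(a + b * sqrt(n))
# Hypothetical helper that derives prior means for a and b under this reading.
prior_means <- function(targ, start, alpha = 0.05,
                        alternative = c("two.sided", "one.sided")) {
  alternative <- match.arg(alternative)
  # For a two-sided test only the relevant tail is considered (assumption)
  alpha_tail <- if (alternative == "two.sided") alpha / 2 else alpha
  a0 <- qnorm(alpha_tail)                  # power at n -> 0
  b0 <- (qnorm(targ) - a0) / sqrt(start)   # power(start) equals targ
  c(intercept = a0, slope = b0)
}

pm <- prior_means(targ = 0.8, start = 100, alpha = 0.025,
                  alternative = "one.sided")
# Power implied by the prior means at the initial guess
power_at_start <- pnorm(pm[["intercept"]] + pm[["slope"]] * sqrt(100))
```

Under this reading, a poor value of start mainly shifts the prior mean of the slope, and a larger var_b would let the simulated data override it more quickly.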

There are four different stopping criteria. When stop = "evals", the algorithm stops when the target function has been evaluated max_evals times. When stop = "power_ci", the algorithm stops when the level percent confidence interval of the predicted power at the current sample size estimate lies within the interval from targ minus power_ci_tol to targ plus power_ci_tol. When stop = "abs_unc", the algorithm stops when the number of sample sizes in the uncertainty set is smaller than abs_unc_tol. The uncertainty set is defined as the set of all sample sizes for which the level percent confidence interval for the predicted power contains targ. When stop = "rel_unc", the algorithm stops when the relative uncertainty range is smaller than rel_unc_tol. The relative uncertainty range is defined as the greatest integer in the uncertainty set minus the smallest integer in the uncertainty set, divided by the smallest integer in the uncertainty set. If stop is "power_ci", "abs_unc" or "rel_unc" and the stopping criterion cannot be satisfied within max_evals evaluations, the algorithm also stops.
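The three tolerance-based criteria can be sketched as follows. The helper check_stop and the toy power curve are hypothetical and only mirror the definitions given above; they are not taken from the findn source, and the confidence limits are assumed to be available as vectors over a grid of candidate sample sizes.

```r
# Hypothetical helper, not part of findn: evaluates the three tolerance-based
# stopping rules from confidence limits for the predicted power.
check_stop <- function(n, lower, upper, n_hat, targ,
                       power_ci_tol = 0.02, abs_unc_tol = 10,
                       rel_unc_tol = 0.1) {
  # Uncertainty set: sample sizes whose confidence interval contains targ
  unc_set <- n[lower <= targ & targ <= upper]
  if (length(unc_set) == 0) stop("uncertainty set is empty")
  i <- match(n_hat, n)  # position of the current sample size estimate
  c(
    power_ci = lower[i] >= targ - power_ci_tol &&
      upper[i] <= targ + power_ci_tol,
    abs_unc = length(unc_set) < abs_unc_tol,
    rel_unc = (max(unc_set) - min(unc_set)) / min(unc_set) < rel_unc_tol
  )
}

# Toy inputs (assumed): a smooth power curve with a constant CI half-width
n <- 50:150
pred <- pnorm(-1.96 + 0.28 * sqrt(n))
n_hat <- n[which(pred >= 0.8)[1]]

narrow <- check_stop(n, pred - 0.01, pred + 0.01, n_hat, targ = 0.8)
wide <- check_stop(n, pred - 0.05, pred + 0.05, n_hat, targ = 0.8)
```

With the narrow intervals all three rules are met, while with the wide intervals none is, which is the situation in which more evaluations of the target function would be needed.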

Value

findn returns an object of class findn. By default, a list is printed that contains the point estimate for the sample size, the minimum sufficient sample size (i.e. the smallest sample size for which the lower limit of the confidence interval for the estimated power is larger than the target power) and a message indicating whether the stopping criterion was reached. See print.findn for details.
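The difference between the point estimate and the minimum sufficient sample size can be illustrated with a small sketch. The helper min_sufficient_n, the toy power curve and the constant interval half-width are hypothetical; treating the point estimate as the smallest sample size whose predicted power reaches the target is likewise an assumption made only for this example.

```r
# Hypothetical helper, not part of findn: smallest sample size whose lower
# confidence limit for the predicted power exceeds the target power.
min_sufficient_n <- function(n, lower, targ) {
  ok <- lower > targ
  if (!any(ok)) return(NA_integer_)  # no candidate is sufficient
  min(n[ok])
}

# Toy inputs (assumed): predicted power and lower confidence limits on a grid
n <- 50:150
pred <- pnorm(-1.96 + 0.28 * sqrt(n))
lower <- pred - 0.01

point_est <- n[which(pred >= 0.8)[1]]           # assumed point estimate
n_suff <- min_sufficient_n(n, lower, targ = 0.8)
c(point_estimate = point_est, min_sufficient = n_suff)
```

Because the lower confidence limit lies below the predicted power, the minimum sufficient sample size is never smaller than the point estimate in this setup.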

Examples

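# Function that simulates the outcomes of a two-sample t-test
# n: sample size per group, k: number of simulated trials
ttest <- function(n, k, mu1 = 0, mu2 = 1, sd = 2) {
  # One column per simulated trial
  sample1 <- matrix(rnorm(n = ceiling(n) * k, mean = mu1, sd = sd),
    ncol = k)
  mean1 <- apply(sample1, 2, mean)
  # sd is also the numeric argument above; apply() looks up a function
  # of that name via match.fun(), so stats::sd is used here
  sd1_hat <- apply(sample1, 2, sd)
  sample2 <- matrix(rnorm(n = ceiling(n) * k, mean = mu2, sd = sd),
    ncol = k)
  mean2 <- apply(sample2, 2, mean)
  sd2_hat <- apply(sample2, 2, sd)
  # Pooled standard deviation for equal group sizes
  sd_hat <- sqrt((sd1_hat^2 + sd2_hat^2) / 2)
  teststatistic <- (mean1 - mean2) / (sd_hat * sqrt(2 / n))
  # One-sided test at level 0.025
  crit <- qt(1 - 0.025, 2 * n - 2)
  # Estimated power: proportion of simulated trials that reject
  return(mean(teststatistic < -crit))
}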

findn(fun = ttest, targ = 0.8, k = 25, start = 100, 
  init_evals = 100, r = 4, stop = "evals", max_evals = 2000, 
  level = 0.05, var_a = 1, var_b = 0.1, alpha = 0.025, 
  alternative = "one.sided", verbose = FALSE)

findn documentation built on March 30, 2026, 9:07 a.m.