findn

R Documentation

Find the Sample Size for a trial based on repeated simulation using a model-based approach

Description

findn estimates the sample size to achieve a pre-defined power, when the power can only be evaluated using simulations. findn uses a model-based approach for this purpose.

Usage

findn(
  fun,
  targ,
  start,
  k = 25,
  init_evals = 100,
  r = 4,
  stop = c("evals", "power_ci", "abs_unc", "rel_unc"),
  max_evals = 2000,
  level = 0.05,
  power_ci_tol = 0.02,
  abs_unc_tol = 10,
  rel_unc_tol = 0.1,
  var_a = 1,
  var_b = 1,
  alpha = 0.05,
  alternative = c("two.sided", "one.sided"),
  min_x = 2,
  verbose = FALSE,
  ...
)

Arguments

`fun`	A function that estimates the power of a trial. The function has to take at least two arguments: `n`, the sample size, and `k`, the number of iterations.
`targ`	The target power.
`start`	An initial guess for the sample size.
`k`	Number of trial simulations to use in `fun` to estimate the power.
`init_evals`	How many evaluations the first model is based on.
`r`	A multiplier for the range of the initial design points.
`stop`	The stopping criterion. One of `"evals"`, `"power_ci"`, `"abs_unc"`, `"rel_unc"`.
`max_evals`	The maximum number of evaluations of `fun`.
`level`	Significance level for the confidence intervals if `stop` is something other than `"evals"`. Also used to determine the levels for the confidence intervals that are printed if `verbose = TRUE`.
`power_ci_tol`	Tolerance parameter if `stop = "power_ci"`.
`abs_unc_tol`	Tolerance parameter if `stop = "abs_unc"`.
`rel_unc_tol`	Tolerance parameter if `stop = "rel_unc"`.
`var_a`	Variance of the prior distribution for the intercept.
`var_b`	Variance of the prior distribution for the slope.
`alpha`	The significance level of the underlying test. This is used to compute the mean of the prior distribution of the intercept.
`alternative`	Either `"two.sided"` or `"one.sided"`. This is only used to determine the mean of the intercept prior.
`min_x`	The minimum sample size that `fun` can be evaluated for.
`verbose`	If `TRUE`, the current sample size estimate, the predicted power, and its confidence interval (with significance level `level`) are printed after every iteration.
`...`	Further optional arguments.

Details

findn estimates the sample size for a target function that returns a simulated power value for a test or a trial. The target function must have at least two arguments: n, the sample size for which the trial is simulated, and k, which specifies how often the trial is simulated. Note that, depending on how fun is written, n can be either the sample size per group or the total sample size. The function has to return an estimate of the power of the trial for the sample size n based on k Monte Carlo simulations.
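
As a minimal sketch of the contract described above, the following hypothetical function ztest_power (not part of findn) simulates a one-sided one-sample z-test with known standard deviation and returns the proportion of rejections among k simulated trials. Only the arguments n and k are required; mu, sigma and alpha are illustrative assumptions.

```r
# Hypothetical power function for a one-sided one-sample z-test.
# Only n and k are required by the contract; mu, sigma and alpha
# are illustrative extras with default values.
ztest_power <- function(n, k, mu = 0.5, sigma = 1, alpha = 0.025) {
  n <- ceiling(n)                      # in case n is not an integer
  # One column per simulated trial, n observations each
  x <- matrix(rnorm(n * k, mean = mu, sd = sigma), nrow = n, ncol = k)
  z <- colMeans(x) * sqrt(n) / sigma   # z statistic of each trial
  mean(z > qnorm(1 - alpha))           # proportion of rejections
}

set.seed(1)
p_hat <- ztest_power(n = 32, k = 1000)
p_hat
```

Because the return value is a Monte Carlo estimate, repeated calls with the same n give slightly different results; a larger k reduces that noise at the cost of more computation per call.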

findn uses an algorithm that assumes a probit model and computes Bayesian parameter estimates. The mean of the prior distribution of the intercept is computed from the significance level alpha of the underlying test and the alternative. The mean of the prior distribution of the slope is computed from start, the initial guess for the sample size. The variances of the prior distributions can be adjusted using the arguments var_a and var_b.
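
The exact form of the probit model is not spelled out here. One way to see why a probit curve with an intercept tied to alpha is a natural choice: for a one-sided one-sample z-test with standardized effect delta, the textbook power formula is pnorm(qnorm(alpha) + delta * sqrt(n)). The sketch below only illustrates that formula; the sqrt(n) scale and the names delta, power_z and n_star are assumptions for illustration and may differ from what findn does internally.

```r
alpha <- 0.025   # one-sided significance level
delta <- 0.5     # assumed standardized effect size (illustrative)
targ  <- 0.8     # target power

# Exact power of the one-sided z-test: a probit curve in sqrt(n)
# with intercept qnorm(alpha) and slope delta.
power_z <- function(n) pnorm(qnorm(alpha) + delta * sqrt(n))

# Solving power_z(n) = targ for n gives a closed-form sample size.
n_star <- ((qnorm(targ) - qnorm(alpha)) / delta)^2

power_z(0)        # equals alpha: without data, power is the type I error rate
power_z(n_star)   # equals targ
```

In this idealized case the sample size is available in closed form. Simulation-based search is needed when no such formula exists and the power can only be estimated with noise.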

There are four different stopping criteria. When stop = "evals", the algorithm stops once the target function has been evaluated max_evals times. When stop = "power_ci", the algorithm stops when the confidence interval (with significance level level) of the predicted power at the current sample size estimate lies within the interval targ plus and minus power_ci_tol. When stop = "abs_unc", the algorithm stops when the number of sample sizes in the uncertainty set is smaller than abs_unc_tol. The uncertainty set is defined as the set of all sample sizes for which the confidence interval for the predicted power contains targ. When stop = "rel_unc", the algorithm stops when the relative uncertainty range is smaller than rel_unc_tol. The relative uncertainty range is defined as the greatest integer in the uncertainty set minus the smallest integer in the uncertainty set, divided by the smallest integer in the uncertainty set. If stop is "power_ci", "abs_unc" or "rel_unc" and the stopping criterion cannot be satisfied within max_evals evaluations, the algorithm stops as well.
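
The definitions above can be illustrated with a small sketch. The predicted power curve and its standard error below are made up for illustration (they are assumptions, not output of findn), and the variable names unc_set, abs_unc, rel_unc and power_ci_ok are hypothetical.

```r
targ  <- 0.8
level <- 0.05

# Hypothetical predictions on a grid of sample sizes; the curve and
# the constant standard error are assumptions for illustration only.
n     <- 20:50
eta   <- qnorm(0.025) + 0.5 * sqrt(n)   # assumed probit-scale prediction
se    <- 0.05                           # assumed standard error
z     <- qnorm(1 - level / 2)
lower <- pnorm(eta - z * se)            # lower confidence limit for power
upper <- pnorm(eta + z * se)            # upper confidence limit for power

# Uncertainty set: sample sizes whose interval contains targ
unc_set <- n[lower <= targ & upper >= targ]

# Quantities compared with abs_unc_tol and rel_unc_tol
abs_unc <- length(unc_set)
rel_unc <- (max(unc_set) - min(unc_set)) / min(unc_set)

# "power_ci" check at a current estimate, taken here as the n whose
# predicted power is closest to targ
power_ci_tol <- 0.02
i <- which.min(abs(pnorm(eta) - targ))
power_ci_ok <- lower[i] >= targ - power_ci_tol &&
  upper[i] <= targ + power_ci_tol
```

The three criteria measure different things: "power_ci" looks at the interval width at a single sample size, while "abs_unc" and "rel_unc" look at how many sample sizes remain plausible, in absolute terms or relative to their magnitude.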

Value

findn returns an object of class findn. By default, printing it shows a list containing the point estimate for the sample size, the minimum sufficient sample size (i.e. the smallest sample size for which the lower limit of the confidence interval for the estimated power is larger than the target power), and a message indicating whether the stopping criterion was reached. See print.findn for details.
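
To make the definition of the minimum sufficient sample size concrete, here is a short sketch on made-up lower confidence limits. The power curve, the standard error and the name min_suff are assumptions for illustration, not values produced by findn.

```r
targ  <- 0.8
level <- 0.05

# Hypothetical lower confidence limits for the predicted power on a
# grid of sample sizes (assumed curve and standard error).
n     <- 20:50
eta   <- qnorm(0.025) + 0.5 * sqrt(n)
lower <- pnorm(eta - qnorm(1 - level / 2) * 0.05)

# Smallest sample size whose lower limit exceeds the target power
min_suff <- min(n[lower > targ])
min_suff
```

By construction this value is at least as large as a point estimate based on the predicted power alone, since it also accounts for the uncertainty of the prediction.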

Examples

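# Function that simulates the outcomes of a two-sample t-test.
# n is the sample size per group, k the number of simulated trials.
ttest <- function(n, k, mu1 = 0, mu2 = 1, sd = 2) {
  # One column per simulated trial
  sample1 <- matrix(rnorm(n = ceiling(n) * k, mean = mu1, sd = sd),
    ncol = k)
  mean1 <- apply(sample1, 2, mean)
  # apply() resolves `sd` to the function stats::sd here, even though
  # the argument sd holds a number
  sd1_hat <- apply(sample1, 2, sd)
  sample2 <- matrix(rnorm(n = ceiling(n) * k, mean = mu2, sd = sd),
    ncol = k)
  mean2 <- apply(sample2, 2, mean)
  sd2_hat <- apply(sample2, 2, sd)
  # Pooled standard deviation for equal group sizes
  sd_hat <- sqrt((sd1_hat^2 + sd2_hat^2) / 2)
  teststatistic <- (mean1 - mean2) / (sd_hat * sqrt(2 / n))
  # Critical value of a one-sided test at level 0.025
  crit <- qt(1 - 0.025, 2 * n - 2)
  # Proportion of simulated trials that reject the null hypothesis
  return(mean(teststatistic < -crit))
}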

findn(fun = ttest, targ = 0.8, k = 25, start = 100, 
  init_evals = 100, r = 4, stop = "evals", max_evals = 2000, 
  level = 0.05, var_a = 1, var_b = 0.1, alpha = 0.025, 
  alternative = "one.sided", verbose = FALSE)

findn documentation built on March 30, 2026, 9:07 a.m.