powersim: Provide power estimates for multivariate abundance models

View source: R/s3_methods.R

powersim.cordR Documentation

Provide power estimates for multivariate abundance models

Description

powersim returns a power estimate for a cord object for a given sample size N and effect size of interest.

Usage

## S3 method for class 'cord'
powersim(
  object,
  coeffs,
  term,
  N = nrow(object$obj$data),
  coeffs0 = effect_null(object$obj, term),
  nsim = 1000,
  ncrit = 999,
  test = "score",
  alpha = 0.05,
  newdata = NULL,
  ncores = detectCores() - 1,
  show.time = TRUE,
  long_power = FALSE,
  n.samp = 10,
  nlv = 2
)

powersim(
  object,
  coeffs,
  term,
  N = nrow(object$obj$data),
  coeffs0 = effect_null(object$obj, term),
  nsim = 999,
  ncrit = nsim,
  test = "score",
  alpha = 0.05,
  newdata = NULL,
  ncores = detectCores() - 1,
  show.time = TRUE,
  long_power = FALSE,
  n.samp = 10,
  nlv = 2
)

Arguments

object

objects of class cord, typically the result of a call to cord.

coeffs

Coefficient matrix for a manyglm object that characterises the size of effects to be simulated. See effect_alt for help in producing this matrix.

term

Name of predictor of interest in quotes.

N

Number of samples for power estimate. Defaults to the number of observations in the original sample.

coeffs0

Coefficient matrix under the null hypothesis. Defaults to being specified by effect_null.

nsim

Number of simulated test statistics under the specified effect size (coeffs) to estimate power. Defaults to 999.

ncrit

Number of simulated test statistics under the null effect to estimate the critical value. Defaults to 999.

test

Test statistic for power estimate to based upon. Defaults to "score", however "wald" is also allowed.

alpha

Type I error rate for power estimate, defaults to 0.05.

newdata

Data frame of the same size as the original data frame from the cord object (object$obj$data), that specifies a different design of interest.

ncores

Number of cores for parallel computing. Defaults to the total number of cores available on the machine minus 1.

show.time

Logical. Displays time elapsed. Defaults to TRUE.

long_power

Logical. Whether to estimate power using separate critical test statistics for each nsim test statistics simulated under the alternative hypothesis. Note that although this will give a more accurate estimate of power, it will take a considerably large amount of time. First try increasing ncrit. Defaults to FALSE.

n.samp

integer, number of sets of residuals for importance sampling for the copula model with cord. Defaults to 10, recommend setting this higher for smaller sample sizes N.

nlv

number of latent variables (default = 2) for the copula model with cord, recommend setting this lower for smaller sample sizes N.

Details

powersim takes a cord object, sample size N and coefficient matrix coeffs which specifies an effect size of interest and returns a power estimate.

The power estimate is obtained by first parsing the cord object into extend, nsim times with an effect size specified by coeffs. Next, the cord object is parsed into extend an additional ncrit times with a null effect, which is defined by default by effect_null. This effectively simulates nsim + ncrit manyglm models under both the null and alternative hypothesis.

For each simulated manyglm object, a test statistic test is obtained. A critical test statistic is then obtained as the upper 1 - alpha quantile of simulated test statistics under the null hypothesis. Power is then estimated as the proportion of times the test statistics simulated under the alternative hypothesis exceed the critical test statistic under the null.

To improve computation time, simulations are computed in parallel using the "socket" approach, which by default uses all available cores minus 1 for clustering. Using 1 less than the number of available cores for your machine (detectCores()-1) is recommended to better avoid errors relating to clustering or nodes.

Value

Power estimate result, and;

power

power.

Functions

  • powersim(): Provide power estimates for multivariate abundance models

Author(s)

Ben Maslen <b.maslen@unsw.edu.au>.

References

Maslen, B., Popovic, G., Lim, M., Marzinelli, E., & Warton, D. (2023). How many sites? Methods to assist design decisions when collecting multivariate data in ecology. Methods in Ecology and Evolution, 14(6), 1564-1573. Popovic, G. C., Hui, F. K., & Warton, D. I. (2018). A general algorithm for covariance modeling of discrete data. Journal of Multivariate Analysis, 165, 86-100.

See Also

effect_alt, effect_null, extend

Examples


library(ecoCopula)
library(mvabund)
data(spider)
spiddat = mvabund(spider$abund)
X = data.frame(spider$x)

# Specify increasers and decreasers
increasers = c("Alopacce", "Arctlute", "Arctperi", "Pardnigr", "Pardpull")
decreasers = c("Alopcune", "Alopfabr", "Zoraspin")

# Find power for continuous predictor at effect_size=1.5
fit.glm = manyglm(spiddat~bare.sand, family="negative.binomial", data=X)
effect_mat = effect_alt(fit.glm, effect_size=1.5,
       increasers, decreasers, term="bare.sand")
fit.cord = cord(fit.glm)
powersim(fit.cord, coeffs=effect_mat, term="bare.sand", nsim=99, ncrit=99, ncores=2)

# Find power for categorical predictor with 4 levels at effect_size=1.5
X$Treatment = rep(c("A","B","C","D"),each=7)
fit_factors.glm = manyglm(spiddat~Treatment, family="negative.binomial", data=X)
effect_mat = effect_alt(fit_factors.glm, effect_size=1.5,
       increasers, decreasers, term="Treatment")
fit_factors.cord = cord(fit_factors.glm)
powersim(fit_factors.cord, coeffs=effect_mat, term="Treatment", nsim=99, ncrit=99, ncores=2)

# Change effect size parameterisation
effect_mat = effect_alt(fit_factors.glm, effect_size=1.5,
                         increasers, decreasers, term="Treatment",
                         K=c(3,1,2))
powersim(fit_factors.cord, coeffs=effect_mat, term="Treatment", nsim=99, ncrit=99, ncores=2)


ecopower documentation built on Aug. 24, 2023, 5:09 p.m.