calculate_power: calculate_power
In gladstone-institutes/CalcPower: Power Calculation for biological experiments

View source: R/calculate_power.R

R Documentation

calculate_power

Description

This function uses simulation to perform power analysis. It is designed to explore the power of biological experiments and to suggest an optimal number of experimental variables with reasonable power. The backbone of the function is the simr package, which fits a fixed-effect or mixed-effect model to the observed data and simulates response variables from it. Users can test the power of different combinations of experimental variables and parameters.
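
To make the idea of simulation-based power concrete, here is a minimal sketch in base R. It is not the simr or CalcPower implementation: it assumes a simple two-group design with a normally distributed response, and the function name `simulate_power` and all of its parameters are hypothetical. Power is estimated as the fraction of simulated data sets in which the condition effect is significant.

```r
# Simulation-based power for a two-group comparison, base R only.
# Illustrates the general idea; NOT the simr or CalcPower implementation.
simulate_power <- function(n_per_group, effect_size, sd = 1,
                           nsim = 500, alpha = 0.05) {
  condition <- factor(rep(c("control", "case"), each = n_per_group),
                      levels = c("control", "case"))
  p_values <- replicate(nsim, {
    # Simulate a response with the assumed effect in the "case" group.
    response <- rnorm(2 * n_per_group,
                      mean = ifelse(condition == "case", effect_size, 0),
                      sd = sd)
    fit <- lm(response ~ condition)
    # p-value of the condition coefficient (row 2, column "Pr(>|t|)").
    summary(fit)$coefficients[2, 4]
  })
  # Power = fraction of simulations that reject the null at level alpha.
  mean(p_values < alpha)
}

set.seed(1)
sizes <- c(5, 10, 20, 40)
power_by_n <- sapply(sizes, simulate_power, effect_size = 0.8)
data.frame(n_per_group = sizes, power = power_by_n)
```

The resulting table is a small power curve: estimated power for each candidate sample size under the assumed effect size.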

Note: The current version does not accept categorical response variables or sample size parameters smaller than the observed sample size.

Usage

calculate_power(
  data,
  condition_column,
  experimental_columns,
  response_column,
  target_columns,
  power_curve,
  condition_is_categorical,
  repeatable_columns = NA,
  response_is_categorical = FALSE,
  nsimn = 1000,
  family = NULL,
  levels = NULL,
  max_size = NULL,
  breaks = NULL,
  effect_size = NULL,
  ICC = NULL,
  output = NULL
)

Arguments

`data`	Input data
`condition_column`	Name of the condition variable (e.g., a variable with values such as control/case). The input data must have a column with this name.
`experimental_columns`	Names of variables related to the experimental design, such as "experiment", "plate", and "cell_line". "experiment" should always come first.
`response_column`	Name of the variable observed by performing the experiment, e.g., intensity.
`target_columns`	Name of the experimental parameters to use for the power calculation.
`power_curve`	1: Power simulation over a range of sample sizes or levels. 0: Power calculation over a single sample size or a level.
`condition_is_categorical`	Specify whether the condition variable is categorical. TRUE: Categorical, FALSE: Continuous.
`repeatable_columns`	Names of experimental variables that may appear repeatedly with the same ID. For example, cell_line C1 may appear in multiple experiments, but plate P1 cannot appear in more than one experiment.
`response_is_categorical`	Specify whether the observed variable is categorical. TRUE: Categorical, FALSE: Continuous (default).
`family`	The type of distribution family to specify when the response is categorical. If family is "binary", then binary(link="log") is used; if family is "poisson", then poisson(link="logit") is used; if family is "poisson_log", then poisson(link="log") is used.
`levels`	1: Amplify the number of levels of the corresponding target parameter. 0: Amplify the number of samples from the corresponding target parameter. For example, if target_columns = c("experiment","cell_line") and you want to expand the number of experiments and sample more cells from each cell line, use levels = c(1,0).
`max_size`	Maximum levels or sample sizes to test. Default: the current level or the current sample size x 5. For example, if max_size = c(10,5), it will test up to 10 experiments and 5 cell lines.
`breaks`	Levels or sample sizes of the variable to be specified along the power curve. Default: max(1, round(number of current levels / 5)).
`effect_size`	If you know the effect size of your condition variable, provide it. If the effect size is not provided, it will be estimated from your data.
`ICC`	Intraclass correlation coefficients (ICC) for each parameter
`output`	Output file name
`nsim`	The number of simulations to run. Default: 1000. Note that the Usage signature above spells this argument `nsimn`.
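
The arguments above can be put together as in the following sketch. The toy data set, its column names, and its values are invented for illustration, and the argument values are only one plausible combination. The argument names are taken from the Usage section. Because the call requires the CalcPower package and runs simulations, it is left opt-in behind a hypothetical `run_call` flag.

```r
# Hypothetical toy data; column names and values are made up for illustration.
set.seed(42)
toy <- expand.grid(
  cell = 1:5,
  cell_line = c("C1", "C2", "C3"),
  experiment = c("E1", "E2"),
  condition = c("control", "case"),
  stringsAsFactors = FALSE
)
toy$intensity <- rnorm(nrow(toy), mean = ifelse(toy$condition == "case", 1, 0))

# Arguments named as in the Usage section.
args <- list(
  data = toy,
  condition_column = "condition",
  experimental_columns = c("experiment", "cell_line"),  # "experiment" first
  response_column = "intensity",
  target_columns = c("experiment", "cell_line"),
  power_curve = 1,                  # simulate over a range of levels
  condition_is_categorical = TRUE,
  repeatable_columns = "cell_line", # cell lines recur across experiments
  levels = c(1, 1),                 # expand the levels of both targets
  max_size = c(10, 5)               # up to 10 experiments and 5 cell lines
)

# Default step documented for `breaks`: max(1, round(current levels / 5)).
n_experiments <- length(unique(toy$experiment))
default_break <- max(1, round(n_experiments / 5))

# Opt-in call: requires CalcPower to be installed and may take a while.
run_call <- FALSE
if (run_call && requireNamespace("CalcPower", quietly = TRUE)) {
  result <- do.call(CalcPower::calculate_power, args)
}
```

With only two experiments in the toy data, the documented default for `breaks` evaluates to 1, so the curve would step through the number of experiments one level at a time.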