A Practical Guide to Statistical Power and Sample Size Calculations in R"

knitr::opts_chunk$set(echo = TRUE)
library(pwrss)

:::: {style="display: flex;"}

::: {} Install and load pwrss R package:

install.packages("pwrss")
library(pwrss)


{width=65%} :::

::: {}

 

:::

::: {} Alternatively, use the web app below:

:::

::::

If you find the package and related material useful please cite as:

Bulus, M. (2023). pwrss: Statistical Power and Sample Size Calculation Tools. R package version 0.3.1. [https://CRAN.R-project.org/package=pwrss](https://CRAN.R-project.org/package=pwrss) Bulus, M., & Polat, C. (in press). pwrss R paketi ile istatistiksel guc analizi [Statistical power analysis with pwrss R package]. Ahi Evran Universitesi Kirsehir Egitim Fakultesi Dergisi. [https://osf.io/ua5fc/download/](https://osf.io/ua5fc/download/)

Acknowledgements

I would like to thank Roland Thijs and Fred Oswald, who caught some typos in the earlier version of this vignette.

Please send any bug reports, feedback, or questions to bulusmetin [at] gmail.com



Generic Functions

These generic functions calculate and return statistical power along with optional Type I and Type II error plots (as long as the test statistic and degrees of freedom is known). These functions are handy because z, t, $\chi^{2}$, and F statistics and associated degrees of freedom can be readily found in reports, published articles, or standard software output (such as SAS, Stata, SPSS, Jamovi, etc.). Statistical power can be calculated via putting the test statistics for ncp in the functions below; however, do not overinterpret post-hoc power rates.

t Test

power.t.test(ncp = 1.96, df = 99, alpha = 0.05,
             alternative = "equivalent", plot = TRUE)

z Test

power.z.test(ncp = 1.96, alpha = 0.05, 
             alternative = "not equal", plot = TRUE)

$\chi^2$ Test

power.chisq.test(ncp = 15, df = 20,
                 alpha = 0.05, plot = TRUE)

F Test

power.f.test(ncp = 3, df1 = 2, df2 = 98,
             alpha = 0.05, plot = TRUE)

Multiple Parameters

Multiple parameters are allowed but plots should be turned off (plot = FALSE).

power.t.test(ncp = c(0.50, 1.00, 1.50, 2.00, 2.50), plot = FALSE,
             df = 99, alpha = 0.05, alternative = "not equal")

power.z.test(alpha = c(0.001, 0.010, 0.025, 0.050), plot = FALSE,
             ncp = 1.96, alternative = "greater")

power.chisq.test(df = c(80, 90, 100, 120, 150, 200), plot = FALSE, 
                 ncp = 2, alpha = 0.05)

Type I and Type II Error Plots

plot() function (S3 method) is a wrapper around the generic functions above. Assign results of any pwrss function to an R object and pass it to plot() function.

# comparing two means
design1 <- pwrss.t.2means(mu1 = 0.20, margin = -0.05, paired = TRUE,
                          power = 0.80, alpha = 0.05,
                          alternative = "non-inferior")
plot(design1)

# ANCOVA design
design2 <- pwrss.f.ancova(eta2 = 0.10, n.levels = c(2,3),
                          power = .80, alpha = 0.05)
plot(design2)

# indirect effect in mediation analysis
design3 <- pwrss.z.med(a = 0.10, b = 0.20, cp = 0.10,
                       power = .80, alpha = 0.05)
plot(design3)



Mean Difference (t Tests)

Parametric Tests

Independent Samples t Test

More often than not, unstandardized means and standard deviations are reported in publications for descriptive purposes. Another reason is that they are more intuitive and interpretable (e.g. depression scale). Assume that for the first and second groups expected means are 30 and 28, and expected standard deviations are 12 and 8, respectively.

Calculate Statistical Power

What is the statistical power given that the sample size for the second group is 50 (n2 = 50) and groups have equal sample sizes (kappa = n1 / n2 = 1)?

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1, 
               n2 = 50, alpha = 0.05,
               alternative = "not equal")

Calculate Minimum Required Sample Size

What is the minimum required sample size given that groups have equal sample sizes (kappa = 1)? ($\kappa = n_1 / n_2$)

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1, 
               power = .80, alpha = 0.05,
               alternative = "not equal")

It is sufficient to put pooled standard deviation for sd1 because sd2 = sd1 by default. In this case, for a pooled standard deviation of 10.198 the minimum required sample size can be calculated as

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 10.198, kappa = 1,
               power = .80, alpha = 0.05,
               alternative = "not equal")

It is sufficient to put Cohen's d or Hedge's g (standardized difference between two groups) for mu1 because mu2 = 0, sd1 = 1, and sd2 = sd1 by default. For example, for an effect size as small as 0.196 (based on previous example) the minimum required sample size can be calculated as

pwrss.t.2means(mu1 = 0.196, kappa = 1,
               power = .80, alpha = 0.05, 
               alternative = "not equal")

Paired Samples t Test

Assume for the first (e.g. pretest) and second (e.g. posttest) time points expected means are 30 and 28 (a reduction of 2 points), and expected standard deviations are 12 and 8, respectively. Also assume a correlation of 0.50 between first and second measurements (by default paired.r = 0.50). What is the minimum required sample size?

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               paired = TRUE, paired.r = 0.50,
               power = .80, alpha = 0.05,
               alternative = "not equal")

It is sufficient to put standard deviation of the difference for sd1 because sd2 = sd1 by default. In this case, for a standard deviation of difference of 10.583 the minimum required sample size can be calculated as

pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 10.583, 
               paired = TRUE, paired.r = 0.50,
               power = .80, alpha = 0.05,
               alternative = "not equal")

It is sufficient to put Cohen's d or Hedge's g (standardized difference between two time points) for mu1 because mu2 = 0, sd1 = sqrt(1/(2*(1-paired.r))), and sd2 = sd1 by default. For example, for an effect size as small as 0.1883 (based on previous example) the minimum required sample size can be calculated as

pwrss.t.2means(mu1 = 0.1883, paired = TRUE, paired.r = 0.50,
               power = .80, alpha = 0.05, 
               alternative = "not equal")

Non-parametric Tests

The variable of interest for which means are compared may not follow normal distribution. One reason is that the sample size could be small. Another reason is that the parent distribution (distribution in the population) may not be normal (e.g., could be uniform, exponential, double exponential, or logistic). In these cases t test could produce biased results.

For non-parametric tests use pwrss.np.2groups() function instead of pwrss.t.2means(). All arguments are the same. Additionally, however, the parent distribution can be specified as "normal", "uniform", "exponential", "double exponential", or "logistic". In the following examples the parent distribution is "normal" (not shown because it is the default) and can be changed with the dist or distribution argument (e.g. dist = "uniform").

Although means and standard deviations are some of the arguments in the function below, what is actually being tested is the difference between mean ranks. First, the mean difference is converted into Cohen's d, and then into probability of superiority, which is the probability of an observation in group 1 being higher than an observation in group 2. This parameterization, expressed as means and standard deviations, helps in making comparisons and switching back and forth between parametric and non-parametric tests.

Independent Samples (Wilcoxon-Mann-Whitney Test)

The example below uses the same parameters as the example in the independent t test section.

pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1, 
               power = .80, alpha = 0.05,
               alternative = "not equal")

Paired Samples (Wilcoxon Signed-rank Test)

The example below uses the same parameters as the example in the paired (dependent) t test section.

pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               paired = TRUE, paired.r = 0.50,
               power = .80, alpha = 0.05,
               alternative = "not equal")

It is sufficient to put Cohen's d or Hedge's g (standardized difference between two groups or measurements) for mu1 without specifying mu2, sd1, and sd2.

Non-inferiority, Superiority, and Equivalence

These tests are useful for testing practically significant difference (non-inferiority/superiority) or practically null difference (equivalence).

Parametric Tests

Non-inferiority: The mean of group 1 is practically not smaller than the mean of group 2. The mu1 - mu2 difference can be as small as -1 (margin = -1) but it will still be considered non-inferior. What is the minimum required sample size?

When higher values of an outcome is better the margin takes NEGATIVE values; whereas when lower values of the outcome is better margin takes POSITIVE values.
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               margin = -1, power = 0.80,
               alternative = "non-inferior")

Superiority: The mean of group 1 is practically greater than the mean of group 2. The mu1 - mu2 difference is at least greater than 1 (margin = 1). What is the minimum required sample size?

When higher values of an outcome is better margin takes POSITIVE values; whereas when lower values of the outcome is better margin takes NEGATIVE values.
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               margin = 1, power = 0.80,
               alternative = "superior")

Equivalence: The mean of group 1 is practically same as mean of group 2. The mu1 - mu2 difference can be as small as -1 and as high as 1 (margin = 1). What is the minimum required sample size?

Specify the absolute value for the margin.
pwrss.t.2means(mu1 = 30, mu2 = 30, sd1 = 12, sd2 = 8, 
               margin = 1, power = 0.80,
               alternative = "equivalent")

Non-parametric Tests

Non-inferiority: The mean of group 1 is practically not smaller than the mean of group 2. The mu1 - mu2 difference can be as small as -1 (margin = -1). What is the minimum required sample size?

pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               margin = -1, power = 0.80,
               alternative = "non-inferior")

Superiority: The mean of group 1 is practically greater than the mean of group 2. The mu1 - mu2 difference is at least greater than 1 (margin = 1). What is the minimum required sample size?

pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, 
               margin = 1, power = 0.80,
               alternative = "superior")

Equivalence: The mean of group 1 is practically same as mean of group 2. The mu1 - mu2 difference can be as small as -1 and as high as 1 (margin = 1). What is the minimum required sample size?

pwrss.np.2groups(mu1 = 30, mu2 = 30, sd1 = 12, sd2 = 8, 
               margin = 1, power = 0.80,
               alternative = "equivalent")



Linear Regression (F and t Tests)

Omnibus $F$ Test

$R^2 > 0$ in Linear Regression

Omnibus F test in multiple liner regression is used to test whether $R^2$ is greater than 0 (zero). Assume that we want to predict a continuous variable $Y$ using $X_{1}$, $X_{2}$, and $X_{2}$ variables (a combination of binary or continuous).

$$\begin{eqnarray} Y &=& \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \beta_{3}X_{3} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}$$

We are expecting that these three variables explain 30% of the variance in the outcome ($R^2 = 0.30$ or r2 = 0.30 in the code). What is the minimum required sample size?

pwrss.f.reg(r2 = 0.30, k = 3, power = 0.80, alpha = 0.05)

$\Delta R^2 > 0$ in Hierarchical Regression

Assume that we want to add two more predictors to the earlier regression model ($X_{4}$, and $X_{5}$; thus m = 2) and test whether increase in $R^2$ is greater than 0 (zero). In total there are five predictors in the model (k = 5). By adding two predictors we are expecting an increase of $\Delta R^2 = 0.15$. What is the minimum required sample size?

$$\begin{eqnarray} Y &=& \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \beta_{3}X_{3} + \beta_{4}X_{4} + \beta_{5}X_{5} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}$$

pwrss.f.reg(r2 = 0.15, k = 5, m = 2, power = 0.80, alpha = 0.05)

Single Regression Coefficient (z or t Test)

Standardized versus Unstandardized Input

In the earlier example, assume that we want to predict a continuous variable $Y$ using a continuous predictor $X_{1}$ but control for $X_{2}$, and $X_{2}$ variables (a combination of binary or continuous). We are mainly interested in the effect of $X_{1}$ and expect a standardized regression coefficient of $\beta_{1} = 0.20$.

$$\begin{eqnarray} Y &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}$$

Again, we are expecting that these three variables explain 30% of the variance in the outcome ($R^2 = 0.30$). What is the minimum required sample size? It is sufficient to provide standardized regression coefficient for beta1 because sdx = 1 and sdy = 1 by default.

pwrss.t.reg(beta1 = 0.20, k = 3, r2 = 0.30, 
            power = .80, alpha = 0.05, alternative = "not equal")

For unstandardized coefficients specify sdy and sdx. Assume we are expecting an unstandardized regression coefficient of beta1 = 0.60, a standard deviation of sdy = 12 for the outcome and a standard deviation of sdx = 4 for the main predictor. What is the minimum required sample size?

pwrss.t.reg(beta1 = 0.60, sdy = 12, sdx = 4, k = 3, r2 = 0.30, 
            power = .80, alpha = 0.05, alternative = "not equal")

If the main predictor is binary (e.g. treatment/control), the standardized regression coefficient is Cohen's d. Standard deviation of the main predictor is $\sqrt{p(1-p)}$ where p is the proportion of sample in one of the groups. Assume half of the sample is in the first group $p = 0.50$. What is the minimum required sample size? It is sufficient to provide Cohen's d for beta1 (standardized difference between two groups) but specify sdx = sqrt(p*(1-p)) where p is the proportion of subjects in one of the groups.

p <- 0.50
pwrss.t.reg(beta1 = 0.20, k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
            power = .80, alpha = 0.05, alternative = "not equal")

Non-inferiority, Superiority, and Equivalence

These tests are useful for testing practically significant effects (non-inferiority/superiority) or practically null effects (equivalence).

Non-inferiority: The intervention is expected to be non-inferior to some earlier or other interventions. Assume that the effect of an earlier or some other intervention is beta0 = 0.10. The beta1 - beta0 is expected to be positive and should be at least -0.05 (margin = -0.05). What is the minimum required sample size?

This is the case when higher values of an outcome is better. When lower values of an outcome is better the `beta1 - beta0` difference is expected to be NEGATIVE and the `margin` takes POSITIVE values.
p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.10, margin = -0.05, 
            k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
            power = .80, alpha = 0.05, alternative = "non-inferior")

Superiority: The intervention is expected to be superior to some earlier or other interventions. Assume that the effect of an earlier or some other intervention is beta0 = 0.10. The beta1 - beta0 is expected to be positive and should be at least 0.05 (margin = 0.05). What is the minimum required sample size?

This is the case when higher values of an outcome is better. When lower values of an outcome is better `beta1 - beta0` difference is expected to be NEGATIVE and the `margin` takes NEGATIVE values.
p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.10, margin = 0.05, 
            k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
            power = .80, alpha = 0.05, alternative = "superior")

Equivalence: The intervention is expected to be equivalent to some earlier or other interventions. Assume the effect of an earlier or some other intervention is beta0 = 0.20. The beta1 - beta0 is expected to be within -0.05 and 0.05 (margin = 0.05). What is the minimum required sample size?

`margin` always takes positive values for equivalence. Specify the absolute value.
p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.20, margin = 0.05, 
            k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
            power = .80, alpha = 0.05, alternative = "equivalent")



Logistic Regression (Wald's z Test)

In logistic regression a binary outcome variable (0/1: failed/passed, dead/alive, absent/present) is modeled by predicting probability of being in group 1 ($P_1$) via logit transformation (natural logarithm of odds). The base probability $P_0$ is the overall probability of being in group 1 without influence of predictors in the model (null). Under alternative hypothesis, the probability of being in group 1 ($P_1$) deviate from $P_0$ depending on the value of the predictor; whereas under null it is same as the $P_0$. A model with one main predictor ($X_1$) and two other covariates ($X_2$ and $X_3$) can be constructed as

$$\begin{eqnarray} ln(\frac{P_1}{1- P_1}) &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} \newline \end{eqnarray}$$

where
$$\beta_0 = ln(\frac{P_0}{1- P_0})$$
$$\beta_1 = ln(\frac{P_1}{1- P_1} / \frac{P_0}{1- P_0})$$
Odds ratio is defined as $$OR = exp(\beta_1) = \frac{P_1}{1- P_1} / \frac{P_0}{1- P_0}$$

Example:

Assume

What is the minimum required sample size? There are three types of specification to statistical power or sample size calculations; (i) probability specification, (ii) odds ratio specification, and (iii) regression coefficient specification (as in standard software output).

Probability specification:

pwrss.z.logreg(p0 = 0.15, p1 = 0.10, r2.other.x = 0.20,
               power = 0.80, alpha = 0.05, 
               dist = "normal")

Odds ratio specification: $$OR = \frac{P_1}{1-P1} / \frac{P_0}{1-P_0} = \frac{0.10}{1-0.10} / \frac{0.15}{1-0.15} = 0.6296$$

pwrss.z.logreg(p0 = 0.15, odds.ratio = 0.6296, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = "normal")

Regression coefficient specification: $$\beta_1 = ln(\frac{P_1}{1-P1} / \frac{P_0}{1-P_0}) = ln(0.6296) = -0.4626$$

pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = "normal")

Change the distribution's parameters for predictor X:

The mean and standard deviation of a normally distributed main predictor is 0 and 1 by default. They can be modified. In the following example the mean is 25 and the standard deviation is 8.

dist.x <- list(dist = "normal", mean = 25, sd = 8)

pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = dist.x)

Change the distribution family of predictor X:

More distribution types are supported by the function. For example, the main predictor can be binary (e.g. treatment/control groups). Often half of the sample is assigned to the treatment group and the other half to the control (prob = 0.50 by default).

pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = "bernoulli")

Change the treatment group allocation rate of the binary predictor X (prob = 0.40):

The per-subject cost in a treatment group is sometimes substantially higher than the per-subject cost in control group. At other times, it is difficult to recruit subjects for a treatment whereas there are plenty of subjects to be considered for the control group (e.g. a long wait-list). In such cases there is some leeway to pick an unbalanced sample (but not too much unbalanced). Assume the treatment group allocation rate is 40%. What is the minimum required sample size?

dist.x <- list(dist = "bernoulli", prob = 0.40)

pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
               alpha = 0.05, power = 0.80,
               dist = dist.x)



Poisson Regression (Wald's z Test)

In Poisson regression a count outcome variable (e.g. number of hospital/store/website visits, number of absence/dead/purchase in a day/week/month) is modeled by predicting incidence rate ($\lambda$) via logarithmic transformation (natural logarithm of rates). A model with one main predictor ($X_1$) and two other covariates ($X_2$ and $X_3$) can be constructed as

$$\begin{eqnarray} ln(\lambda) &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} \newline \end{eqnarray}$$

where $exp(\beta_0)$ is the base incidence rate.
$$\beta_1 = ln(\frac{\lambda(X_1=1)}{\lambda(X_1=0)})$$
Incidence rate ratio is defined as $$exp(\beta_1) = \frac{\lambda(X_1=1)}{\lambda(X_1=0)}$$

Assume

What is the minimum required sample size? There are two types of specification; (i) rate ratio specification (exponentiated regression coefficient), and (ii) raw regression coefficient specification (as in standard software output).

Regression coefficient specification:

pwrss.z.poisreg(beta0 = 0.50, beta1 = -0.10,
                power = 0.80, alpha = 0.05, 
                dist = "normal")

Rate ratio specification:

pwrss.z.poisreg(exp.beta0 = exp(0.50),
                exp.beta1 = exp(-0.10),
                power = 0.80, alpha = 0.05, 
                dist = "normal")

Change the distribution's parameters for predictor X:

dist.x <- list(dist = "normal", mean = 25, sd = 8)

pwrss.z.poisreg(exp.beta0 = exp(0.50),
                exp.beta1 = exp(-0.10),
                alpha = 0.05, power = 0.80,
                dist = dist.x)

The function accomodates other types of distribution. For example, the main predictor can be binary (e.g. treatment/control groups).

Regression coefficient specification:

pwrss.z.poisreg(beta0 = 0.50, beta1 = -0.10,
                alpha = 0.05, power = 0.80,
                dist = "bernoulli")

Rate ratio specification:

pwrss.z.poisreg(exp.beta0 = exp(0.50),
                exp.beta1 = exp(-0.10),
                alpha = 0.05, power = 0.80,
                dist = "bernoulli")

Change treatment group allocation rate of the binary predictor X (prob = 0.40):

dist.x <- list(dist = "bernoulli", prob = 0.40)

pwrss.z.poisreg(exp.beta0 = exp(0.50),
                exp.beta1 = exp(-0.10),
                alpha = 0.05, power = 0.80,
                dist = dist.x)



Indirect Effect (Sobel's z Test)

A simple mediation model can be constructed as in the figure.

**Simple Mediation Model**

Regression models take the form of

$$\begin{eqnarray} M &=& \beta_{0M} + \color{red} {a} X + e\newline Y &=& \beta_{0Y} + \color{red} {b} M + \color{red} {cp} X + \epsilon \newline \end{eqnarray}$$

Y is the outcome, M is the mediator, and X is the main predictor. The indirect effect is the product of a and b path coefficients. cp is the path coefficient for the direct effect. Path coefficients can be standardized or unstandardized. They are presumed to be standardized by default because the main predictor, mediator, and outcome all have standard deviations of sdx = 1, sdm = 1, and sdy = 1 in the function. Standard deviations should be specified for unstandardized path coefficients (you can find these values in descriptive tables in reports/publications).

X is continuous:

Most software applications presume no covariates in the mediator and outcome models ($R^2_M = 0$ and $R^2_M = 0$). Even with no covariates, X explain some of the variance in M ($R^2_M > 0$) and M & X explains some of the variance in Y ($R^2_Y > 0$). We will almost never have an R-squared value of 0 (zero). The explained variance in the basic mediation model (the base R-squared values) can be non-trivial and is taken into account in the function by default. Thus, results may seem different from other software outputs.

# mediation model with base R-squared values
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
            power = 0.80, alpha = 0.05)

To match results with other software packages, explicitly specify these parameters as 0 (zero) (r2m = 0 and r2y = 0). There will be some warnings indicating that the function expects R-squared values greater than the base R-squared value. Note that in this case cp argument will be ignored.

# base R-squared values are 0 (zero)
# do not specify 'cp' 
pwrss.z.med(a = 0.25, b = 0.25, 
            r2m = 0, r2y = 0,
            power = 0.80, alpha = 0.05)

X is binary:

The case where the main predictor is binary can be useful in practice. A researcher might be interested in whether treatment influence the outcome through some mediators.

p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10, 
            sdx = sqrt(p*(1-p)), 
            power = 0.80, alpha = 0.05)

Statistical power for joint and Monte Carlo interval tests

Joint and Monte Carlo tests are only available when power is requested.

# binary X
p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10, 
            sdx = sqrt(p*(1-p)), 
            n = 300, alpha = 0.05)

Explanatory Power of Covariates

Covariates can be added to the mediator model, outcome model, or both. Explanatory power of the covariates (R-squared values) for the mediator and outcome models can be specified via r2m.x and r2y.mx arguments.

# continuous X
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
            r2m = 0.50, r2y = 0.50, 
            power = 0.80, alpha = 0.05)

In experimental design subjects are randomly assigned to treatment and control groups. The allocation usually takes place with a rate of 50% (or probability of 0.50) meaning that half of the sample is assigned to treatment and the other half to control. Thus, the mediator model is less likely to have confounder. It is more common to add covariates to the outcome model only.

# binary X
p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
            sdx = sqrt(p*(1-p)), r2y = 0.50, 
            power = 0.80, alpha = 0.05)



Analysis of (Co)Variance (F Test)

ANOVA: Analysis of Variance
ANCOVA: Analysis of Covariance

In a one-way ANOVA, a researcher might be interested in comparing the means of several groups (levels of a factor) with respect to a continuous variable. In a one-way ANCOVA, they may want to adjust means for covariates. Furthermore, they may want to inspect interaction between two or three factors (two-way or three-way ANOVA), or adjust interaction for covariates (two-way or three-way ANCOVA). The following examples illustrate how to determine minimum required sample size for such designs.

One-way

ANOVA

A researcher is expecting a difference of Cohen's d = 0.50 between treatment and control groups (two levels) translating into $\eta^2 = 0.059$ (eta2 = 0.059). Means are not adjusted for any covariates. What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.059, n.levels = 2,
               power = .80, alpha = 0.05)

ANCOVA

A researcher is expecting an adjusted difference of Cohen's d = 0.45 between treatment and control groups (n.levels = 2) after controlling for the pretest (n.cov = 1) translating into partial $\eta^2 = 0.048$ (eta2 = 0.048). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.048, n.levels = 2, n.cov = 1,
               alpha = 0.05, power = .80)
`n.cov` (or `n.covariates`) argument has trivial effect on the results. The difference between ANOVA and ANCOVA procedure depends on whether the effect (`eta2`) is unadjusted or covariate-adjusted.

Two-way

ANOVA

A researcher is expecting a partial $\eta^2 = 0.03$ (eta2 = 0.03) for interaction of treatment/control (Factor A: two levels) with gender (Factor B: two levels). Thus, n.levels = c(2,2). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.03, n.levels = c(2,2),
               alpha = 0.05, power = 0.80)

ANCOVA

A researcher is expecting a partial $\eta^2 = 0.02$ (eta2 = 0.02) for interaction of treatment/control (Factor A) with gender (Factor B) adjusted for the pretest (n.cov = 1). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.02, n.levels = c(2,2), n.cov = 1,
               alpha = 0.05, power = .80)

Three-way

ANOVA

A researcher is expecting a partial $\eta^2 = 0.02$ (eta2 = 0.02) for interaction of treatment/control (Factor A: two levels), gender (Factor B: two levels), and socio-economic status (Factor C: three levels). Thus, n.levels = c(2,2,3). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.02, n.levels = c(2,2,3),
               alpha = 0.05, power = 0.80)

ANCOVA

A researcher is expecting a partial $\eta^2 = 0.01$ (eta2 = 0.01) for interaction of treatment/control (Factor A), gender (Factor B), and socio-economic status (Factor C: three levels) adjusted for the pretest (n.cov = 1). What is the minimum required sample size?

pwrss.f.ancova(eta2 = 0.01, n.levels = c(2,2,3), n.cov = 1,
               alpha = 0.05, power = .80)



Repeated Measures (F Test)

The group (between) effect: In a one-way repeated measures ANOVA, a researcher might be interested in comparing the means of several groups (levels of a factor) with respect to an outcome variable (continuous) after controlling for the effect of time. The time effect can be thought as the improvement/grow/loss/deterioration in the outcome variable that happens naturally or due to unknown causes.

The time (within) effect: A researcher might be interested in comparing the means of several time points with respect to the outcome variable after controlling for the group effect. In other words, the change across time points is not due to the group membership but due to the improvement/grow/loss/deterioration in the outcome variable that happens naturally or due to unknown causes.

The group x time interaction: A researcher might suspect that means across groups and means across time points are not independent from each other. The amount of improvement/grow/loss/deterioration in the outcome variable depends the group membership.

The Group Effect (Between)

Example 1: Posttest only design with treatment and control groups.

A researcher is expecting a difference of Cohen's d = 0.50 on the posttest score between treatment and control groups (n.levels = 2), translating into $\eta^2 = 0.059$ (eta2 = 0.059). The test is administered at a single time point; thus, the 'number of repeated measures' is n.rm = 1. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.059,  n.levels = 2, n.rm = 1,
                power = 0.80, alpha = 0.05,
                type = "between")

The Time Effect (Within)

Example 2: Pretest-posttest design with treatment group only.

A researcher is expecting a difference of Cohen's d = 0.30 between posttest and pretest scores, translating into $\eta^2 = 0.022$ (eta2 = 0.022). The test is administered before and after the treatment; thus, the 'number of repeated measures' is n.rm = 2. There is treatment group but no control group (n.levels = 1). The researcher also expects a correlation of 0.50 (corr.rm = 0.50) between pretest and posttest scores. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.022,  n.levels = 1, n.rm = 2,
                power = 0.80, alpha = 0.05,
                corr.rm = 0.50, type = "within")

The Group x Time Interaction

Example 3: Pretest-posttest control-group design.

A researcher is expecting a difference of Cohen's d = 0.40 on the posttest scores between treatment and control groups (n.levels = 2) after controlling for pretest, translating into partial $\eta^2 = 0.038$ (eta2 = 0.038). The test is administered before and after the treatment; thus, the 'number of repeated measures' is n.rm = 2. The researcher also expects a correlation of 0.50 (corr.rm = 0.50) between pretest and posttest scores. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.038,  n.levels = 2, n.rm = 2,
                power = 0.80, alpha = 0.05, 
                corr.rm = 0.50, type = "between")

A researcher is expecting a difference of Cohen's d = 0.30 between posttest and pretest scores (n.rm = 2) after controlling for group membership, translating into partial $\eta^2 = 0.022$ (eta2 = 0.022). There is both treatment and control groups (n.levels = 2). The researcher also expects a correlation of 0.50 (corr.rm = 0.50) between pretest and posttest scores. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.022,  n.levels = 2, n.rm = 2,
                power = 0.80, alpha = 0.05, 
                corr.rm = 0.50, type = "within")

The rationale for inspecting the interaction is that the benefit of the treatment may depend on the pretest score (e.g. those with higher scores on the pretest improve or deteriorate more). A researcher is expecting an interaction effect of partial $\eta^2 = 0.01$ (eta2 = 0.01). The test is administered before and after the treatment; thus, the 'number of repeated measures' is n.rm = 2. There is both treatment and control groups (n.levels = 2). The researcher also expects a correlation of 0.50 (corr.rm = 0.50) between pretest and posttest scores. What is the minimum required sample size?

pwrss.f.rmanova(eta2 = 0.01,  n.levels = 2, n.rm = 2,
                power = 0.80, alpha = 0.05, 
                corr.rm = 0.50, type = "interaction")



GOF or Independence ($\chi^2$ Test)

Vector of k Cells

Effect size: Cohen's W for goodness-of-fit test (1 x k or k x 1 table).

How many subjects are needed to claim that girls choose STEM related majors less than males? Check the original article at https://www.aauw.org/resources/research/the-stem-gap/

Option 1: Use cell probabilities
Alternative hypothesis state that 28% of the workforce in STEM field is women whereas 72% is men (from the article).
The null hypothesis assume 50% is women and 50% is men.

prob.mat <- c(0.28, 0.72) 
pwrss.chisq.gofit(p1 = c(0.28, 0.72), 
                  p0 = c(0.50, 0.50),
                  alpha = 0.05, power = 0.80)

Option 2: Use Cohen's W = 0.44.
Degrees of freedom is k - 1 for Cohen's W.

pwrss.chisq.gofit(w = 0.44, df = 1,
                  alpha = 0.05, power = 0.80)

Contingency Table: 2 x 2

Effect size: Phi Coefficient (or Cramer's V or Cohen's W) for independence test (2 x 2 contingency table).

How many subjects are needed to claim that girls are underdiagnosed with ADHD?
Check the original article at https://time.com/growing-up-with-adhd/

Option 1: Use cell probabilities.
5.6% of girls and 13.2% of boys are diagnosed with ADHD (from the article).

prob.mat <- rbind(c(0.056, 0.132),
                  c(0.944, 0.868))
colnames(prob.mat) <- c("Girl", "Boy")
rownames(prob.mat) <- c("ADHD", "No ADHD")
prob.mat
pwrss.chisq.gofit(p1 = prob.mat,
                  alpha = 0.05, power = 0.80)

Option 2: Use Phi coefficient = 0.1302134.
Degrees of freedom is 1 for Phi coefficient.

pwrss.chisq.gofit(w = 0.1302134, df = 1,
                  alpha = 0.05, power = 0.80)

Contingency Table: j x k

Effect size: Cramer's V (or Cohen's W) for independence test (j x k contingency tables).

How many subjects are needed to detect the relationship between depression severity and gender?
Check the original article at https://doi.org/10.1016/j.jad.2019.11.121

Option 1: Use cell probabilities (from the article).

prob.mat <- cbind(c(0.6759, 0.1559, 0.1281, 0.0323, 0.0078),
                  c(0.6771, 0.1519, 0.1368, 0.0241, 0.0101))
rownames(prob.mat) <- c("Normal", "Mild", "Moderate", "Severe", "Extremely Severe")
colnames(prob.mat) <- c("Female", "Male")
prob.mat
pwrss.chisq.gofit(p1 = prob.mat,
                  alpha = 0.05, power = 0.80)

Option 2: Use Cramer's V = 0.03022008 based on 5 x 2 contingency table
Degrees of freedom is (nrow - 1) * (ncol - 1) for Cramer's V.

pwrss.chisq.gofit(w = 0.03022008, df = 4,
                  alpha = 0.05, power = 0.80)



Correlation(s) (z Test)

One Correlation

One-sided Test: Assume that the expected correlation is 0.20 and it is greater than 0.10. What is the minimum required sample size?

pwrss.z.corr(r = 0.20, r0 = 0.10,
             power = 0.80, alpha = 0.05, 
             alternative = "greater")

Two-sided Test: Assume that the expected correlation is 0.20 and it is different from 0 (zero). The correlation could be 0.20 as well as -0.20. What is the minimum required sample size?

pwrss.z.corr(r = 0.20, r0 = 0,
             power = 0.80, alpha = 0.05, 
             alternative = "not equal")

Difference between Two Correlations (Independent)

Assume that the expected correlations in the first and second groups are 0.30 and 0.20, respectively (r1 = 0.30 and r2 = 0.20).

One-sided Test: Expecting r1 - r2 greater than 0 (zero). The difference could be 0.10 but could not be -0.10. What is the minimum required sample size?

pwrss.z.2corrs(r1 = 0.30, r2 = 0.20,
               power = .80, alpha = 0.05, 
               alternative = "greater")

Two-sided Test: Expecting r1 - r2 different from 0 (zero). The difference could be -0.10 as well as 0.10. What is the minimum required sample size?

pwrss.z.2corrs(r1 = 0.30, r2 = 0.20,
               power = .80, alpha = 0.05, 
               alternative = "not equal")



Proportion(s) (z Test)

One Proportion

In the following examples p is the proportion under alternative hypothesis and p0 is the proportion under null hypothesis.

One-sided Test: Expecting p - p0 smaller than 0 (zero).

# normal approximation
pwrss.z.prop(p = 0.45, p0 = 0.50,
             alpha = 0.05, power = 0.80,
             alternative = "less",
             arcsin.trans = FALSE)

# arcsine transformation
pwrss.z.prop(p = 0.45, p0 = 0.50,
             alpha = 0.05, power = 0.80,
             alternative = "less",
             arcsin.trans = TRUE)

Two-sided Test: Expecting p - p0 smaller than 0 (zero) or greater than 0 (zero).

pwrss.z.prop(p = 0.45, p0 = 0.50,
             alpha = 0.05, power = 0.80,
             alternative = "not equal")

Non-inferiority Test: The case when smaller proportion is better. Expecting p - p0 smaller than 0.01.

pwrss.z.prop(p = 0.45, p0 = 0.50, margin = 0.01,
             alpha = 0.05, power = 0.80,
             alternative = "non-inferior")

Non-inferiority Test: The case when bigger proportion is better. Expecting p - p0 greater than -0.01.

pwrss.z.prop(p = 0.55, p0 = 0.50, margin = -0.01,
             alpha = 0.05, power = 0.80,
             alternative = "non-inferior")

Superiority Test: The case when smaller proportion is better. Expecting p - p0 smaller than -0.01.

pwrss.z.prop(p = 0.45, p0 = 0.50, margin = -0.01,
             alpha = 0.05, power = 0.80,
             alternative = "superior")

Superiority Test: The case when bigger proportion is better. Expecting p - p0 greater than 0.01.

pwrss.z.prop(p = 0.55, p0 = 0.50, margin = 0.01,
             alpha = 0.05, power = 0.80,
             alternative = "superior")

Equivalence Test: Expecting p - p0 between -0.01 and 0.01.

pwrss.z.prop(p = 0.50, p0 = 0.50, margin = 0.01,
             alpha = 0.05, power = 0.80,
             alternative = "equivalent")

Difference between Two Proportions (Independent)

In the following examples p1 and p2 are proportions for the first and second groups under alternative hypothesis. The null hypothesis state p1 = p2 or p1 - p2 = 0.

One-sided Test: Expecting p1 - p2 smaller than 0 (zero).

# normal approximation
pwrss.z.2props(p1 = 0.45, p2 = 0.50,
               alpha = 0.05, power = 0.80,
               alternative = "less",
               arcsin.trans = FALSE)
# arcsine transformation
pwrss.z.2props(p1 = 0.45, p2 = 0.50,
               alpha = 0.05, power = 0.80,
               alternative = "less",
               arcsin.trans = TRUE)

Two-sided Test: Expecting p1 - p2 smaller than 0 (zero) or greater than 0 (zero).

pwrss.z.2props(p1 = 0.45, p2 = 0.50,
               alpha = 0.05, power = 0.80,
               alternative = "not equal")

Non-inferiority Test: The case when smaller proportion is better. Expecting p1 - p2 smaller than 0.01.

pwrss.z.2props(p1 = 0.45, p2 = 0.50, margin = 0.01,
               alpha = 0.05, power = 0.80,
               alternative = "non-inferior")

Non-inferiority Test: The case when bigger proportion is better. Expecting p1 - p2 greater than -0.01.

pwrss.z.2props(p1 = 0.55, p2 = 0.50,  margin = -0.01,
               alpha = 0.05, power = 0.80,
               alternative = "non-inferior")

Superiority Test: The case when smaller proportion is better. Expecting p1 - p2 smaller than -0.01.

pwrss.z.2props(p1 = 0.45, p2 = 0.50, margin = -0.01,
               alpha = 0.05, power = 0.80,
               alternative = "superior")

Superiority Test: The case when bigger proportion is better. Expecting p1 - p2 greater than 0.01.

pwrss.z.2props(p1 = 0.55, p2 = 0.50, margin = 0.01,
               alpha = 0.05, power = 0.80,
               alternative = "superior")

Equivalence Test: Expecting p1 - p2 between -0.01 and 0.01.

pwrss.z.2props(p1 = 0.50, p2 = 0.50, margin = 0.01,
               alpha = 0.05, power = 0.80,
               alternative = "equivalent")
Please send any bug reports, feedback or questions to [bulusmetin [at] gmail.com](mailto:bulusmetin@gmail.com).

--o--

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License .

License



Try the pwrss package in your browser

Any scripts or data that you put into this service are public.

pwrss documentation built on April 12, 2023, 12:34 p.m.