Description Usage Arguments Details Value Author(s) References Examples
View source: R/powerEQTL.ANOVA.R
Power calculation for eQTL analysis that tests if a SNP is associated to a gene probe by using unbalanced oneway ANOVA. This function can be used to calculate one of the 3 parameters (power, sample size, and minimum allowable MAF) by setting the corresponding parameter as NULL and providing values for the other 2 parameters.
1 2 3 4 5 6 7 8 9  powerEQTL.ANOVA(MAF,
deltaVec=c(0.13, 0.13),
n=200,
power = NULL,
sigma = 0.13,
FWER = 0.05,
nTests = 200000,
n.lower = 4,
n.upper = 1e+30)

MAF 
numeric. Minor allele frequency. 
deltaVec 
numeric. A vector having 2 elements. The first element is equal to mu_2  mu_1 and the second element is equal to mu_3  mu_2, where mu_1 is the mean gene expression level for the mutation homozygotes, mu_2 is the mean gene expression level for the heterozygotes, and mu_3 is the mean gene expression level for the wildtype gene expression level. 
n 
integer. Total number of subjects. 
power 
numeric. Power for testing if 3 genotypes have the same mean gene expression levels. 
sigma 
numeric. Standard deviation of the random error. 
FWER 
numeric. Familywise Type I error rate. 
nTests 
integer. Number of tests (i.e., number of all (SNP, gene) pairs) in eQTL analysis. 
n.lower 
numeric. Lower bound of the total number of subjects. Only used when calculating total number of subjects. 
n.upper 
numeric. Upper bound of the total number of subjects. Only used when calculating total number of subjects. 
If we would like to test potential nonlinear relationship between genotype of a SNP and expression of a gene, we can use unbalanced oneway ANOVA. Actually, an article published by the GTEx Consortium in 2013 used this approach.
Suppose there are k=3 groups of subjects: (1) mutation homozygotes; (2) heterozygotes; and (3) wildtype homozygotes. We would like to test if the mean expression μ_i, i=1, …, k, of the gene is the same among the k groups of subjects. We can use the following oneway ANOVA model to characterize the relationship between observed gene expression level y_{ij} and the population mean expression level μ_i:
y_{ij} = μ_i + ε_{ij}, \quad ε_{ij} \sim N≤ft(0, σ^2\right),
where i=1,…, k, j= 1, …, n_i, y_{ij} is the observed gene expression level for the jth subject in the ith group, μ_i is the mean gene expression level of the ith group, ε_{ij} is the random error, k is the number of groups, n_i is the number of subjects in the ith group. Denote the total number of subjects as N = ∑_{i=1}^{k} n_i. That is, we have n_1 mutation homozygotes, n_2 heterozygotes, and n_3 wildtype homozygotes.
We would like to test the null hypothesis H_0 and alternative hypothesis H_1:
H_0: μ_1 = μ_2 = μ_3,
H_1: \mbox{not all means are the same}.
According to O'Brien and Muller (1993), the power calculation formula for unbalanced oneway ANOVA is
power = Pr(F >= F(1  alpha, k  1, N  k) F ~ F(k  1, N  k, lambda)),
where k = 3 is the number of groups of subjects, N is the total number of subjects, F_{1  alpha}(k  1, N  k) is the 100 * (1  alpha)th percentile of central F distribution with degrees of freedoms k  1 and N  k, and F_{k  1, N  k, lambda} is the noncentral F distribution with degrees of freedoms k  1 and N  k and noncentral parameter (ncp) lambda. The ncp lambda is equal to
lambda = N * sum(wi * (mu_i  mu)^2, i = 1,.., k)/sigma^2,
where mu_i is the mean gene expression level for the ith group of subjects, w_i is the weight for the ith group of subjects, sigma^2 is the variance of the random errors in ANOVA (assuming each group has equal variance), and mu is the weighted mean gene expression level
mu = sum(w_i * mu_i, i = 1, ..., k).
The weights w_i=n_i/N are the sample proportions for the 3 groups of subjects, where N=n_1+n_2+n_3 is the total number of subjects. Hence, sum(w_i, i = 1, 2, 3) = 1. Based on HardyWeinberg Equilibrium, we have w_1 = θ^2, w_2 = 2θ(1θ), and w_3 = (1θ)^2, where θ is MAF.
Without loss of generality, we set μ_1 = δ_1, μ_2=0, and μ_3 = δ_2.
We adopted the parameters from the GTEx cohort (see the Power analysis" section of Nature Genetics, 2013; https://www.nature.com/articles/ng.2653), where they modeled the expression data as having a lognormal distribution with a log standard deviation of 0.13 within each genotype class (AA, AB, BB). This level of noise is based on estimates from initial GTEx data. In their power analysis, they assumed the acrossgenotype difference delta = 0.13 (i.e., equivalent to detecting a log expression change similar to the standard deviation within a single genotype class).
power if the input parameter power = NULL
.
sample size (total number of subjects) if the input parameter n = NULL
;
minimum detectable slope if the input parameter slope = NULL
.
Xianjun Dong <XDONG@rics.bwh.harvard.edu>, Xiaoqi Li<xli85@bwh.harvard.edu>, TzuuWang Chang <Chang.TzuuWang@mgh.harvard.edu>, Scott T. Weiss <restw@channing.harvard.edu>, Weiliang Qiu <weiliang.qiu@gmail.com>
The GTEx Consortium. The GenotypeTissue Expression (GTEx) project. Nature Genetics, 45:580585, 2013.
O'Brien, RG and Muller, KE. Unified power analysis for ttests through multivariate hypotheses. In LK Edwards, editor, Applied Analysis of Variance in Behavioral Science, pages 297344. New York: Dekker, 1993.
Dong X, Li X, Chang TW, Scherzer CR, Weiss ST, and Qiu W. powerEQTL: An R package and shiny application for sample size and power calculation of bulk tissue and singlecell eQTL analysis. Bioinformatics, 2021;, btab385
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28  # calculate power
powerEQTL.ANOVA(MAF = 0.1,
deltaVec = c(0.13, 0.13),
n = 282,
power = NULL,
sigma = 0.13,
FWER = 0.05,
nTests = 200000)
# calculate sample size (total number of subjects)
powerEQTL.ANOVA(MAF = 0.1,
deltaVec = c(0.13, 0.13),
n = NULL,
power = 0.8,
sigma = 0.13,
FWER = 0.05,
nTests = 200000)
# calculate minimum allowable MAF
powerEQTL.ANOVA(MAF = NULL,
deltaVec = c(0.13, 0.13),
n = 282,
power = 0.8,
sigma = 0.13,
FWER = 0.05,
nTests = 200000)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.