ssEQTL.ANOVA2: Sample Size Calculation for EQTL Analysis Based on...


Description

Sample size calculation for an eQTL analysis that tests whether a SNP is associated with a gene probe by using unbalanced one-way ANOVA.

Usage

ssEQTL.ANOVA2(
  effsize,
  MAF,
  typeI = 0.05,
  nTests = 2e+05,
  mypower = 0.8
)

Arguments

effsize

Effect size delta / sigma, where delta = mu_2 - mu_1 = mu_3 - mu_2; mu_1, mu_2, and mu_3 are the mean gene expression levels of mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively; and sigma is the standard deviation of gene expression levels (assuming each genotype group has the same variance).

MAF

Minor allele frequency.

typeI

Type I error rate for testing whether a SNP is associated with a gene probe.

nTests

integer. Number of tests in eQTL analysis.

mypower

Desired power for the eQTL analysis.

Details

The ANOVA approach assumes that the association between a SNP and a gene probe is tested by using unbalanced one-way ANOVA (e.g. Lonsdale et al. 2013). According to the SAS online documentation (https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_power_a0000000982.htm), the power calculation formula is

power = Pr(F >= F_{1 - alpha}(k - 1, N - k) | F ~ F_{k - 1, N - k, lambda}),

where k = 3 is the number of groups of subjects, N is the total number of subjects, F_{1 - alpha}(k - 1, N - k) is the 100 * (1 - alpha)-th percentile of the central F distribution with degrees of freedom k - 1 and N - k, and F_{k - 1, N - k, lambda} is the non-central F distribution with degrees of freedom k - 1 and N - k and non-centrality parameter (ncp) lambda. The ncp lambda is equal to

lambda = N * sum(w_i * (mu_i - mu)^2, i = 1, ..., k) / sigma^2,

where mu_i is the mean gene expression level for the i-th group of subjects, w_i is the weight for the i-th group of subjects, sigma^2 is the variance of the random errors in ANOVA (assuming each group has equal variance), and mu is the weighted mean gene expression level

mu = sum(w_i * mu_i, i = 1, ..., k).

The weights w_i are the sample proportions for the 3 groups of subjects. Hence, sum(w_i, i = 1, 2, 3) = 1.
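The two formulas above can be sketched in base R as follows. This is a minimal illustration, not the package's own code: `anova_ncp` and `anova_power` are hypothetical helper names, and only the base R functions `qf` and `pf` are assumed.

```r
# Hypothetical helpers for illustration; they are not part of powerEQTL.

# Non-centrality parameter: lambda = N * sum(w_i * (mu_i - mu)^2) / sigma^2
anova_ncp <- function(N, w, mu, sigma) {
  stopifnot(length(w) == length(mu), abs(sum(w) - 1) < 1e-8)
  mu_bar <- sum(w * mu)                 # weighted mean expression level
  N * sum(w * (mu - mu_bar)^2) / sigma^2
}

# Power: Pr(F >= F_{1 - alpha}(k - 1, N - k) | F ~ F_{k - 1, N - k, lambda})
anova_power <- function(N, lambda, alpha = 0.05, k = 3) {
  df1 <- k - 1
  df2 <- N - k
  f_crit <- qf(alpha, df1, df2, lower.tail = FALSE)  # central F critical value
  pf(f_crit, df1, df2, ncp = lambda, lower.tail = FALSE)
}

# Example with arbitrary illustrative inputs (not taken from the package)
lambda <- anova_ncp(N = 100, w = c(0.25, 0.5, 0.25), mu = c(0, 1, 2), sigma = 1)
anova_power(N = 100, lambda = lambda, alpha = 0.05)
```

With lambda = 0 the non-central F reduces to the central F, so the computed power equals alpha, which is a quick sanity check on the sketch.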

We assume that mu_2 - mu_1 = mu_3 - mu_2 = delta, where mu_1, mu_2, and mu_3 are the mean gene expression levels for mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively.

Denote by p the minor allele frequency (MAF) of a SNP and let q = 1 - p. Under Hardy-Weinberg equilibrium, the genotype frequencies are p_2 = p^2, p_1 = 2 * p * q, and p_0 = q^2, where p_2, p_1, and p_0 are the genotype frequencies for mutation homozygotes, heterozygotes, and wild-type homozygotes, respectively. Using these frequencies as the weights w_i, the ncp can be simplified as

ncp = 2 * p * q * N * (delta/sigma)^2.
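The simplification can be checked numerically. The sketch below plugs the Hardy-Weinberg genotype frequencies into the general ncp formula and compares the result with 2 * p * q * N * (delta/sigma)^2. The input values and all variable names are arbitrary choices made for illustration.

```r
# Illustrative values only; any p in (0, 1), N, delta, sigma > 0 would do.
p     <- 0.1
q     <- 1 - p
N     <- 200
delta <- 0.5
sigma <- 2

# Weights ordered as mutation homozygotes, heterozygotes, wild-type homozygotes
w  <- c(p^2, 2 * p * q, q^2)
# Means with equal spacing delta; the baseline mu_1 cancels out, so use 0
mu <- c(0, delta, 2 * delta)

mu_bar         <- sum(w * mu)
lambda_general <- N * sum(w * (mu - mu_bar)^2) / sigma^2
lambda_simple  <- 2 * p * q * N * (delta / sigma)^2

# TRUE when the two expressions agree up to floating-point tolerance
isTRUE(all.equal(lambda_general, lambda_simple))
```

The agreement reflects that the weighted variance of the group means equals 2 * p * q * delta^2 when the means are equally spaced and the weights follow Hardy-Weinberg proportions.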

Value

The sample size required for the eQTL analysis to achieve the desired power.

Author(s)

Xianjun Dong <XDONG@rics.bwh.harvard.edu>, Tzuu-Wang Chang <Chang.Tzuu-Wang@mgh.harvard.edu>, Scott T. Weiss <restw@channing.harvard.edu>, Weiliang Qiu <stwxq@channing.harvard.edu>

References

Lonsdale J, Thomas J, et al. The Genotype-Tissue Expression (GTEx) project. Nature Genetics, 45:580-585, 2013.

See Also

minEffectEQTL.ANOVA, powerEQTL.ANOVA, powerEQTL.ANOVA2, ssEQTL.ANOVA

Examples

ssEQTL.ANOVA2(
  effsize = 1,
  MAF = 0.1,
  typeI = 0.05,
  nTests = 2e+05,
  mypower = 0.8
)
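For intuition about what a call like the one above computes, the following is a rough sketch of a sample size search built from the formulas in Details. It is not the package's implementation. It assumes a Bonferroni-adjusted significance level alpha = typeI / nTests, which this page does not state explicitly, and it uses a simple linear search over N. The names `power_sketch`, `ss_sketch`, and `maxN` are hypothetical.

```r
# Hypothetical helpers for illustration; they are not part of powerEQTL.

# Power at total sample size N, using ncp = 2 * p * q * N * (delta/sigma)^2
power_sketch <- function(N, effsize, MAF, alpha, k = 3) {
  lambda <- 2 * MAF * (1 - MAF) * N * effsize^2
  f_crit <- qf(alpha, k - 1, N - k, lower.tail = FALSE)
  pf(f_crit, k - 1, N - k, ncp = lambda, lower.tail = FALSE)
}

# Smallest N (up to maxN) whose power reaches mypower
ss_sketch <- function(effsize, MAF, typeI = 0.05, nTests = 2e+05,
                      mypower = 0.8, k = 3, maxN = 1e+06) {
  alpha <- typeI / nTests   # ASSUMPTION: Bonferroni correction
  N <- k + 1                # smallest N with a positive denominator df
  while (N <= maxN && power_sketch(N, effsize, MAF, alpha, k) < mypower) {
    N <- N + 1
  }
  if (N > maxN) NA_real_ else N
}

n_req <- ss_sketch(effsize = 1, MAF = 0.1)
n_req
```

Because of the assumptions above and the integer search, the value returned by this sketch need not match the output of ssEQTL.ANOVA2 exactly.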

sterding/powerEQTL documentation built on May 30, 2019, 4:42 p.m.