Simulate Genotype Data from a Mixture of 3 Bayesian Hierarchical Models. The minor allele frequency (MAF) of cases has different priors than that of controls.
simGenoFuncDiffPriors(
    nCases = 100,
    nControls = 100,
    nSNPs = 1000,
    alpha.p.ca = 2,
    beta.p.ca = 3,
    alpha.p.co = 2,
    beta.p.co = 8,
    pi.p = 0.1,
    alpha0 = 2,
    beta0 = 5,
    pi0 = 0.8,
    alpha.n.ca = 2,
    beta.n.ca = 8,
    alpha.n.co = 2,
    beta.n.co = 3,
    pi.n = 0.1,
    low = 0.02,
    upp = 0.5,
    verbose = FALSE)
nCases | integer. Number of cases.
nControls | integer. Number of controls.
nSNPs | integer. Number of SNPs.
alpha.p.ca | numeric. The first shape parameter of the Beta prior in cluster + for cases.
beta.p.ca | numeric. The second shape parameter of the Beta prior in cluster + for cases.
alpha.p.co | numeric. The first shape parameter of the Beta prior in cluster + for controls.
beta.p.co | numeric. The second shape parameter of the Beta prior in cluster + for controls.
pi.p | numeric. Mixture proportion for cluster +.
alpha0 | numeric. The first shape parameter of the Beta prior in cluster 0.
beta0 | numeric. The second shape parameter of the Beta prior in cluster 0.
pi0 | numeric. Mixture proportion for cluster 0.
alpha.n.ca | numeric. The first shape parameter of the Beta prior in cluster - for cases.
beta.n.ca | numeric. The second shape parameter of the Beta prior in cluster - for cases.
alpha.n.co | numeric. The first shape parameter of the Beta prior in cluster - for controls.
beta.n.co | numeric. The second shape parameter of the Beta prior in cluster - for controls.
pi.n | numeric. Mixture proportion for cluster -.
low | numeric. A small positive value. If a MAF generated from the half-flat shape bivariate prior is smaller than low, the SNP is deleted.
upp | numeric. A positive value. If a MAF generated from the half-flat shape bivariate prior is greater than upp, the SNP is deleted.
verbose | logical. Indicates whether intermediate or final results should be printed to the screen.
In this simulation, we generate additive-coded genotypes for 3 clusters of SNPs based on a mixture of 3 Bayesian hierarchical models.
In cluster +, the minor allele frequency (MAF) θ_{x+} of cases is greater than the MAF θ_{y+} of controls.
In cluster 0, the MAF θ_{0} of cases is equal to the MAF of controls.
In cluster -, the MAF θ_{x-} of cases is smaller than the MAF θ_{y-} of controls.
The proportions of the 3 clusters of SNPs are π_{+}, π_{0}, and π_{-}, respectively.
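As a rough illustration of the mixture structure, the sketch below assigns each SNP to one of the 3 clusters using the default proportions from the usage above. It is a minimal sketch, not the package's implementation: the variable name memSNPs and the choice to draw memberships at random (rather than fixing the cluster sizes) are assumptions made for illustration.

```r
set.seed(1)

nSNPs <- 1000
pi.p <- 0.1  # proportion of SNPs in cluster +
pi0  <- 0.8  # proportion of SNPs in cluster 0
pi.n <- 0.1  # proportion of SNPs in cluster -

# the three mixture proportions should sum to 1
stopifnot(isTRUE(all.equal(pi.p + pi0 + pi.n, 1)))

# memSNPs is a hypothetical name; memberships are drawn at random here,
# which is an assumption about how cluster labels could be generated
memSNPs <- sample(c("+", "0", "-"), size = nSNPs, replace = TRUE,
                  prob = c(pi.p, pi0, pi.n))

print(table(memSNPs))
```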
We assume a “half-flat shape” bivariate prior for the MAF in cluster +:
2 h_{+}(θ_{x+}) h_{+}(θ_{y+}) I(θ_{x+} > θ_{y+}),
where I(a) is the indicator function taking value 1 if the event a is true, and value 0 otherwise. The function h_{+} is the probability density function of the beta distribution Beta(α_{+}, β_{+}).
We assume θ_{0} has the beta prior Beta(α_0, β_0).
We also assume a “half-flat shape” bivariate prior for the MAF in cluster -:
2 h_{-}(θ_{x-}) h_{-}(θ_{y-}) I(θ_{x-} < θ_{y-}).
The indicator enforces that the MAF of cases is smaller than the MAF of controls in cluster -. The function h_{-} is the probability density function of the beta distribution Beta(α_{-}, β_{-}).
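A density of the form 2 h(θ_x) h(θ_y) I(θ_x > θ_y) is the joint density of the larger and the smaller of two independent draws from h. The sketch below uses that fact to sample MAF pairs for cluster + when cases and controls share one Beta density, as in the formulas above. It is only an illustration under that shared-prior assumption: it does not show how the function handles the separate case and control shape parameters (alpha.p.ca, alpha.p.co, and so on), and the names u, v, theta.x, and theta.y are hypothetical.

```r
set.seed(1)

n <- 1000     # number of MAF pairs to draw
alpha <- 2    # illustrative shape parameters of the shared Beta density
beta  <- 5

# two independent draws from the same Beta density
u <- rbeta(n, alpha, beta)
v <- rbeta(n, alpha, beta)

# cluster +: cases get the larger value, controls the smaller one
theta.x <- pmax(u, v)  # MAF of cases
theta.y <- pmin(u, v)  # MAF of controls

# for cluster -, the roles would be swapped (cases get pmin, controls get pmax)
print(summary(theta.x - theta.y))
```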
Given a SNP, we assume Hardy-Weinberg equilibrium holds for its genotypes. That is, given MAF θ, the probabilities of genotypes are
Pr(geno=2) = θ^2
Pr(geno=1) = 2θ(1-θ)
Pr(geno=0) = (1-θ)^2
We also assume the genotypes 0 (wild-type), 1 (heterozygote), and 2 (mutation) follow a multinomial distribution Multinomial{1, [(1-θ)^2, 2θ(1-θ), θ^2]}.
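The Hardy-Weinberg step can be sketched as follows for a single SNP with a given MAF. This is a minimal illustration rather than the package's code; theta, nSubj, probs, and geno are hypothetical names, and the probabilities are listed in the order of genotypes 0, 1, 2.

```r
set.seed(1)

theta <- 0.3   # illustrative MAF of one SNP
nSubj <- 100   # illustrative number of subjects

# Hardy-Weinberg genotype probabilities for genotypes 0, 1, 2
probs <- c((1 - theta)^2, 2 * theta * (1 - theta), theta^2)

# one multinomial draw of size 1 per subject, recorded as the genotype code
geno <- sample(0:2, size = nSubj, replace = TRUE, prob = probs)

print(table(factor(geno, levels = 0:2)))
```

Under Hardy-Weinberg equilibrium the genotype is the number of minor alleles among two independent alleles, so rbinom(nSubj, size = 2, prob = theta) draws from the same distribution.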
Note that when we generate MAFs from the half-flat shape bivariate priors, we might get very small MAFs or MAFs greater than 0.5. In these cases, we delete the SNP.
So the final number of SNPs generated might be less than the initially set number of SNPs.
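The filtering step can be pictured with the small sketch below. It assumes that a SNP is kept only when both the case MAF and the control MAF lie within [low, upp], with the bounds included; the documentation does not spell out these details, so treat them as assumptions. The MAF values and the names theta.ca, theta.co, and keep are made up for illustration.

```r
low <- 0.02
upp <- 0.5

# made-up MAFs for 4 SNPs (cases and controls)
theta.ca <- c(0.01, 0.20, 0.55, 0.30)
theta.co <- c(0.005, 0.10, 0.40, 0.25)

# assumption: a SNP is kept only if both MAFs fall inside [low, upp]
keep <- theta.ca >= low & theta.ca <= upp &
        theta.co >= low & theta.co <= upp

# number of SNPs remaining after filtering
print(sum(keep))
```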
An ExpressionSet object that stores the simulated genotype data.
Yan Xu <yanxu@uvic.ca>, Li Xing <sfulxing@gmail.com>, Jessica Su <rejas@channing.harvard.edu>, Xuekui Zhang <xuekui@uvic.ca>, Weiliang Qiu <Weiliang.Qiu@gmail.com>
Xu Y, Xing L, Su J, Zhang X, Qiu W. Model-based clustering for identifying disease-associated SNPs in case-control genome-wide association studies. Scientific Reports 9, Article number: 13686 (2019). https://www.nature.com/articles/s41598-019-50229-6
# pData() and fData() are provided by the Biobase package
library(Biobase)

# fix the random seed so the simulation is reproducible
set.seed(2)

# simulate genotypes for 500 SNPs in 100 cases and 100 controls
esSimDiffPriors <- simGenoFuncDiffPriors(
  nCases = 100,
  nControls = 100,
  nSNPs = 500,
  alpha.p.ca = 2, beta.p.ca = 3,
  alpha.p.co = 2, beta.p.co = 8, pi.p = 0.1,
  alpha0 = 2, beta0 = 5, pi0 = 0.8,
  alpha.n.ca = 2, beta.n.ca = 8,
  alpha.n.co = 2, beta.n.co = 3, pi.n = 0.1,
  low = 0.02, upp = 0.5, verbose = FALSE
)
print(esSimDiffPriors)

# subject-level (phenotype) data
pDat <- pData(esSimDiffPriors)
print(pDat[1:2, ])
print(table(pDat$memSubjs))

# SNP-level (feature) data
fDat <- fData(esSimDiffPriors)
print(fDat[1:2, ])
print(table(fDat$memGenes))
print(table(fDat$memGenes2))