nebula.chisq.train: Nonparametric empirical Bayes classifier using latent...

Description Usage Arguments Value Examples

Description

Assumes that chi-square test statistics for each SNP are available from another study. Treats the true control and case minor allele frequencies and the chi-square non-centrality parameters as random triples from a bivariate prior distribution G, and estimates the optimal Bayesian classifier given G. Nonparametric maximum likelihood is used as a plug-in estimator for G.

Usage

1
2
nebula.chisq.train(pi0, pi1, n0, n1, T, d = 25, maxit = 200,
  tol = 1e-04, verbose = FALSE)

Arguments

pi0, pi1

p x 1 vectors of control and case minor allele frequencies, respectively; IMPORTANT: must be relative to the same allele in both cases and controls

n0, n1

number of controls and number of cases, respectively

T

p x 1 vector of chi-square test statistics

d

if a single number, G is estimated on a d x d x d grid; if a three-component vector (d0,d1,dt), G is estimated on a d0 x d1 x dt grid

maxit

maximum number of EM iterations

tol

error tolerance

verbose

TRUE to print the error attained by each EM iteration

Value

Pi0

grid points for estimating the distribution of the control minor allele frequencies

Pi1

grid points for estimating the distribution of the case minor allele frequencies

Lam

grid points for estimating the distribution of the non-centrality parameter

D0

conditional density matrix for controls

D1

conditional density matrix for cases

DT

conditional density matrix for test statistics

g

estimated mixing probability mass function

P

proportion of cases

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
p <- 1000; ## number of snps
I <- rep(0,p); I[1:10] <- 1; ## which snps are causal
set.seed(1); pi0 <- runif(p,0.1,0.5); ## control minor allele frequencies
set.seed(1); ors <- runif(sum(I),-1,1); ## odds ratios
pi1 <- pi0;
pi1[I==1] <- expit(ors+logit(pi0[I==1]));
set.seed(1); lam <- rep(0,p); lam[I==1] <- rchisq(sum(I==1),1,25); ## ncps
## training data
n0 <- 100; ## number of controls
X0 <- t(replicate(n0,rbinom(p,2,pi0))); ## controls
n1 <- 50; ## number of cases
X1 <- t(replicate(n1,rbinom(p,2,pi1))); ## cases
T <- rchisq(p,1,lam); ## chi-square statistics
nebula <- nebula.chisq.train(colMeans(X0)/2,colMeans(X1)/2,n0,n1,T,d=c(20,25,30));
par(mfrow=c(1,3));
contour(nebula$Pi0,nebula$Pi1,apply(nebula$g,c(1,2),sum));
points(pi0,pi1);
contour(nebula$Pi0,nebula$Lam,apply(nebula$g,c(1,3),sum));
points(pi0,lam);
contour(nebula$Pi1,nebula$Lam,apply(nebula$g,c(2,3),sum));
points(pi1,lam);

sdzhao/ssa documentation built on May 18, 2019, 2:36 p.m.