Find differentially expressed genes after GaGa or Normal-Normal fit.

Description

Obtains a list of differentially expressed genes using the posterior probabilities from a GaGa, MiGaGa or Normal-Normal fit. For parametric==TRUE the procedure controls the Bayesian FDR below fdrmax. For parametric==FALSE it controls the estimated frequentist FDR (only available for GaGa).

Usage

findgenes(fit, x, groups, fdrmax=.05, parametric=TRUE, B=500)

Arguments

fit

Either a GaGa/MiGaGa fit (object of class gagafit, as returned by fitGG) or a Normal-Normal fit (object of class nnfit, as returned by fitNN).

x

ExpressionSet, exprSet, data frame or matrix containing the gene expression measurements used to fit the model.

groups

If x is of type ExpressionSet or exprSet, groups should be the name of the column in pData(x) with the groups that one wishes to compare. If x is a matrix or a data frame, groups should be a vector indicating the group to which each column in x corresponds.

fdrmax

Upper bound on FDR.

parametric

Set to TRUE to control the posterior expected FDR below fdrmax. Set to FALSE to estimate the frequentist FDR non-parametrically (only available when fit is of class gagafit).

B

Number of bootstrap samples used to estimate the FDR non-parametrically (ignored if parametric==TRUE).

Details

The Bayes rule that minimizes the posterior expected FNR subject to posterior expected FDR <= fdrmax declares as differentially expressed all genes whose posterior probability of being equally expressed is below a certain threshold. For parametric==TRUE the threshold is computed exactly, with the FDR defined in a Bayesian sense. For parametric==FALSE the FDR is defined in a frequentist sense.
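The thresholding rule for the two-hypothesis case can be illustrated with a minimal sketch in base R. It is not the code used inside findgenes: the function name bayes_fdr_rule and the input vector pp0 (posterior probabilities of equal expression, one per gene) are hypothetical names introduced here, and the sketch assumes the posterior expected FDR of a gene list is the average of pp0 over the genes in that list.

```r
# Hypothetical helper, not part of the gaga package.
# pp0: assumed vector with the posterior probability of equal expression per gene.
bayes_fdr_rule <- function(pp0, fdrmax = 0.05) {
  ord <- order(pp0)                          # genes most likely to be DE come first
  # Posterior expected FDR when the k genes with the smallest pp0 are declared DE
  efdr <- cumsum(pp0[ord]) / seq_along(pp0)
  # efdr is non-decreasing in k, so this is the largest list meeting the bound
  k <- sum(efdr <= fdrmax)
  de <- logical(length(pp0))
  if (k > 0) de[ord[seq_len(k)]] <- TRUE
  list(de = de, n.de = k, efdr = if (k > 0) efdr[k] else 0)
}

# Made-up posterior probabilities for six genes
pp0 <- c(0.01, 0.90, 0.02, 0.60, 0.03, 0.99)
res <- bayes_fdr_rule(pp0, fdrmax = 0.05)
res$de      # which genes are declared DE
res$efdr    # posterior expected FDR of the resulting list
```

Declaring the largest admissible list is what minimizes the expected FNR under the FDR constraint in this sketch; with tied probabilities a strict "probability < threshold" rule could select a slightly different set.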

Value

List with components:

truePos

Expected number of true positives.

d

Vector indicating the pattern that each gene is assigned to.

fdr

Frequentist estimated FDR that is closest to fdrmax.

fdrpar

Bayesian FDR. If parametric==TRUE, this is equal to fdrmax. If parametric==FALSE, it's the Bayesian FDR needed to achieve frequentist estimated FDR=fdrmax.

fdrest

Data frame with estimated frequentist FDR for each target Bayesian FDR

fnr

Bayesian FNR

power

Bayesian power as estimated by expected number of true positives divided by the expected number of differentially expressed genes

threshold

Optimal threshold for posterior probability of equal expression (genes with probability < threshold are declared DE)
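As a rough illustration of how the Bayesian summaries above relate to each other, the following base R sketch computes them for the two-hypothesis case from a made-up vector of posterior probabilities. The names pp0, de, true_pos, fdr, fnr and pow are hypothetical and are not fields produced by this code path in the package; the FNR is assumed here to be the expected number of false negatives divided by the number of genes not declared DE.

```r
# Made-up posterior probabilities of equal expression (assumed input)
pp0 <- c(0.01, 0.90, 0.02, 0.60, 0.03, 0.99)
threshold <- 0.05                 # illustrative threshold, not one computed by findgenes
de <- pp0 < threshold             # genes declared differentially expressed

# Expected number of true positives: sum of P(DE) over the declared genes
true_pos <- sum(1 - pp0[de])
# Posterior expected FDR: expected false positives / number of declared genes
fdr <- sum(pp0[de]) / max(sum(de), 1)
# Posterior expected FNR (assumed definition): expected false negatives / genes not declared
fnr <- sum(1 - pp0[!de]) / max(sum(!de), 1)
# Power: expected true positives / expected number of DE genes
pow <- true_pos / sum(1 - pp0)

round(c(truePos = true_pos, fdr = fdr, fnr = fnr, power = pow), 4)
```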

Author(s)

David Rossell

References

Rossell D. (2009) GaGa: a Parsimonious and Flexible Model for Differential Expression Analysis. Annals of Applied Statistics, 3, 1035-1051.

Yuan, M. and Kendziorski, C. (2006). A unified approach for simultaneous gene clustering and differential expression identification. Biometrics 62(4): 1089-1098.

Muller P, Parmigiani G, Robert C, Rousseau J. (2004) Optimal sample size for multiple testing: the case of gene expression microarrays. Journal of the American Statistical Association, 99(468): 990-1001.

See Also

fitGG, fitNN, parest

Examples

#Not run. Example from the help manual
#library(gaga)
#set.seed(10)
#n <- 100; m <- c(6,6)
#a0 <- 25.5; nu <- 0.109
#balpha <- 1.183; nualpha <- 1683
#probpat <- c(.95,.05)
#xsim <- simGG(n,m,p.de=probpat[2],a0,nu,balpha,nualpha)
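#
## groups and patterns are not defined in the original example. The two lines
## below are an assumption: 5 arrays per group remain after dropping columns 6 and 12.
#groups <- rep(c('group 1','group 2'), each=5)
#patterns <- matrix(c(0,0,0,1),2,2); colnames(patterns) <- c('group 1','group 2')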
#ggfit <- fitGG(xsim$x[,c(-6,-12)],groups,patterns=patterns,nclust=1)
#ggfit <- parest(ggfit,x=xsim$x[,c(-6,-12)],groups,burnin=100,alpha=.05)
#
#d <- findgenes(ggfit,xsim$x[,c(-6,-12)],groups,fdrmax=.05,parametric=TRUE)
#dtrue <- (xsim$l[,1]!=xsim$l[,2])
#table(d$d,dtrue)