findgenes: Find differentially expressed genes after GaGa or...
In gaga: GaGa hierarchical model for high-throughput data analysis


Obtains a list of differentially expressed genes using the posterior probabilities from a GaGa, MiGaGa or Normal-Normal fit. For parametric==TRUE the procedure controls the Bayesian FDR below fdrmax. For parametric==FALSE it controls the estimated frequentist FDR (only available for GaGa).

findgenes(fit, x, groups, fdrmax=.05, parametric=TRUE, B=500)

`fit`	Either GaGa/MiGaGa fit (object of class `gagafit`, as returned by `fitGG`) or Normal-Normal fit (class `nnfit`, as returned by `fitNN`).
`x`	`ExpressionSet`, `exprSet`, data frame or matrix containing the gene expression measurements used to fit the model.
`groups`	If `x` is of type `ExpressionSet` or `exprSet`, `groups` should be the name of the column in `pData(x)` containing the groups that one wishes to compare. If `x` is a matrix or a data frame, `groups` should be a vector indicating the group to which each column of `x` corresponds.
`fdrmax`	Upper bound on FDR.

`parametric`	Set to `TRUE` to control the posterior expected FDR below `fdrmax`. Set to `FALSE` to estimate the frequentist FDR non-parametrically (only available when fit is of class `gagafit`).
`B`	Number of bootstrap samples used to estimate the FDR non-parametrically (ignored if `parametric==TRUE`).
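As a small illustration of the `groups` argument when `x` is a matrix, the sketch below builds a made-up expression matrix and a matching label vector using base R only. The matrix, its dimension names and the group labels are hypothetical and serve only to show that `groups` carries one entry per column of `x`, in column order.

```r
# Hypothetical expression matrix: 4 genes (rows) by 6 arrays (columns).
set.seed(1)
x <- matrix(rgamma(24, shape = 2, rate = 1), nrow = 4,
            dimnames = list(paste0("gene", 1:4), paste0("array", 1:6)))

# One group label per column of x, in column order (labels are made up).
groups <- rep(c("group 1", "group 2"), each = 3)

# The label vector must have exactly one entry per column of x.
stopifnot(length(groups) == ncol(x))

# Number of arrays assigned to each group.
print(table(groups))
```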

The Bayes rule that minimizes the posterior expected FNR subject to posterior expected FDR <= fdrmax declares as differentially expressed all genes whose posterior probability of being equally expressed is below a certain threshold. For parametric==TRUE the threshold is computed exactly, with the FDR defined in a Bayesian sense. For parametric==FALSE the FDR is defined in a frequentist sense.
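The thresholding idea can be sketched in base R. This is only an illustration under stated assumptions, not the code that `findgenes` runs: the vector `v` of posterior probabilities of equal expression is made up, and the posterior expected FDR of a gene list is taken to be the mean of `v` over the genes in the list.

```r
# Hypothetical posterior probabilities of equal expression for 8 genes.
v <- c(0.001, 0.01, 0.02, 0.20, 0.60, 0.90, 0.95, 0.99)
fdrmax <- 0.05

# Rank genes from most to least likely to be differentially expressed.
ord <- order(v)

# Posterior expected FDR of the list made of the k top-ranked genes:
# the average probability of equal expression among the genes in the list.
bayes_fdr <- cumsum(v[ord]) / seq_along(v)

# bayes_fdr is non-decreasing, so counting the entries at or below fdrmax
# gives the size of the largest list that respects the bound.
k <- sum(bayes_fdr <= fdrmax)

# Flag the selected genes in the original gene order.
de <- rep(FALSE, length(v))
de[ord[seq_len(k)]] <- TRUE

print(data.frame(v = v, de = de))
```

Because the genes are ranked by `v`, choosing the largest admissible list is equivalent to choosing a cutoff on the posterior probability of equal expression.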

List with components:

`truePos`	Expected number of true positives.
`d`	Vector indicating the pattern that each gene is assigned to.
`fdr`	Frequentist estimated FDR that is closest to fdrmax.
`fdrpar`	Bayesian FDR. If `parametric==TRUE`, this is equal to `fdrmax`. If `parametric==FALSE`, it's the Bayesian FDR needed to achieve frequentist estimated FDR=`fdrmax`.
`fdrest`	Data frame with estimated frequentist FDR for each target Bayesian FDR
`fnr`	Bayesian FNR
`power`	Bayesian power as estimated by expected number of true positives divided by the expected number of differentially expressed genes
`threshold`	Optimal threshold for posterior probability of equal expression (genes with probability < `threshold` are declared DE)
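To make the relationship between some of these components concrete, the following base R sketch computes analogous quantities from a made-up vector of posterior probabilities of equal expression `v` and a made-up `threshold`. It is an assumption-laden illustration rather than the package's implementation; in particular, the FNR formula (expected false negatives divided by the number of genes not declared DE) is an assumption, since the text above only spells out how power is estimated.

```r
# Hypothetical posterior probabilities of equal expression and threshold.
v <- c(0.001, 0.01, 0.02, 0.20, 0.60, 0.90, 0.95, 0.99)
threshold <- 0.1

# Genes with probability of equal expression below the threshold are declared DE.
de <- v < threshold

# Expected number of true positives among the declared genes.
true_pos <- sum(1 - v[de])

# Expected number of differentially expressed genes overall.
expected_de <- sum(1 - v)

# Power: expected true positives over expected number of DE genes.
power <- true_pos / expected_de

# Bayesian FDR of the list: mean probability of equal expression among declared genes.
fdr <- sum(v[de]) / max(sum(de), 1)

# Assumed FNR definition: expected false negatives over genes not declared DE.
fnr <- sum(1 - v[!de]) / max(sum(!de), 1)

print(c(truePos = true_pos, power = power, fdr = fdr, fnr = fnr))
```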

David Rossell

Rossell D. (2009) GaGa: a Parsimonious and Flexible Model for Differential Expression Analysis. Annals of Applied Statistics, 3, 1035-1051.

Yuan, M. and Kendziorski, C. (2006). A unified approach for simultaneous gene clustering and differential expression identification. Biometrics 62(4): 1089-1098.

Muller P, Parmigiani G, Robert C, Rousseau J. (2004) Journal of the American Statistical Association, 99(468): 990-1001.

fitGG, fitNN, parest

#Not run. Example from the help manual
#library(gaga)
#set.seed(10)
#n <- 100; m <- c(6,6)
#a0 <- 25.5; nu <- 0.109
#balpha <- 1.183; nualpha <- 1683
#probpat <- c(.95,.05)
#xsim <- simGG(n,m,p.de=probpat[2],a0,nu,balpha,nualpha)
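#
## 'groups' and 'patterns' are not defined in the original example.
## The three definitions below are an assumption: 5 arrays per group remain
## after dropping columns 6 and 12, and the two patterns are "equal" and "different".
#groups <- rep(c('group 1','group 2'),each=5)
#patterns <- matrix(c(0,0,0,1),2,2)
#colnames(patterns) <- c('group 1','group 2')
#ggfit <- fitGG(xsim$x[,c(-6,-12)],groups,patterns=patterns,nclust=1)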
#ggfit <- parest(ggfit,x=xsim$x[,c(-6,-12)],groups,burnin=100,alpha=.05)
#
#d <- findgenes(ggfit,xsim$x[,c(-6,-12)],groups,fdrmax=.05,parametric=TRUE)
#dtrue <- (xsim$l[,1]!=xsim$l[,2])
#table(d$d,dtrue)