emfit: Implements EM algorithm for gene expression mixture model
In EBarrays: Unified Approach for Simultaneous Gene Clustering and Differential Expression Identification


Implements the EM algorithm for the gene expression mixture model.

emfit(data,
      family,
      hypotheses,
      ...)

`data`	a matrix
`family`	an object of class “ebarraysFamily” or a character string which can be coerced to one. Currently, only the character strings "GG", "LNN", and "LNNMV" are valid. For "LNNMV", a `groupid` is required (see below). Other families can be supplied by constructing them explicitly.
`hypotheses`	an object of class “ebarraysPatterns” representing the hypotheses of interest. Such patterns can be generated by the function `ebPatterns`.
`...`	other arguments. These include:
`cluster`	if `type`=1, a vector specifying the fixed cluster membership for each gene; if `type`=2, the number of clusters to be fitted.
`type`	if `type`=1, the cluster membership is fixed as given in `cluster`; if `type`=2, the data are fitted with a fixed number of clusters.
`criterion`	only used when `type`=2 and `cluster` contains more than one integer. All numbers of clusters provided in `cluster` are fitted, and the one that minimizes `criterion` is returned. Possible values are "BIC", "AIC", and "HQ".
`cluster.init`	only used when `type`=2. Specifies the initial cluster membership.
`num.iter`	number of EM iterations.
`verbose`	logical or numeric (0, 1, 2) indicating the desired level of information printed for the user.
`optim.control`	list passed unchanged to `optim` for finer control.
`groupid`	an integer vector indicating which group each sample belongs to; required for the “LNNMV” model. It does not depend on `hypotheses`.

Many of the arguments are optional, so a full call looks more like this:

emfit(data, family, hypotheses, cluster, type=2, criterion="BIC", cluster.init = NULL, num.iter = 20, verbose = getOption("verbose"), optim.control = list(), ...)
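As a hedged illustration of the optional arguments, the sketch below first lets `emfit` compare several cluster counts under `type = 2` and then fits the "LNNMV" family with a `groupid`. It assumes the EBarrays package (and its Biobase dependency) is installed. The candidate cluster counts `1:3`, the reduced `num.iter`, and the 13/13 sample split are illustrative choices only, not recommendations from the package authors.

```r
library(EBarrays)  # assumed to be installed from Bioconductor

data(sample.ExpressionSet)            # example data shipped with Biobase
eset <- exprs(sample.ExpressionSet)   # expression matrix with 26 samples

## Two hypotheses: all samples equal, or first 13 samples vs. last 13
patterns <- ebPatterns(c(paste(rep(1, 26), collapse = " "),
                         paste(rep(1:2, each = 13), collapse = " ")))

## type = 2 with several cluster counts: each count is fitted and the
## fit minimizing the chosen criterion is returned (illustrative values)
fit.clust <- emfit(data = eset, family = "GG", hypotheses = patterns,
                   cluster = 1:3, type = 2, criterion = "BIC",
                   num.iter = 10)
show(fit.clust)

## "LNNMV" additionally requires groupid, one integer per sample
## (assumption for this sketch: groups mirror the 13/13 split above)
groupid <- rep(1:2, each = 13)
fit.mv <- emfit(data = eset, family = "LNNMV", hypotheses = patterns,
                groupid = groupid, num.iter = 10)
show(fit.mv)
```

Passing a single integer as `cluster` skips the comparison and fits exactly that number of clusters, in which case `criterion` is not used.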

An object of class “ebarraysEMfit”, which can be summarized by `show()` and used to generate posterior probabilities with `postprob`.

Ming Yuan, Ping Wang, Deepayan Sarkar, Michael Newton, and Christina Kendziorski

Newton, M.A., Kendziorski, C.M., Richmond, C.S., Blattner, F.R. (2001). On differential variability of expression ratios: Improving statistical inference about gene expression changes from microarray data. Journal of Computational Biology 8:37-52.

Kendziorski, C.M., Newton, M.A., Lan, H., Gould, M.N. (2003). On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine 22:3899-3914.

Newton, M.A. and Kendziorski, C.M. (2003). Parametric empirical Bayes methods for microarrays. In The Analysis of Gene Expression Data: Methods and Software, eds. G. Parmigiani, E.S. Garrett, R. Irizarry and S.L. Zeger. New York: Springer Verlag.

Newton, M.A., Noueiry, A., Sarkar, D., and Ahlquist, P. (2004). Detecting differential gene expression with a semiparametric hierarchical mixture model. Biostatistics 5: 155-176.

Yuan, M. and Kendziorski, C. (2006). A unified approach for simultaneous gene clustering and differential expression identification. Biometrics 62(4): 1089-1098.

ebPatterns, ebarraysFamily-class

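library(EBarrays)
data(sample.ExpressionSet) ## from Biobase
eset <- exprs(sample.ExpressionSet)
## Two hypotheses over the 26 samples: all equal, or first 13 vs. last 13
patterns <- ebPatterns(c("1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1",
                         "1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2"))
## Fit the Gamma-Gamma ("GG") family by EM, printing progress information
gg.fit <- emfit(data = eset, family = "GG", hypotheses = patterns, verbose = TRUE)
show(gg.fit)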
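A fitted object is typically passed on to `postprob` to obtain posterior probabilities for each hypothesis. The following is a minimal sketch of that step, assuming EBarrays is installed and that the returned list carries the per-gene probabilities in a `pattern` component. The 0.5 cutoff and the variable names `pp` and `n.de` are illustrative assumptions, not part of the package.

```r
library(EBarrays)  # assumed to be installed from Bioconductor

data(sample.ExpressionSet)            # example data shipped with Biobase
eset <- exprs(sample.ExpressionSet)

## Same two hypotheses as in the example above
patterns <- ebPatterns(c(paste(rep(1, 26), collapse = " "),
                         paste(rep(1:2, each = 13), collapse = " ")))

gg.fit <- emfit(data = eset, family = "GG", hypotheses = patterns)

## Posterior probability of each hypothesis for each gene;
## column 2 corresponds to the second (two-group) pattern
pp <- postprob(gg.fit, eset)$pattern
head(pp)

## Count genes favoring the two-group pattern at an illustrative 0.5 cutoff
n.de <- sum(pp[, 2] > 0.5)
n.de
```

The cutoff controls how strict the call is: a higher threshold on the second column yields a shorter list of genes with stronger evidence for the two-group pattern.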