Class of Families to be used in the EBarrays package


Objects used as the family argument of the emfit function.

The package contains three functions that create such objects for the three most commonly used families, Gamma-Gamma, Lognormal-Normal and Lognormal-Normal with modified variances. Users may create their own families as well.




The emfit function can fit models corresponding to several different Bayesian conjugate families. The family is chosen through the family argument, which ultimately has to be an object of formal class “ebarraysFamily” with specific slots that determine the behavior of the ‘family’.

Users who are content with the predefined GG, LNN and LNNMV models need no details beyond those given in the documentation for emfit. If you wish to create your own families, read on.


Objects of class “ebarraysFamily” for the three predefined families Gamma-Gamma, Lognormal-Normal and Lognormal-Normal with modified variances.

Objects from the Class

Objects of class “ebarraysFamily” can be created by calls of the form new("ebarraysFamily", ...). Predefined objects corresponding to the GG, LNN and LNNMV models can be created by eb.createFamilyGG(), eb.createFamilyLNN() and eb.createFamilyLNNMV(). The same effect is achieved by coercing from the strings "GG", "LNN" and "LNNMV" by as("GG", "ebarraysFamily"), as("LNN", "ebarraysFamily") and as("LNNMV", "ebarraysFamily").
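The two routes to a predefined family described above can be sketched as follows. This is a minimal illustration that assumes the Bioconductor package EBarrays is installed (the guard simply skips the example otherwise); the variable names fam_ctor and fam_coerce are hypothetical. It uses slotNames() from the methods package to list the slots rather than assuming their names.

```r
# Assumption: the Bioconductor package EBarrays is installed.
# The guard skips the example when it is not available.
if (requireNamespace("EBarrays", quietly = TRUE)) {
  library(EBarrays)

  fam_ctor   <- eb.createFamilyLNN()          # via the creator function
  fam_coerce <- as("LNN", "ebarraysFamily")   # via coercion from the short name

  # Both routes should yield an object of the formal class.
  print(is(fam_ctor, "ebarraysFamily"))
  print(is(fam_coerce, "ebarraysFamily"))

  # The class extends "character"; the data part holds the short name.
  print(fam_coerce@.Data)

  # List the slots that a user-defined family would have to supply.
  print(slotNames("ebarraysFamily"))
}
```

Inspecting a predefined object in this way is a reasonable starting point before writing a call of the form new("ebarraysFamily", ...) for a family of your own.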


Slots

An object of class “ebarraysFamily” extends the class "character" (representing a shorthand name for the class) and should have the following slots (for more details see the source code):


description: A short character string describing the family.

link: Function that maps the user-visible parameters to the parametrization used in the optimization step (e.g. log(sigma^2) for LNN). This lets the user think in terms of a familiar parametrization that is not necessarily the best one to optimize over.

invlink: Inverse of the link function.

thetaInit: Function of a single argument data (a matrix containing raw expression values) that calculates initial estimates of the parameters and returns them as a numeric vector (in the parametrization used for optimization).

f0: Function taking arguments theta and a list called args. f0 calculates the negative log likelihood at the given parameter value theta (again, in the parametrization used for optimization). It is called from emfit. When it is called, only genes with positive intensities across all samples are used.

f0.pp: Essentially the same as f0, except that terms common to the numerator and denominator when calculating posterior odds may be removed. It is called from postprob.

f0.arglist: Function that takes arguments data, patterns (of class “ebarraysPatterns”) and groupid (for the LNNMV family only) and returns a list with two components, common.args and pattern.args. common.args is a list of arguments to f0 that do not change from one pattern to another, whereas pattern.args[[i]][[j]] is a similar list of arguments, but specific to the columns in pattern[[i]][[j]]. Eventually, the two components are combined for each pattern and used as the args argument to f0.

logDensity: Function of two arguments x (data vector, containing log expressions) and theta (parameters in the user-visible parametrization). Returns the log marginal density of the natural log of intensity for the corresponding theoretical model. Used in plotMarginal.

lower.bound: Vector of lower bounds for the argument theta of f0. Used in optim.

upper.bound: Vector of upper bounds for the argument theta of f0.
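How the pieces in this list fit together (a link function and its inverse, an initial-estimate function, a negative log likelihood taking theta and an args list, and bounds passed to optim) can be sketched in base R alone. The following is a toy stand-in, not the EBarrays implementation: it uses a plain normal model rather than any of the package's families, and every name in it (toy_link, toy_invlink, toy_theta_init, toy_f0) is hypothetical.

```r
# Toy model: i.i.d. normal data with unknown mean and variance.
# User-visible parameters:  c(mu, sigma2)
# Optimization scale:       c(mu, log(sigma2)), so the variance stays positive.
toy_link    <- function(theta) c(theta[1], log(theta[2]))
toy_invlink <- function(theta) c(theta[1], exp(theta[2]))

# Initial estimates from a data matrix, returned on the optimization scale.
toy_theta_init <- function(data) {
  toy_link(c(mean(data), var(as.vector(data))))
}

# Negative log likelihood at theta (optimization scale).
# Everything else the function needs arrives in the list 'args'.
toy_f0 <- function(theta, args) {
  par <- toy_invlink(theta)
  -sum(dnorm(args$x, mean = par[1], sd = sqrt(par[2]), log = TRUE))
}

set.seed(1)
x <- matrix(rnorm(200, mean = 5, sd = 2), nrow = 20)

# Box-constrained optimization; the bounds apply on the optimization scale.
fit <- optim(par = toy_theta_init(x), fn = toy_f0,
             args = list(x = as.vector(x)),
             method = "L-BFGS-B",
             lower = c(-Inf, -10), upper = c(Inf, 10))

# Map the optimum back to the user-visible parametrization.
est <- toy_invlink(fit$par)
print(est)
```

Optimizing over log(sigma2) instead of sigma2 means the optimizer can never propose a negative variance, which is the kind of convenience the link function is described as providing.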


Author(s)

Ming Yuan, Ping Wang, Deepayan Sarkar, Michael Newton, and Christina Kendziorski


References

Newton, M.A., Kendziorski, C.M., Richmond, C.S., Blattner, F.R. (2001). On differential variability of expression ratios: Improving statistical inference about gene expression changes from microarray data. Journal of Computational Biology 8:37-52.

Kendziorski, C.M., Newton, M.A., Lan, H., Gould, M.N. (2003). On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine 22:3899-3914.

Newton, M.A. and Kendziorski, C.M. (2003). Parametric empirical Bayes methods for microarrays. In The Analysis of Gene Expression Data: Methods and Software. Eds. G. Parmigiani, E.S. Garrett, R. Irizarry and S.L. Zeger. New York: Springer Verlag.

Newton, M.A., Noueiry, A., Sarkar, D., and Ahlquist, P. (2004). Detecting differential gene expression with a semiparametric hierarchical mixture model. Biostatistics 5: 155-176.

Yuan, M. and Kendziorski, C. (2006). A unified approach for simultaneous gene clustering and differential expression identification. Biometrics 62(4): 1089-1098.

See Also

emfit, optim, plotMarginal

