The R package simone implements the inference of coexpression networks
based on partial correlation coefficients from either steady-state or
time-course transcriptomic data. Note that with both types of data this
package can deal with samples collected in different experimental
conditions that are therefore not identically distributed. In this
particular case, multiple but related graphs are inferred at once.
The underlying statistical tools fall within the framework of Gaussian graphical models (GGM). Basically, the algorithm searches for a latent clustering of the network to drive the selection of edges through an adaptive l1-penalization of the model likelihood.
The available inference methods for edges selection and/or estimation include
neighborhood selection, as in Meinshausen and Buhlmann (2006), steady-state data only;
the graphical Lasso, as in Banerjee et al. (2008) and Friedman et al. (2008), steady-state data only;
VAR(1) inference, as in Charbonnier, Chiquet and Ambroise (2010), time-course data only;
multitask learning, as in Chiquet, Grandvalet and Ambroise (preprint), both time-course and steady-state data.
All the listed methods are based on l1-norm penalization, with an additional grouping effect for multitask learning (including three variants: "intertwined", "group-Lasso" and "cooperative-Lasso").
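As a rough, unverified sketch of how a method might be picked at call time (the `type` argument name and its values are assumptions about the `simone` interface, not taken from the package documentation; see `?simone` for the actual arguments):

```r
library(simone)  # assumes the simone package is installed from CRAN

X <- matrix(rnorm(100 * 20), 100, 20)  # toy n x p expression matrix

# steady-state data: neighborhood-selection / graphical-Lasso criteria
res.ss <- simone(X, type = "steady-state")

# time-course data: VAR(1) inference
res.tc <- simone(X, type = "time-course")
```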
The penalization of each individual edge may be weighted according to
a latent clustering of the network, thus adapting the inference of the
network to a particular topology. The clustering algorithm is
performed by the mixer
package, based upon Daudin, Picard and
Robin (2008)'s Mixture Model for Random Graphs.
Since the choice of the network sparsity level remains an open issue in sparse Gaussian network inference, the algorithm provides a full path of estimates, starting from an empty network and adding edges as the penalty level progressively decreases. The Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC) are adapted to the GGM context in order to help choose one particular network along this path of solutions.
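A hypothetical way to extract one network from the path using these criteria; the `selection = "BIC"` argument below is an assumption about `getNetwork`'s interface (only the function name itself appears in the index):

```r
library(simone)  # assumes the simone package is installed from CRAN

X   <- matrix(rnorm(100 * 20), 100, 20)  # toy steady-state data
res <- simone(X)                         # full path of estimates

# pick the BIC-optimal network along the path (argument name assumed)
net.bic <- getNetwork(res, selection = "BIC")
plot(net.bic)
```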
Graphical tools are provided to summarize the results of a
simone
run and offer various representations for network
plotting.
Index:
cancer               Microarray data set for breast cancer
coNetwork            Random perturbations of a reference network
getNetwork           Network extraction from a SIMoNe run
plot.simone          Graphical representation of SIMoNe outputs
plot.simone.network  Graphical representation of a network
rNetwork             Simulation of (clustered) Gaussian networks
rTranscriptData      Simulation of artificial transcriptomic data
setOptions           Low-level options of the 'simone' function
simone               SIMoNe algorithm for network inference

Beyond the examples of this manual, a good starting point is to have a
look at the scripts available via demo(package="simone"). They make use
of simone, the main function in the package, in various contexts
(steady-state or time-course data, multiple-sample learning). All these
scripts also illustrate the use of the different plot functions.
demo(cancer_multitask)
example on the cancer data set of the multitask approach with a
cooperative-Lasso grouping effect across tasks. Patient responses to
chemotherapy (pCR or not pCR) split the data set into two distinct
samples. Network inference is performed jointly on these samples and a
graphical comparison is made between the two networks.
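The corresponding call might look like the following sketch; the `tasks` argument name and the structure of the `cancer` object (`expr`, `status`) are assumptions drawn from the index and demo description above, not verified against the package:

```r
library(simone)  # assumes the simone package is installed from CRAN
data(cancer)     # breast cancer data set listed in the index

# 'status' is assumed to hold the pCR / not-pCR patient response,
# which splits the expression matrix into two tasks
res <- simone(cancer$expr, tasks = cancer$status)

# graphical comparison of the two jointly inferred networks
plot(res)
```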
demo(cancer_pooled)
example on the cancer data set designed to compare network inference
with and without a clustering prior. A graphical comparison between the
two inferred networks (with/without clustering prior) illustrates how
inference is driven to a particular network topology when clustering is
relevant (here, an affiliation structure).
demo(check_glasso, echo=FALSE)
example that basically checks the consistency between the glasso package of Friedman et al. and the simone package when solving the l1-penalized Gaussian likelihood criterion suggested by Banerjee et al. in the n > p setting. In the n < p setting, simone provides sparser solutions than the glasso package, since the underlying Lasso problems are solved with an active-set algorithm instead of the shooting/pathwise coordinate algorithm.
demo(simone_multitask)
example of multitask learning on simulated, steady-state data: two
networks are generated by randomly perturbing a common ancestor with
the coNetwork function. These two networks are then used to generate
two multivariate Gaussian samples. Multitask learning is applied and a
simple illustration of the use of the setOptions function is given.
demo(simone_steadyState)
example of how to learn a single network from steady-state data. A
sample is first generated with the rNetwork and rTranscriptData
functions. Then the path of solutions of the neighborhood selection
method (the default for single-task steady-state data) is computed.
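The same pipeline can be sketched outside the demo; the argument names of `rNetwork` and `rTranscriptData` below (`p`, `pi`, `n`) are assumptions, and only the function names come from the index:

```r
library(simone)  # assumes the simone package is installed from CRAN

net     <- rNetwork(p = 30, pi = 30)      # ~30 nodes; edge budget assumed
samples <- rTranscriptData(n = 120, net)  # 120 steady-state observations
res     <- simone(samples$X)              # neighborhood selection path (default)
plot(res)
```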
demo(simone_timeCourse)
example of how to learn a single network from time-course data. A
sample is first generated with the rNetwork and rTranscriptData
functions, and the path of solutions of the VAR(1) inference method is
computed, with and without a clustering prior.
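A corresponding time-course sketch, again with assumed argument names (`type` and `clustering` are guesses at the interface, not documented values):

```r
library(simone)  # assumes the simone package is installed from CRAN

net     <- rNetwork(p = 30, pi = 30)      # argument names assumed
samples <- rTranscriptData(n = 120, net)

# VAR(1) path, without and with a clustering prior
res       <- simone(samples$X, type = "time-course")
res.clust <- simone(samples$X, type = "time-course", clustering = TRUE)
```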
Julien Chiquet julien.chiquet@genopole.cnrs.fr,
Gilles Grasseau gilles.grasseau@genopole.cnrs.fr,
Camille Charbonnier camille.charbonnier@genopole.cnrs.fr,
Christophe Ambroise christophe.ambroise@genopole.cnrs.fr.
J. Chiquet, Y. Grandvalet, and C. Ambroise (preprint). Inferring multiple graphical structures. Preprint available on arXiv. http://arxiv.org/abs/0912.4434.
C. Charbonnier, J. Chiquet, and C. Ambroise (2010). Weighted-Lasso for Structured Network Inference from Time Course Data. Statistical Applications in Genetics and Molecular Biology, vol. 9, iss. 1, article 15. http://www.bepress.com/sagmb/vol9/iss1/art15/
C. Ambroise, J. Chiquet, and C. Matias (2009). Inferring sparse Gaussian graphical models with latent structure. Electronic Journal of Statistics, vol. 3, pp. 205–238. http://dx.doi.org/10.1214/08-EJS314
O. Banerjee, L. El Ghaoui, A. d'Aspremont (2008). Model Selection Through Sparse Maximum Likelihood Estimation. Journal of Machine Learning Research, vol. 9, pp. 485–516. http://www.jmlr.org/papers/volume9/banerjee08a/banerjee08a.pdf
J. Friedman, T. Hastie and R. Tibshirani (2008). Sparse inverse covariance estimation with the graphical Lasso. Biostatistics, vol. 9(3), pp. 432–441. http://www-stat.stanford.edu/~tibs/ftp/graph.pdf
N. Meinshausen and P. Buhlmann (2006). High-dimensional graphs and variable selection with the Lasso. The Annals of Statistics, vol. 34(3), pp. 1436–1462. http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdfview_1&handle=euclid.aos/1152540754
J.-J. Daudin, F. Picard and S. Robin (2008). Mixture model for random graphs. Statistics and Computing, vol. 18(2), pp. 173–183. http://www.springerlink.com/content/9v6846342mu82x42/fulltext.pdf