The pulsar package

Description

Graphical model selection with the pulsar package

Details

This package provides methods to select a sparse, undirected graphical model by choosing a penalty parameter (lambda or λ) from an ordered list of candidate lambda values. Selection uses an implementation of the Stability Approach to Regularization Selection (StARS, see References), inspired by the huge package.
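As a rough illustration of the StARS selection rule (not pulsar's own code, and written in Python only to keep the sketch self-contained), the instability of each candidate lambda can be computed from the graphs estimated on the subsamples, and the densest graph whose instability stays under a threshold is chosen. The function names, the edge-set representation, and the threshold value of 0.1 are assumptions made for this sketch.

```python
from itertools import combinations


def stars_instability(graphs, p):
    """Average edge instability at one lambda.

    graphs: list of edge sets, one per subsample; each edge is a tuple (i, j) with i < j.
    p: number of nodes.
    """
    n = len(graphs)
    total = 0.0
    for edge in combinations(range(p), 2):
        # theta is the fraction of subsamples in which this edge was selected
        theta = sum(edge in g for g in graphs) / n
        # 2 * theta * (1 - theta) is zero when all subsamples agree on the edge
        total += 2.0 * theta * (1.0 - theta)
    return total / (p * (p - 1) / 2)


def stars_select(instability, beta=0.1):
    """Pick an index along the lambda path.

    instability: one value per lambda, ordered from the largest lambda
    (sparsest graph) to the smallest (assumed ordering for this sketch).
    Returns the last index whose running maximum is at or below beta,
    or None if no lambda qualifies.
    """
    best = None
    running_max = 0.0
    for k, d in enumerate(instability):
        # the running maximum makes the criterion monotone along the path
        running_max = max(running_max, d)
        if running_max <= beta:
            best = k
    return best
```

In this sketch, an instability curve that rises above the threshold and later dips back below it does not reopen the selection, because the running maximum is used rather than the raw values.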

However, pulsar differs in several major ways from other R packages for graphical model estimation and selection (glasso, huge, QUIC, XMRF, clime, flare, etc.). The underlying graphical model is computed by passing a function as an argument to pulsar. Thus, any algorithm for penalized graphical models can be used in this framework (see pulsar-function for more details), including those from the packages above. pulsar brings computational experiments under one roof by separating subsampling and the calculation of summary criteria from the user-specified core model. The typical workflow in pulsar is to perform subsampling first (via the pulsar function) and then refit the model on the full dataset using refit.
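The separation between the subsampling driver and the user-specified core model can be sketched as follows. This is a language-agnostic illustration in Python, not pulsar's R interface: `run_subsamples`, `threshold_model`, the 0.8 subsampling ratio, and the convention that the core model returns one edge set per lambda are all hypothetical choices for this example. The toy core model simply thresholds sample covariances and stands in for a real penalized estimator.

```python
import random
from itertools import combinations


def threshold_model(rows, lambdas):
    """Toy core model (hypothetical stand-in for a penalized estimator):
    keep edge (i, j) when the absolute sample covariance exceeds lambda."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = {}
    for i, j in combinations(range(p), 2):
        cov[(i, j)] = sum(
            (r[i] - means[i]) * (r[j] - means[j]) for r in rows
        ) / (n - 1)
    # one edge set per lambda, in the same order as `lambdas`
    return [{e for e, c in cov.items() if abs(c) > lam} for lam in lambdas]


def run_subsamples(rows, fit, lambdas, n_subsamples=20, ratio=0.8, seed=0):
    """Call the user-supplied `fit` on random row subsets drawn without replacement.

    The driver knows nothing about how `fit` estimates a graph; it only
    requires that `fit(rows, lambdas)` returns one edge set per lambda.
    """
    rng = random.Random(seed)
    size = max(2, int(ratio * len(rows)))
    results = []
    for _ in range(n_subsamples):
        subset = rng.sample(rows, size)
        results.append(fit(subset, lambdas))
    return results  # results[s][k] is the edge set for subsample s at lambdas[k]
```

Any function with the same calling convention could replace `threshold_model` here without changing the driver, which is the design point the paragraph above describes.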

Previous StARS implementations can be inefficient for large graphs or when many subsamples are required. With Bounded StARS (B-StARS), pulsar can compute upper and lower bounds on the regularization path for the StARS criterion after only 2 subsamples. This makes it possible to skip lambda values that are far from the desired StARS regularization parameter, which reduces computation time for the remaining N-2 subsamples.

We also implement additional subsampling-based graph summary criteria, which can be used for more informed model selection. For example, we have shown that induced subgraph (graphlet) stability (G-StARS) improves empirical performance over StARS, and other criteria are offered as well.

Subsampling amounts to running the specified core model as N independent computations. Using the BatchJobs framework, we provide a simple wrapper, batch.pulsar, for running pulsar in embarrassingly parallel mode in an HPC environment. Summary criteria are computed using a Map/Reduce strategy, which lowers the memory footprint for large models.
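One way to read the Map/Reduce strategy is that each subsample result is reduced into a running summary as soon as it is available, so the full set of N estimated graphs never has to sit in memory at once. The sketch below illustrates that idea in Python for edge selection frequencies; it is an assumption about the general pattern, not a description of how batch.pulsar is implemented, and all names are hypothetical.

```python
from collections import Counter


def map_counts(subsample_path):
    """Map step: turn one subsample's path of edge sets into per-lambda counters."""
    return [Counter(edges) for edges in subsample_path]


def reduce_counts(acc, counts):
    """Reduce step: add per-lambda counters position by position."""
    return [a + c for a, c in zip(acc, counts)]


def edge_frequencies(paths):
    """Stream over subsample results and return per-lambda edge frequencies.

    paths: any iterable (a generator works) yielding, for each subsample,
    a list of edge sets with one entry per lambda.
    """
    n = 0
    total = None
    for path in paths:
        counts = map_counts(path)
        # only the running totals are kept; each path can be discarded afterwards
        total = counts if total is None else reduce_counts(total, counts)
        n += 1
    if n == 0:
        raise ValueError("no subsample results supplied")
    return [{edge: c / n for edge, c in t.items()} for t in total]
```

Because `edge_frequencies` accepts a generator, the subsample results could be read one at a time from disk or from finished batch jobs in this sketch.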

References

Müller, C. L., Bonneau, R. A., & Kurtz, Z. D. (2016). Generalized Stability Approach for Regularized Graphical Models. arXiv: http://arxiv.org/abs/1605.07072.

See Also

pulsar-function, pulsar, batch.pulsar