MRFcov | R Documentation |
This function is the workhorse of the MRFcov package, running
separate penalized regressions for each node to estimate the parameters of
Markov Random Field (MRF) graphs. Covariates can be included
(a class of models known as Conditional Random Fields; CRF) to estimate
how interactions between nodes vary across covariate magnitudes.
MRFcov(
data,
symmetrise,
prep_covariates,
n_nodes,
n_cores,
n_covariates,
family,
bootstrap = FALSE,
progress_bar = FALSE
)
data |
A |
symmetrise |
The method to use for symmetrising corresponding parameter estimates
(which are taken from separate regressions). Options are |
prep_covariates |
Logical. If |
n_nodes |
Positive integer. The index of the last column in |
n_cores |
Positive integer. The number of cores to spread the job across using
|
n_covariates |
Positive integer. The number of covariates in |
family |
The response type. Responses can be quantitative continuous ( |
bootstrap |
Logical. Used by |
progress_bar |
Logical. Progress bar in pbapply is used if |
Separate penalized regressions are used to approximate
MRF parameters, where the regression for node j includes an
intercept and coefficients for the abundance (families gaussian or poisson)
or presence-absence (family binomial) of all other nodes (/j) in data.
If covariates are included, coefficients are also estimated for the effect
of the covariate on j, and for the effects of the covariate on interactions
between j and all other nodes (/j). Note that interaction coefficients must
be estimated between variables that are on roughly the same scale, as the
resulting parameter estimates are unified into a Markov Random Field using
the specified symmetrise function.
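The predictor set described above can be sketched in base R. This is only an illustration under stated assumptions: the toy data, the column names (sp1, sp2, sp3, temp) and the "covariate_node" naming of the product columns are hypothetical, and the sketch does not claim to reproduce how the package builds its design matrix internally.

```r
# Hypothetical toy data: 3 binary nodes and 1 continuous covariate.
set.seed(1)
n <- 20
toy <- data.frame(
  sp1 = rbinom(n, 1, 0.5),
  sp2 = rbinom(n, 1, 0.5),
  sp3 = rbinom(n, 1, 0.5),
  temp = rnorm(n)
)
n_nodes <- 3
j <- 1  # node treated as the response in this regression

node_names <- names(toy)[seq_len(n_nodes)]
cov_names <- names(toy)[-seq_len(n_nodes)]
other_nodes <- setdiff(node_names, node_names[j])

# Main effects: every other node (/j) plus every covariate.
X_main <- as.matrix(toy[, c(other_nodes, cov_names), drop = FALSE])

# Covariate-by-node products: these columns carry the effect of the
# covariate on the interaction between node j and each other node.
X_int <- do.call(cbind, lapply(cov_names, function(cv) {
  prod_cols <- toy[[cv]] * as.matrix(toy[, other_nodes, drop = FALSE])
  colnames(prod_cols) <- paste(cv, other_nodes, sep = "_")
  prod_cols
}))

X_j <- cbind(X_main, X_int)   # predictors for node j
y_j <- toy[[node_names[j]]]   # response for node j

colnames(X_j)
# "sp2" "sp3" "temp" "temp_sp2" "temp_sp3"
```

The intercept is left out of X_j because regression functions normally add it themselves. Repeating the construction for every j gives one regression per node, which is why each pairwise interaction ends up with two estimates that later need to be symmetrised.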
Counts for poisson variables, which are often not on the same scale,
will therefore be normalised with a nonparanormal transformation
x = qnorm(rank(log2(x + 0.01)) / (length(x) + 1)). These transformed counts
will be used in a (family = "gaussian") model and their respective raw
distribution parameters returned so that coefficients can be back-transformed
for interpretation (this back-transformation is performed automatically by
other functions, including predict_MRF and cv_MRF_diag). Gaussian variables
are not automatically transformed, so if they cover quite different ranges
and scales, it is recommended to scale them prior to fitting models.
For more information on this process, use vignette("Gaussian_Poisson_CRFs").
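To see what the transformation quoted above does, it can be applied by hand to simulated counts. The sketch below uses base R only; the helper name npn_transform and the simulated data are made up for illustration and are not part of the package.

```r
# Hypothetical count data for two nodes on very different scales.
set.seed(42)
counts_a <- rpois(50, lambda = 2)
counts_b <- rpois(50, lambda = 200)

# The nonparanormal transformation from the text, wrapped in a helper
# (the name `npn_transform` is invented for this sketch).
npn_transform <- function(x) {
  qnorm(rank(log2(x + 0.01)) / (length(x) + 1))
}

z_a <- npn_transform(counts_a)
z_b <- npn_transform(counts_b)

# Raw counts differ by roughly two orders of magnitude ...
range(counts_a)
range(counts_b)

# ... whereas the transformed scores share a comparable, bounded scale.
range(z_a)
range(z_b)

# The mapping is rank-based, so larger counts never get smaller scores.
all(diff(z_a[order(counts_a)]) >= 0)
```

Because the scores depend only on ranks, tied counts receive identical values, and the most extreme score is fixed by the sample size rather than by the size of the raw counts.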
Note that since the number of parameters to estimate in each node-wise regression
quickly increases with increasing numbers of nodes and covariates,
LASSO penalization is used to regularize regressions. This is done by
minimising the cross-validated mean error for each node separately using
cv.glmnet. In this way, we maximise the log-likelihood of each node
separately before unifying the nodes into a graph.
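The node-wise procedure described above can be sketched directly with glmnet. This is a simplified illustration, not the package's internal code: the toy data are simulated, no covariates are included, the number of folds is arbitrary, and taking the mean of the two estimates for each pair is an assumption made here to show the unification step.

```r
library(glmnet)  # provides cv.glmnet()

# Hypothetical toy data: 4 binary nodes, 200 observations.
set.seed(123)
n <- 200
n_nodes <- 4
toy <- matrix(rbinom(n * n_nodes, 1, 0.5), nrow = n,
              dimnames = list(NULL, paste0("sp", 1:n_nodes)))
# Induce a positive association between sp1 and sp2.
flip <- rbinom(n, 1, 0.8) == 1
toy[flip, "sp2"] <- toy[flip, "sp1"]
storage.mode(toy) <- "double"  # glmnet expects a numeric matrix

# One LASSO-penalized logistic regression per node (alpha = 1 is the LASSO).
coefs <- matrix(0, n_nodes, n_nodes,
                dimnames = list(colnames(toy), colnames(toy)))
intercepts <- numeric(n_nodes)
for (j in seq_len(n_nodes)) {
  cv_fit <- cv.glmnet(x = toy[, -j], y = toy[, j],
                      family = "binomial", alpha = 1, nfolds = 5)
  # Coefficients at the penalty that minimises cross-validated error.
  b <- as.matrix(coef(cv_fit, s = "lambda.min"))[, 1]
  intercepts[j] <- b[1]
  coefs[j, -j] <- b[-1]
}

# Row j holds node j's regression, so coefs[j, k] and coefs[k, j] are two
# separate estimates of the same interaction and generally differ.
# Unify them; using the mean is an assumption made for this sketch.
graph <- (coefs + t(coefs)) / 2
isSymmetric(graph)
round(graph, 2)
```

Each regression chooses its own penalty, so some nodes may be shrunk more heavily than others before the estimates are combined.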
A list containing:
graph: Estimated parameter matrix of pairwise interaction effects
intercepts: Estimated parameter vector of node intercepts
indirect_coefs: list containing matrices representing indirect effects of each covariate on pairwise node interactions
direct_coefs: matrix of direct effects of each parameter on each outcome node. For family = 'binomial' models, all coefficients are estimated on the logit scale.
param_names: Character string of covariate parameter names
mod_type: A character stating the type of model that was fit (used in other functions)
mod_family: A character stating the family of model that was fit (used in other functions)
poiss_sc_factors: A matrix of the estimated negative binomial or poisson parameters for each raw node variable (only returned if family = "poisson"). These are needed for converting coefficients back to their original distribution, and are used for prediction purposes only
Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus.
Zeitschrift für Physik A Hadrons and Nuclei, 31, 253-258.
Cheng, J., Levina, E., Wang, P. & Zhu, J. (2014).
A sparse Ising model with covariates. Biometrics, 70, 943-953.
Clark, N.J., Wells, K. & Lindberg, O. (2018).
Unravelling changing interspecific interactions across environmental gradients
using Markov random fields. Ecology. doi: 10.1002/ecy.2221
Sutton, C. & McCallum, A. (2012). An introduction to conditional random fields.
Foundations and Trends in Machine Learning, 4, 267-373.
See Cheng et al. (2014), Sutton & McCallum (2012) and Clark et al. (2018)
for overviews of Conditional Random Fields. See cv.glmnet for
details of cross-validated optimization using the LASSO penalty. Worked examples
showcasing this function can be found using vignette("Bird_Parasite_CRF") and
vignette("Gaussian_Poisson_CRFs").
data("Bird.parasites")
CRFmod <- MRFcov(data = Bird.parasites, n_nodes = 4, family = 'binomial')