MRFcov: Markov Random Fields with covariates

In nicholasjclark/MRFcov: Markov Random Fields with Additional Covariates

View source: R/MRFcov.R

MRFcov

R Documentation

Markov Random Fields with covariates

Description

This function is the workhorse of the MRFcov package, running separate penalized regressions for each node to estimate parameters of Markov Random Fields (MRF) graphs. Covariates can be included (a class of models known as Conditional Random Fields; CRF), to estimate how interactions between nodes vary across covariate magnitudes.

Usage

MRFcov(
  data,
  symmetrise,
  prep_covariates,
  n_nodes,
  n_cores,
  n_covariates,
  family,
  bootstrap = FALSE,
  progress_bar = FALSE
)

Arguments

`data`	A `data.frame`. The input data, where the `n_nodes` left-most columns are the variables to be represented by nodes in the graph
`symmetrise`	The method to use for symmetrising corresponding parameter estimates (which are taken from separate regressions). Options are `min` (take the coefficient with the smallest absolute value), `max` (take the coefficient with the largest absolute value) or `mean` (take the mean of the two coefficients). Default is `mean`
`prep_covariates`	Logical. If `TRUE`, covariate columns will be cross-multiplied with nodes to prep the dataset for MRF models. Note this is only useful when additional covariates are provided. Therefore, if `n_nodes < NCOL(data)`, default is `TRUE`. Otherwise, default is `FALSE`. See `prep_MRF_covariates` for more information
`n_nodes`	Positive integer. The index of the last column in `data` which is represented by a node in the final graph. Columns with index greater than n_nodes are taken as covariates. Default is the number of columns in `data`, corresponding to no additional covariates
`n_cores`	Positive integer. The number of cores to spread the job across using `makePSOCKcluster`. Default is 1 (no parallelisation)
`n_covariates`	Positive integer. The number of covariates in `data`, before cross-multiplication. Default is `NCOL(data) - n_nodes`
`family`	The response type. Responses can be quantitative continuous (`family = "gaussian"`), non-negative counts (`family = "poisson"`) or binomial 1s and 0s (`family = "binomial"`). With `family = "binomial"`, nodes that occur in fewer than 5 percent of observations make it difficult to estimate occurrence probabilities (in extreme cases, intercept-only models are fitted for the nodes in question); the function issues a warning in this case. Nodes that occur in more than 95 percent of observations cause an error, because the cross-validation step will generally be unable to proceed. For `family = "poisson"` models, all returned coefficients are estimated on the identity scale AFTER using a nonparanormal transformation. See `vignette("Gaussian_Poisson_CRFs")` for details of interpretation
`bootstrap`	Logical. Used by `bootstrap_MRF` to reduce memory usage
`progress_bar`	Logical. Progress bar in pbapply is used if `TRUE`, but this slows estimation.
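
The cross-multiplication described for `prep_covariates` can be illustrated in base R. This is only a rough sketch: `cross_multiply` is a hypothetical helper written for this example, and the column order and naming produced by `prep_MRF_covariates` itself may differ.

```r
# Hypothetical helper (not part of MRFcov): multiply every covariate column
# by every node column to create covariate-by-node interaction terms.
cross_multiply <- function(data, n_nodes) {
  nodes <- data[, seq_len(n_nodes), drop = FALSE]
  covs <- data[, -seq_len(n_nodes), drop = FALSE]
  out <- data
  for (cov_name in names(covs)) {
    for (node_name in names(nodes)) {
      # Column name "<covariate>_<node>" is an assumption made for this sketch
      out[[paste(cov_name, node_name, sep = "_")]] <-
        covs[[cov_name]] * nodes[[node_name]]
    }
  }
  out
}

# Toy data: two binary nodes followed by one continuous covariate
toy <- data.frame(sp1 = c(1, 0, 1), sp2 = c(0, 1, 1), temp = c(0.5, -1.2, 0.7))
prepped <- cross_multiply(toy, n_nodes = 2)
names(prepped)
```

Each new column lets a node-wise regression estimate how the effect of one node on another changes with the covariate.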

Details

Separate penalized regressions are used to approximate MRF parameters. The regression for node j includes an intercept and coefficients for the abundance (families gaussian or poisson) or presence-absence (family binomial) of all other nodes (/j) in data. If covariates are included, coefficients are also estimated for the effect of each covariate on j and for the effects of each covariate on interactions between j and all other nodes (/j).

Note that interaction coefficients must be estimated between variables that are on roughly the same scale, because the resulting parameter estimates are unified into a Markov Random Field using the specified symmetrise function. Counts for poisson variables, which are often not on the same scale, are therefore normalised with a nonparanormal transformation x = qnorm(rank(log2(x + 0.01)) / (length(x) + 1)). The transformed counts are used in a (family = "gaussian") model, and their raw distribution parameters are returned so that coefficients can be back-transformed for interpretation (this back-transformation is performed automatically by other functions, including predict_MRF and cv_MRF_diag). Gaussian variables are not automatically transformed, so if they cover quite different ranges and scales, scale them before fitting models. For more information on this process, see vignette("Gaussian_Poisson_CRFs")
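
As a minimal illustration of the nonparanormal transformation quoted above, the formula can be applied to a vector of counts in base R. The function name `nonparanormal` and the toy counts are assumptions made for this example; they are not part of the package's exported interface.

```r
# Rank-based normalisation of counts, following the formula in the text:
# x = qnorm(rank(log2(x + 0.01)) / (length(x) + 1))
nonparanormal <- function(x) {
  qnorm(rank(log2(x + 0.01)) / (length(x) + 1))
}

# Toy counts on a very uneven scale (illustrative values only)
counts <- c(0, 3, 12, 1, 250, 7)
z <- nonparanormal(counts)

# The transformation keeps the ordering of the counts but maps them onto
# approximately standard-normal quantiles
round(z, 2)
```

Because the transformation depends only on ranks, the extreme count of 250 no longer dominates the scale of the variable.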

Note that since the number of parameters to estimate in each node-wise regression quickly increases with increasing numbers of nodes and covariates, LASSO penalization is used to regularize regressions. This is done by minimising the cross-validated mean error for each node separately using cv.glmnet. In this way, we maximise the log-likelihood of each node separately before unifying the nodes into a graph.
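
The step of unifying node-wise estimates into a graph can be sketched in base R using the `min`, `max` and `mean` rules described for the `symmetrise` argument. The helper `symmetrise_coefs` and the toy coefficient matrix are hypothetical, and setting the diagonal to zero is an assumption of this sketch rather than a statement about the package internals.

```r
# Hypothetical helper: row j of `coefs` holds the coefficients from the
# regression for node j, so coefs[j, k] and coefs[k, j] are two separate
# estimates of the same pairwise interaction.
symmetrise_coefs <- function(coefs, method = c("mean", "min", "max")) {
  method <- match.arg(method)
  a <- coefs
  b <- t(coefs)
  out <- switch(method,
    mean = (a + b) / 2,
    min = ifelse(abs(a) <= abs(b), a, b),  # smallest absolute value
    max = ifelse(abs(a) >= abs(b), a, b)   # largest absolute value
  )
  # Mirror the upper triangle so ties with opposite signs stay symmetric
  out[lower.tri(out)] <- t(out)[lower.tri(out)]
  diag(out) <- 0  # assumption: no self-interactions in this sketch
  out
}

# Toy asymmetric estimates for three nodes (illustrative values only)
coefs <- matrix(c( 0.0, 0.8, -0.2,
                   0.4, 0.0,  0.0,
                  -0.6, 0.3,  0.0), nrow = 3, byrow = TRUE)

graph_mean <- symmetrise_coefs(coefs, "mean")
graph_min <- symmetrise_coefs(coefs, "min")
graph_max <- symmetrise_coefs(coefs, "max")
```

With `min`, an interaction is kept only if both node-wise regressions support it, which gives the sparsest graph; `max` is the most permissive of the three rules.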

Value

A list containing:

graph: Estimated parameter matrix of pairwise interaction effects
intercepts: Estimated parameter vector of node intercepts
indirect_coefs: list containing matrices representing indirect effects of each covariate on pairwise node interactions
direct_coefs: matrix of direct effects of each parameter on each outcome node. For family = 'binomial' models, all coefficients are estimated on the logit scale.
param_names: Character string of covariate parameter names
mod_type: A character stating the type of model that was fit (used in other functions)
mod_family: A character stating the family of model that was fit (used in other functions)
poiss_sc_factors: A matrix of the estimated negative binomial or poisson parameters for each raw node variable (only returned if family = "poisson"). These are needed for converting coefficients back to their original distribution, and are used for prediction purposes only

References

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik A Hadrons and Nuclei, 31, 253-258.

Cheng, J., Levina, E., Wang, P. & Zhu, J. (2014). A sparse Ising model with covariates. Biometrics, 70, 943-953.

Clark, N.J., Wells, K. & Lindberg, O. (2018). Unravelling changing interspecific interactions across environmental gradients using Markov random fields. Ecology. doi: 10.1002/ecy.2221

Sutton, C. & McCallum, A. (2012). An introduction to conditional random fields. Foundations and Trends in Machine Learning, 4, 267-373.

Examples

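# Load the example dataset
data("Bird.parasites")
# Fit a CRF: the first four columns are binary nodes, later columns are covariates
CRFmod <- MRFcov(data = Bird.parasites, n_nodes = 4, family = 'binomial')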

nicholasjclark/MRFcov documentation built on March 30, 2024, 10:31 p.m.
