trophicSDM: Fitting a trophic Species distribution model

Description

trophicSDM fits a trophic species distribution model. It requires the species distribution data Y (the sites x species matrix), the explanatory variables X, and a directed acyclic graph G containing species interactions (i.e., the metaweb, with links going from predators to prey). The function models the distribution of each species as a function of its prey (mode = "prey", the default) or of its predators (mode = "predator").

Usage

trophicSDM(
  Y,
  X,
  G,
  env.formula = NULL,
  sp.formula = NULL,
  sp.partition = NULL,
  penal = NULL,
  mode = "prey",
  method = "stan_glm",
  family,
  iter = 500,
  chains = 2,
  run.parallel = FALSE,
  verbose = FALSE
)

Arguments

Y

The sites x species matrix containing observed species distribution (e.g. presence-absence).

X

The design matrix, i.e. sites x predictor matrix containing the value of each explanatory variable (e.g. the environmental conditions) at each site.

G

The species interaction network (aka metaweb). It must be an igraph object and a directed acyclic graph, with links going from predators to prey.
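
As an illustration of the expected structure of G, the following minimal sketch builds a small food web with igraph and checks that it is acyclic. The species names and links are hypothetical and are not part of the package data.

```r
library(igraph)

# Hypothetical food web: Y3 eats Y2 and Y1, Y2 eats Y1.
# Each row is one link, written as (predator, prey).
links <- rbind(
  c("Y3", "Y2"),
  c("Y2", "Y1"),
  c("Y3", "Y1")
)
G_toy <- graph_from_edgelist(links, directed = TRUE)

# The graph must be acyclic; is_dag() returns TRUE when it contains no cycle
is_dag(G_toy)

# Basal species have no prey, i.e. no outgoing links
names(which(degree(G_toy, mode = "out") == 0))
```

Because links point from predators to prey, a species with no outgoing link has no prey in the metaweb.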

env.formula

The definition of the abiotic part of the model. It can be:

  • A string specifying the formula (e.g. "~ X_1 + X_2"). In this case, the same environmental variables are used for every species.

  • A list that contains for each species the formula that describes the abiotic part of the model. In this case, different species can be modeled as a function of different environmental covariates. The names of the list must coincide with the names of the species.
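
A minimal sketch of the list form is given below. It assumes that each list element uses the same string form as the single-formula case; the species names are hypothetical and would normally come from colnames(Y).

```r
# Hypothetical species names; in practice use colnames(Y)
sp_names <- c("Y1", "Y2", "Y3")

# Start from a common formula, then override it for one species
env_formula_list <- setNames(
  as.list(rep("~ X_1 + X_2", length(sp_names))),
  sp_names
)
env_formula_list[["Y3"]] <- "~ X_1"

# The names of the list must coincide with the species names
stopifnot(setequal(names(env_formula_list), sp_names))
```

The resulting list could then be passed as env.formula in place of a single string.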

sp.formula

(optional) A custom definition of the biotic part of the model, e.g., using composite variables (such as richness) or an interaction between the biotic and abiotic components. More details in 'Details'.

sp.partition

(optional) A list specifying groups of species used to compute composite variables, e.g., a species can be modeled as a function of the richness of each group of prey. Each element of the list is a vector containing the names of the species in that group. More details in 'Details'.

penal

Penalisation method used to shrink regression coefficients. If NULL (default), the regression coefficients are not penalised. Currently available penalisation methods are "horshoe" for method "stan_glm" and "elasticnet" for method "glm". It is also possible to constrain the sign of biotic coefficients (prey coefficients are constrained to be positive and predator coefficients to be negative) by setting "coeff.signs" for methods "glm" and "stan_glm".

mode

"prey" for bottom-up control (default), "predator" otherwise. Notice that, in both cases, the links of G must point from predators to prey.

method

Which SDM method to use. Currently the available choices are "glm" (frequentist) and "stan_glm" (fully Bayesian MCMC, default). Notice that using "glm" does not allow error propagation when predicting.

family

The family parameter of the glm function (see glm). Use gaussian(link = "identity") for Gaussian data, and binomial(link = "logit") or binomial(link = "probit") for presence-absence data.

iter

(for "stan_glm" only) Number of iterations for each MCMC chain (defaults to 500).

chains

(for "stan_glm" only) Number of MCMC chains (defaults to 2).

run.parallel

Whether species models are fitted in parallel (can reduce computation time). Defaults to FALSE.

verbose

Whether to print the progress of the algorithm. Defaults to FALSE.

Details

"sp.formula" and "sp.partition" can be combined to define any kind of composite variable for the biotic part of the formula. "sp.formula" can be:

  • A string defining a formula as a function of "richness", e.g., "richness + I(richness^2)" (species are modeled as a function of a quadratic polynomial of their prey richness) or "I(richness>0)" (species are modeled as a function of a dummy variable equal to 1 when at least one species is present). Importantly, when groups of prey (or predators) are specified by "sp.partition", species are modeled as a function of the composite variable specified by "sp.formula" for each of their prey (or predator) groups.

  • A more flexible option is to specify sp.formula as a list (whose names are the species' names) that contains, for each species, the definition of the biotic part of the model. Notice that, in this case, the function does not check that the model is a DAG. This makes it possible to define any kind of composite variable, or to model interactions between environmental covariates and prey (or predators).
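
To make the composite variables above concrete, the following base R sketch computes them by hand on toy data. It assumes that "richness" is the number of prey species present at a site; it illustrates the idea only and is not the package's internal code. The data and the grouping are hypothetical.

```r
# Toy presence-absence data for three hypothetical prey species at five sites
prey <- cbind(Y1 = c(1, 0, 1, 0, 1),
              Y2 = c(1, 0, 0, 0, 1),
              Y3 = c(0, 0, 1, 0, 0))

# Richness over all prey: number of prey species present at each site
richness <- rowSums(prey)

# Dummy variable, as in "I(richness>0)": 1 when at least one prey is present
any_prey <- as.integer(richness > 0)

# With a partition, richness is computed separately within each group
groups <- list(c("Y1", "Y2"), c("Y3"))
richness_by_group <- sapply(groups, function(g) rowSums(prey[, g, drop = FALSE]))

# For a single-species group, richness and its square coincide
# under presence-absence data
identical(richness_by_group[, 2], richness_by_group[, 2]^2)
```

The last line shows why a quadratic term is redundant for a group containing a single species.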

Value

A "trophicSDMfit" object, containing:

model

A list containing the local models (i.e. one SDM for each species). Each local model is an object of class "SDMfit". See ?SDMfit for more information.

Y

The sites x species matrix of observed species distributions, as supplied in the Y argument

form.all

A list describing each species formula (both biotic and abiotic terms)

data

A list containing all the data used to fit the model

model.call

A list containing the modeling choices of the fitted model (e.g. method, penalisation...)

coef

A list containing, for each species, the inferred coefficients (with credible intervals or p-values when available)

MCMC.diag

MCMC convergence metrics, only available for MCMC methods

AIC

Model's AIC

log.lik

Model's log-likelihood

Author(s)

Giovanni Poggiato and Jérémy Andréoletti

Examples

data(Y, X, G)
# define abiotic part of the model
env.formula = "~ X_1 + X_2"
# Run the model with bottom-up control using stan_glm as fitting method and no penalisation
# Increase the number of iterations to obtain reliable results.
m = trophicSDM(Y,X,G, env.formula, iter = 50,
               family = binomial(link = "logit"), penal = NULL, 
               mode = "prey", method = "stan_glm")
print(m)

# Access local models (e.g. species "Y5")
m$model$Y5
coef(m$model$Y5)
# The fitted model can be plotted with `plot(m)`

# Fit a sparse model in the Bayesian framework with the horseshoe prior (penal = "horshoe")

m = trophicSDM(Y,X,G, env.formula, 
               family = binomial(link = "logit"), penal = "horshoe", 
               mode = "prey", method = "stan_glm")

# Fit frequentist glm
m = trophicSDM(Y,X,G, env.formula, 
               family = binomial(link = "logit"), penal = NULL, 
               mode = "prey", method = "glm")
               
# With elasticnet penalty   
m = trophicSDM(Y,X,G, env.formula, 
               family = binomial(link = "logit"), penal = "elasticnet", 
               mode = "prey", method = "glm")

#### Composite variables
# See vignette 'Composite variables' for a complete introduction to the use of composite variables
# Model species as a function of a quadratic polynomial of prey richness
m = trophicSDM(Y,X,G, env.formula, 
               family = binomial(link = "logit"), penal = NULL, 
               sp.formula = "richness + I(richness^2)",
               mode = "prey", method = "glm")
m$form.all
# Notice that for predators that feed on a single prey (with presence-absence data),
# their richness and the square of their richness is exactly the same variable
# In this case, `trophicSDM()` removes the redundant variable but prints a warning message

# Model species as a function of a dummy variable indicating whether at least one of their prey is present
m = trophicSDM(Y,X,G, env.formula, 
               family = binomial(link = "logit"), penal = NULL, 
               sp.formula = "I(richness>0)",
               mode = "prey", method = "glm")
m$form.all

# Define groups of prey and model species as a function of the richness (with a quadratic term)
# of each of these groups separately

# Species Y1 and Y2 belong to the same group, species Y3 and Y4 are both alone in their group and 
# species Y5 and Y6 form another group
sp.partition = list(c("Y1","Y2"),c("Y3"),c("Y4"), c("Y5","Y6"))

m = trophicSDM(Y,X,G, env.formula, 
               family = binomial(link = "logit"), penal = NULL, 
               sp.partition = sp.partition,
               sp.formula = "richness + I(richness^2)",
               mode = "prey", method = "glm")
m$form.all


webSDM documentation built on June 24, 2024, 5:13 p.m.