Hmsc: Hmsc
In Hmsc: Hierarchical Model of Species Communities

Hmsc	R Documentation

Hmsc

Description

Creates an Hmsc-class object

Usage

Hmsc(
  Y,
  XFormula = ~.,
  XData = NULL,
  X = NULL,
  XScale = TRUE,
  XSelect = NULL,
  XRRRData = NULL,
  XRRRFormula = ~. - 1,
  XRRR = NULL,
  ncRRR = 2,
  XRRRScale = TRUE,
  YScale = FALSE,
  studyDesign = NULL,
  ranLevels = NULL,
  ranLevelsUsed = names(ranLevels),
  TrFormula = NULL,
  TrData = NULL,
  Tr = NULL,
  TrScale = TRUE,
  phyloTree = NULL,
  C = NULL,
  distr = "normal",
  truncateNumberOfFactors = TRUE
)

Arguments

`Y`	a matrix of species occurrences or abundances
`XFormula`	a `formula`-class object for fixed effects (linear regression)
`XData`	a data frame of measured covariates for fixed effects with `formula`-based specification
`X`	a matrix of measured covariates for fixed effects with direct specification
`XScale`	a boolean flag indicating whether to scale covariates for the fixed effects
`XSelect`	a list describing how variable selection is to be applied
`XRRRData`	a data frame of covariates for reduced-rank regression
`XRRRFormula`	`formula` for reduced-rank regression
`XRRR`	a matrix of covariates for reduced-rank regression
`ncRRR`	number of covariates (linear combinations) for reduced-rank regression
`XRRRScale`	a boolean flag indicating whether to scale covariates for reduced-rank regression
`YScale`	a boolean flag indicating whether to scale responses for which a normal distribution is assumed
`studyDesign`	a data frame of correspondence between sampling units and units on different levels of latent factors
`ranLevels`	a named list of `HmscRandomLevel`-class objects, specifying the structure and data for random levels
`ranLevelsUsed`	a vector with names of levels of latent factors that are used in the analysis
`TrFormula`	a `formula`-class object for regression dependence of β_{kj} coefficients on species traits
`TrData`	a data frame of measured species traits for `formula`-based specification
`Tr`	a matrix of measured traits for direct specification
`TrScale`	a boolean flag indicating whether to scale values of species traits
`phyloTree`	a phylogenetic tree (object of class `phylo` or `corPhyl`) for species in `Y`
`C`	a phylogenetic correlation matrix for species in `Y`
`distr`	a string shortcut or n_s × 2 matrix specifying the observation models
`truncateNumberOfFactors`	logical; if `TRUE`, limits the maximum number of latent factors to at most the number of species

Details

Matrix Y may contain missing values, but adding a species or sampling unit with entirely missing data is not recommended, since it contributes no additional information.

Only one of the two specifications, XFormula with XData or X alone, can be used. The same requirement applies to TrFormula with TrData versus Tr. The formula-based specification is recommended, since the formula information enables additional features for postprocessing the fitted model.
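The two alternative specifications of the fixed effects can be sketched as follows. The simulated data, the covariate names x1 and x2 and the sizes n and ns are illustrative assumptions, not part of the package; the Hmsc calls run only if the package is installed.

```r
set.seed(1)
n  <- 30   # number of sampling units (assumed for illustration)
ns <- 4    # number of species (assumed for illustration)

# Simulated covariates and normally distributed responses
XData <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- matrix(rnorm(n * ns), nrow = n, ncol = ns)

# Direct specification needs the design matrix itself, intercept included
X <- model.matrix(~ x1 + x2, data = XData)

if (requireNamespace("Hmsc", quietly = TRUE)) {
  # Formula-based specification (the recommended route)
  mFormula <- Hmsc::Hmsc(Y = Y, XData = XData, XFormula = ~ x1 + x2)
  # Direct specification: pass X and leave XData unset
  mDirect <- Hmsc::Hmsc(Y = Y, X = X)
}
```

Both calls describe the same linear predictor; only the formula-based object retains the information needed to rebuild the design matrix from new covariate data.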

By default, scaling is applied to the X and Tr matrices, but not to the Y matrix. If the X and/or Tr matrices are scaled, the estimated parameters are back-transformed so that they correspond to the original X and Tr matrices, not the scaled ones. In contrast, if Y is scaled, the estimated parameters are not back-transformed, because doing so is not possible for all model parameters. The estimated parameters then correspond to the scaled Y matrix, not the original one. If the Y matrix is scaled, the predictions generated by predict are back-transformed, so that the predicted Y matrices are directly comparable to the original Y matrix. If default priors are assumed, it is recommended that all matrices (X, Tr and Y) are scaled.
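What it means for parameters to be back-transformed to the original covariate scale can be illustrated with ordinary least squares in base R. This is only a sketch of the underlying algebra under assumed simulated data; it is not the code Hmsc uses internally.

```r
set.seed(42)
n <- 100
x <- rnorm(n, mean = 10, sd = 3)          # covariate on its original scale
y <- 2 + 0.5 * x + rnorm(n, sd = 0.1)     # normally distributed response

# Fit the regression on the centred and scaled covariate
mu <- mean(x)
s  <- sd(x)
xs <- (x - mu) / s
bs <- unname(coef(lm(y ~ xs)))            # intercept and slope on the scaled x

# Back-transform: y = b0s + b1s * (x - mu) / s
#                   = (b0s - b1s * mu / s) + (b1s / s) * x
slopeOrig     <- bs[2] / s
interceptOrig <- bs[1] - bs[2] * mu / s

# Compare with a fit on the unscaled covariate
bOrig <- unname(coef(lm(y ~ x)))
agree <- isTRUE(all.equal(c(interceptOrig, slopeOrig), bOrig))
```

The back-transformed coefficients match those of a fit on the unscaled covariate, which is the sense in which estimates correspond to the original X matrix.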

The object XSelect is a list. Each element of the list, Xsel = XSelect[[i]], is a named list with objects Xsel$covGroup, Xsel$spGroup and Xsel$q. The parameter covGroup is a vector containing the columns of the matrix X for which variable selection is applied. The parameter spGroup is a vector of length equal to the number of species ns, with values 1,...,ng, where ng is the number of groups of species for which variable selection is applied simultaneously. The parameter q is a vector of length ng, containing the prior probabilities by which the variables are to be included. For example, choosing covGroup = c(2,3), spGroup = rep(1,ns) and q=0.1 either includes or excludes both of the covariates 2 and 3 simultaneously for all species. As another example, choosing covGroup = c(2,3), spGroup = 1:ns and q=rep(0.1,ns) either includes or excludes both of the covariates 2 and 3 separately for each species.
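The two XSelect configurations described above can be written out in base R as follows. The value of ns and the helper checkXSelect are hypothetical, introduced only to make the expected lengths explicit; the helper is not part of the package.

```r
ns <- 5   # number of species (assumed for illustration)

# Covariates 2 and 3 are included or excluded jointly for all species
XSelectJoint <- list(
  list(covGroup = c(2, 3), spGroup = rep(1, ns), q = 0.1)
)

# Covariates 2 and 3 are included or excluded separately for each species
XSelectBySpecies <- list(
  list(covGroup = c(2, 3), spGroup = 1:ns, q = rep(0.1, ns))
)

# Hypothetical helper: checks the lengths implied by the description,
# i.e. one spGroup entry per species and one q entry per species group
checkXSelect <- function(XSelect, ns) {
  for (Xsel in XSelect) {
    stopifnot(length(Xsel$spGroup) == ns,
              length(Xsel$q) == max(Xsel$spGroup))
  }
  invisible(TRUE)
}

okJoint     <- checkXSelect(XSelectJoint, ns)
okBySpecies <- checkXSelect(XSelectBySpecies, ns)
```

Either list would then be supplied through the XSelect argument of Hmsc().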

The included random levels are specified by the ranLevels and ranLevelsUsed arguments. The correspondence between units of each random level and rows of Y must be specified by a column of studyDesign whose name matches the name of a list item in ranLevels. It is possible to provide an arbitrary number of columns in studyDesign that are not listed in ranLevels. These do not affect the model formulation or fitting scheme, but can be used by certain functions that postprocess the fitted model.

The distr argument may be a matrix, a string literal, or a vector of string literals. A matrix must have dimension n_s × 2, where the first column defines the family of the observation model and the second column defines the dispersion property. The elements of the first column must be 1 (normal), 2 (probit) or 3 (Poisson with log link function). The second column indicates whether the dispersion parameter is fixed (0) or estimated (1). The default fixed values of the dispersion parameters are 1 for normal and probit, and 0.01 for Poisson (implemented as a limiting case of lognormally-overdispersed Poisson). Alternatively, a single string literal shortcut can be given, which specifies the same class of observation model for all species simultaneously. The available shortcuts are "normal", "probit", "poisson" and "lognormal poisson". If distr is a vector of string literals, each element corresponds to one species and must be one of "normal", "probit", "poisson" or "lognormal poisson"; these can be abbreviated as long as the abbreviation is unambiguous. The matrix and the vector of string literals both allow different observation models for different species.
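A species-specific observation model can be sketched as below, once in matrix form and once as a vector of string literals. The simulated responses, the species names and the choice of which dispersion parameters to estimate are assumptions made for illustration; the Hmsc calls run only if the package is installed.

```r
set.seed(7)
n <- 40   # number of sampling units (assumed for illustration)
XData <- data.frame(x1 = rnorm(n))

# Three species with different response types (simulated)
Y <- cbind(
  spNormal  = rnorm(n),
  spProbit  = rbinom(n, size = 1, prob = 0.5),
  spPoisson = rpois(n, lambda = 3)
)

# Matrix form, one row per species:
#   column 1 = family (1 normal, 2 probit, 3 Poisson)
#   column 2 = dispersion fixed (0) or estimated (1)
distrMatrix <- matrix(c(1, 2, 3,
                        1, 0, 0), ncol = 2)

# Vector form: one family name per species
distrVector <- c("normal", "probit", "poisson")

if (requireNamespace("Hmsc", quietly = TRUE)) {
  mMatrix <- Hmsc::Hmsc(Y = Y, XData = XData, XFormula = ~ x1,
                        distr = distrMatrix)
  mVector <- Hmsc::Hmsc(Y = Y, XData = XData, XFormula = ~ x1,
                        distr = distrVector)
}
```

The matrix form is the more explicit of the two, since it states for every species whether the dispersion parameter is fixed or estimated.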

By default, this constructor assigns default priors to the latent factors. These priors are designed to be reasonably flat, assuming that the covariates, species traits and normally distributed responses are scaled. If other priors are needed, call the setPriors.Hmsc method, where the particular priors can be specified.

Value

An object of Hmsc class without any posterior samples.

Examples

# Creating a Hmsc object without phylogeny, trait data or random levels
# (TD is the example data set shipped with the Hmsc package)
library(Hmsc)
m = Hmsc(Y=TD$Y, XData=TD$X, XFormula=~x1+x2)

# Creating a Hmsc object with phylogeny and traits
m = Hmsc(Y=TD$Y, XData=TD$X, XFormula=~x1+x2,
         TrData=TD$Tr, TrFormula=~T1+T2, phyloTree=TD$phylo)

# Creating a Hmsc object with 2 nested random levels (50 sampling units in 20 plots)
studyDesign = data.frame(sample = as.factor(1:50), plot = as.factor(sample(1:20,50,replace=TRUE)))
# Build each random level from the matching column of the studyDesign defined above
rL1 = HmscRandomLevel(units=levels(studyDesign$sample))
rL2 = HmscRandomLevel(units=levels(studyDesign$plot))
m = Hmsc(Y=TD$Y, XData=TD$X, XFormula=~x1+x2,
         studyDesign=studyDesign, ranLevels=list("sample"=rL1,"plot"=rL2))

Hmsc documentation built on Aug. 11, 2022, 5:11 p.m.