prep_dat_pglmm: Prepare data for 'pglmm'


View source: R/pglmm-utils.R


This function is mainly used within pglmm but can also be used independently to prepare a list of random effects, which then can be updated by users for more complex models.


prep_dat_pglmm(
  formula,
  data,
  cov_ranef = NULL,
  repulsion = FALSE,
  prep.re.effects = TRUE,
  family = "gaussian",
  add.obs.re = TRUE,
  bayes = FALSE,
  bayes_nested_matrix_as_list = FALSE
)



formula: A two-sided linear formula object describing the mixed effects of the model.

To specify that a random term should have a phylogenetic covariance matrix along with a non-phylogenetic one, add __ (two underscores) at the end of the group variable; e.g., + (1 | sp__) will construct two random terms, one with a phylogenetic covariance matrix and another with a non-phylogenetic (identity) matrix. In contrast, __ in the nested terms (below) will only create a phylogenetic covariance matrix. Nested random terms have the general form (1|sp__@site__), which represents phylogenetically related species nested within correlated sites. This form can be used for bipartite questions. For example, species could be phylogenetically related pollinators and sites could be phylogenetically related plants, leading to the random effect (1|insects__@plants__). If more than one phylogeny is used, remember to add all of them to the argument cov_ranef, e.g. cov_ranef = list(insects = insect_phylo, plants = plant_phylo). Phylogenetic correlations can be dropped by removing the __ underscores. Thus, the form (1|sp@site__) excludes the phylogenetic correlations among species, while the form (1|sp__@site) excludes the correlations among sites.

Note that correlated random terms are not allowed. For example, (x|g) will be the same as (0 + x|g) in the lme4::lmer syntax. However, (x1 + x2|g) won't work, so instead use (x1|g) + (x2|g).
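The formula conventions above can be sketched with plain R formula objects. This is only an illustration of the syntax: the variable names (y, x, x1, x2, sp, site, g) are placeholders, and nothing is fitted or evaluated here.

```r
# Illustrative formulas only; all variable names are placeholders.

# sp__ requests both a phylogenetic and a non-phylogenetic random
# intercept for species; site gets a plain random intercept.
f_basic <- y ~ x + (1 | sp__) + (1 | site)

# Nested term: phylogenetically related species nested within sites.
f_nested <- y ~ x + (1 | sp__) + (1 | site) + (1 | sp__@site)

# Correlated random terms such as (x1 + x2 | g) are not allowed,
# so the slopes are written as separate terms.
f_split <- y ~ x1 + x2 + (x1 | g) + (x2 | g)
```

R parses these formulas without evaluating them, so they can be built and inspected before any data or phylogeny is supplied.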


data: A data.frame containing the variables named in formula.


cov_ranef: A named list of covariance matrices of random terms. The names should be the group variables that are used as random terms with specified covariance matrices (without the two underscores, e.g. list(sp = tree1, site = tree2)). The actual object can be either a phylogeny with class "phylo" or a prepared covariance matrix. If it is a phylogeny, pglmm will prune it and then convert it to a covariance matrix assuming Brownian motion evolution. pglmm will also standardize all covariance matrices to have a determinant of one. Group variables will be converted to factors, and all covariance matrices will be rearranged so that rows and columns are in the same order as the levels of their corresponding group variables.
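To make the reordering and standardization steps concrete, here is a minimal base R sketch. It assumes that "determinant of one" is achieved by dividing the matrix by det(V)^(1/n); the matrix values and species names are made up for illustration, and this is not the package's internal code.

```r
# Hypothetical 3 x 3 covariance matrix, deliberately not in level order.
V <- matrix(c(2.0, 1.0, 0.5,
              1.0, 2.0, 0.5,
              0.5, 0.5, 2.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("sp_c", "sp_a", "sp_b"),
                            c("sp_c", "sp_a", "sp_b")))

# The group variable is converted to a factor; its levels set the order.
sp <- factor(c("sp_a", "sp_b", "sp_c", "sp_a"))

# Rearrange rows and columns to match the factor levels.
V_ordered <- V[levels(sp), levels(sp)]

# Rescale so the determinant is one: det(c * V) = c^n * det(V),
# hence c = det(V)^(-1/n).
n <- nrow(V_ordered)
V_std <- V_ordered / det(V_ordered)^(1 / n)
```

Rescaling by a single constant keeps the correlation structure intact while putting matrices of different sizes and scales on a comparable footing.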


repulsion: When there are nested random terms specified, repulsion = FALSE tests for phylogenetic underdispersion while repulsion = TRUE tests for overdispersion. This argument is a logical vector of length either 1 or >1. If its length is 1, then all covariance matrices in nested terms will be either inverted (overdispersion) or not. If its length is >1, then you can select which covariance matrices in the nested terms are inverted. Make sure to get the length right: for all the terms with @, count the number of "__" to determine the length of repulsion. For example, sp__@site and sp@site__ will each require one element of repulsion, while sp__@site__ will take two elements (repulsion for sp and repulsion for site). Therefore, if your nested terms are (1|sp__@site) + (1|sp@site__) + (1|sp__@site__), then you should set repulsion to something like c(TRUE, FALSE, TRUE, TRUE) (length of 4).
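The counting rule for the length of repulsion can be checked with a few lines of base R. The helper count_double_underscores is hypothetical (it is not part of phyr) and simply counts "__" in each nested term written as a string.

```r
# Nested terms from the example above, written as strings.
nested_terms <- c("sp__@site", "sp@site__", "sp__@site__")

# Hypothetical helper: number of "__" occurrences in one term.
count_double_underscores <- function(term) {
  hits <- gregexpr("__", term, fixed = TRUE)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

n_per_term <- vapply(nested_terms, count_double_underscores, integer(1))
required_length <- sum(n_per_term)

# One logical per "__" in the nested terms, in order of appearance.
repulsion <- c(TRUE, FALSE, TRUE, TRUE)
stopifnot(length(repulsion) == required_length)
```

Here the three terms contribute 1, 1, and 2 elements, which gives the length of 4 stated in the text.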

prep.re.effects: Whether to prepare random effects for users.


family: Either "gaussian" for a Linear Mixed Model, or "binomial" or "poisson" for Generalized Linear Mixed Models. "family" should be specified as a character string (i.e., quoted). For binomial and Poisson data, we use the canonical logit and log link functions, respectively. Binomial data can be either presence/absence, or a two-column array of 'successes' and 'failures'. For both binomial and Poisson data, we add an observation-level random term by default via add.obs.re = TRUE. If bayes = TRUE there are two additional families available: "zeroinflated.binomial" and "zeroinflated.poisson", which add a zero inflation parameter; this parameter gives the probability that the response is a zero. The rest of the parameters of the model then reflect the "non-zero" part of the model. Note that "zeroinflated.binomial" only makes sense for success/failure response data.

add.obs.re: Whether to add an observation-level random term for binomial or Poisson distributions. Normally it is a good idea to add this to account for overdispersion, so add.obs.re = TRUE by default.


bayes: Whether to fit a Bayesian version of the PGLMM using R-INLA.


bayes_nested_matrix_as_list: For bayes = TRUE, whether to prepare the nested terms as a list of length 4, as was done in the older approach.


A list with updated formula, random.effects, and updated cov_ranef.
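A minimal sketch of calling the function on its own follows. It assumes the phyr and ape packages are installed and that prep_dat_pglmm is exported from phyr; the simulated data, the column names sp and site, and the random tree are illustrative assumptions, and the call is skipped when either package is missing.

```r
# Simulated data: 5 species crossed with 4 sites (illustrative only).
set.seed(1)
dat <- expand.grid(sp = paste0("t", 1:5),
                   site = paste0("s", 1:4),
                   stringsAsFactors = FALSE)
dat$x <- rnorm(nrow(dat))
dat$y <- rnorm(nrow(dat))

# Run only if both packages are available (assumption: phyr exports
# prep_dat_pglmm; ape::rtree labels its tips t1, t2, ...).
if (requireNamespace("phyr", quietly = TRUE) &&
    requireNamespace("ape", quietly = TRUE)) {
  tree <- ape::rtree(5)
  prepared <- phyr::prep_dat_pglmm(
    y ~ x + (1 | sp__) + (1 | site),
    data = dat,
    cov_ranef = list(sp = tree),
    family = "gaussian"
  )
  # Inspect which components were returned.
  print(names(prepared))
}
```

The returned random effects can then be inspected or modified before being used in a more complex model.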

phyr documentation built on Jan. 13, 2021, 5:40 p.m.