Description Usage Arguments Details Value Author(s) References Examples
This function performs Generalized Linear Mixed Models for binary, count,
and continuous data, estimating regression coefficients with
approximate standard errors. It is specifically designed for community data
in which species occur within multiple sites (locations).
A Bayesian version of PGLMM uses the package INLA
,
which is not available on CRAN yet. If you wish to use this option,
you must first install INLA
from https://www.rinla.org/ by running
install.packages('INLA', repos='https://www.math.ntnu.no/inla/R/stable')
in R.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67  pglmm(
formula,
data = NULL,
family = "gaussian",
cov_ranef = NULL,
random.effects = NULL,
REML = TRUE,
optimizer = c("neldermeadnlopt", "bobyqa", "NelderMead", "subplex"),
repulsion = FALSE,
add.obs.re = TRUE,
verbose = FALSE,
cpp = TRUE,
bayes = FALSE,
s2.init = NULL,
B.init = NULL,
reltol = 10^6,
maxit = 500,
tol.pql = 10^6,
maxit.pql = 200,
marginal.summ = "mean",
calc.DIC = TRUE,
calc.WAIC = TRUE,
prior = "inla.default",
prior_alpha = 0.1,
prior_mu = 1,
ML.init = FALSE,
tree = NULL,
tree_site = NULL,
sp = NULL,
site = NULL,
bayes_options = NULL,
bayes_nested_matrix_as_list = FALSE
)
communityPGLMM(
formula,
data = NULL,
family = "gaussian",
cov_ranef = NULL,
random.effects = NULL,
REML = TRUE,
optimizer = c("neldermeadnlopt", "bobyqa", "NelderMead", "subplex"),
repulsion = FALSE,
add.obs.re = TRUE,
verbose = FALSE,
cpp = TRUE,
bayes = FALSE,
s2.init = NULL,
B.init = NULL,
reltol = 10^6,
maxit = 500,
tol.pql = 10^6,
maxit.pql = 200,
marginal.summ = "mean",
calc.DIC = TRUE,
calc.WAIC = TRUE,
prior = "inla.default",
prior_alpha = 0.1,
prior_mu = 1,
ML.init = FALSE,
tree = NULL,
tree_site = NULL,
sp = NULL,
site = NULL,
bayes_options = NULL,
bayes_nested_matrix_as_list = FALSE
)

formula 
A twosided linear formula object describing the mixed effects of the model. To specify that a random term should have phylogenetic covariance matrix along
with nonphylogenetic one, add Note that correlated random terms are not allowed. For example,

data 
A 
family 
Either "gaussian" for a Linear Mixed Model, or
"binomial" or "poisson" for Generalized Linear Mixed Models.
"family" should be specified as a character string (i.e., quoted). For binomial and
Poisson data, we use the canonical logit and log link functions, respectively.
Binomial data can be either presence/absence, or a twocolumn array of 'successes' and 'failures'.
For both binomial and Poisson data, we add an observationlevel
random term by default via 
cov_ranef 
A named list of covariance matrices of random terms. The names should be the
group variables that are used as random terms with specified covariance matrices
(without the two underscores, e.g. 
random.effects 
Optional prebuild list of random effects. If 
REML 
Whether REML or ML is used for model fitting the random effects. Ignored if

optimizer 
neldermeadnlopt (default), bobyqa, NelderMead, or subplex.
NelderMead is from the stats package and the other optimizers are from the nloptr package.
Ignored if 
repulsion 
When there are nested random terms specified, 
add.obs.re 
Whether to add an observationlevel random term for binomial or Poisson
distributions. Normally it would be a good idea to add this to account for overdispersion,
so 
verbose 
If 
cpp 
Whether to use C++ function for optim. Default is TRUE. Ignored if 
bayes 
Whether to fit a Bayesian version of the PGLMM using 
s2.init 
An array of initial estimates of s2 for each random
effect that scales the variance. If s2.init is not provided for

B.init 
Initial estimates of B, a matrix containing
regression coefficients in the model for the fixed effects. This
matrix must have 
reltol 
A control parameter dictating the relative tolerance
for convergence in the optimization; see 
maxit 
A control parameter dictating the maximum number of
iterations in the optimization; see 
tol.pql 
A control parameter dictating the tolerance for
convergence in the PQL estimates of the mean components of the
GLMM. Ignored if 
maxit.pql 
A control parameter dictating the maximum number
of iterations in the PQL estimates of the mean components of the
GLMM. Ignored if 
marginal.summ 
Summary statistic to use for the estimate of coefficients when
doing a Bayesian PGLMM (when 
calc.DIC 
Should the Deviance Information Criterion be calculated and returned
when doing a Bayesian PGLMM? Ignored if 
calc.WAIC 
Should the WAIC be calculated and returned
when doing a Bayesian PGLMM? Ignored if 
prior 
Which type of default prior should be used by 
prior_alpha 
Only used if 
prior_mu 
Only used if 
ML.init 
Only relevant if 
tree 
A phylogeny for column sp, with "phylo" class, or a covariance matrix for sp.
Make sure to have all species in the matrix; if the matrix is not standardized,
(i.e., det(tree) != 1), 
tree_site 
A second phylogeny for "site". This is required only if the site column contains species instead of sites. This can be used for bipartitie questions; tree_site can also be a covariance matrix. Make sure to have all sites in the matrix; if the matrix is not standardized (i.e., det(tree_site) != 1), pglmm' will try to standardize it for you. No longer used: keep here for compatibility. 
sp 
No longer used: keep here for compatibility. 
site 
No longer used: keep here for compatibility. 
bayes_options 
Additional options to pass to INLA for if 
bayes_nested_matrix_as_list 
For 
For Gaussian data, pglmm
analyzes the phylogenetic linear mixed model
Y = beta_0 + beta_1x + b_0 + b_1x
b_0 ~ Gaussian(0, sigma_0^2I_(sp))
b_0 ~ Gaussian(0, sigma_0^2V_(sp))
e ~ Gaussian(0,sigma^2)
where beta_0 and beta_1 are fixed effects, and V_(sp) is a variancecovariance matrix derived from a phylogeny (typically under the assumption of Brownian motion evolution). Here, the variation in the mean (intercept) for each species is given by the random effect b_0 that is assumed to be independent among species. Variation in species' responses to predictor variable x is given by a random effect b_0 that is assumed to depend on the phylogenetic relatedness among species given by V_(sp); if species are closely related, their specific responses to x will be similar. This particular model would be specified as
z < pglmm(Y ~ X + (1sp__), data = data, family = "gaussian", cov_ranef = list(sp = phy))
Or you can prepare the random terms manually (not recommended for simple models but may be necessary for complex models):
re.1 < list(1, sp = dat$sp, covar = diag(nspp))
re.2 < list(dat$X, sp = dat$sp, covar = Vsp)
z < pglmm(Y ~ X, data = data, family = "gaussian", random.effects = list(re.1, re.2))
The covariance matrix covar is standardized to have its determinant equal to 1. This in effect standardizes the interpretation of the scalar sigma^2. Although mathematically this is not required, it is a very good idea to standardize the predictor (independent) variables to have mean 0 and variance 1. This will make the function more robust and improve the interpretation of the regression coefficients. For categorical (factor) predictor variables, you will need to construct 01 dummy variables, and these should not be standardized (for obvious reasons).
For binary generalized linear mixed models (family =
'binomial'
), the function estimates parameters for the model of
the form, for example,
y = beta_0 + beta_1x + b_0 + b_1x
Y = logit^(1)(y)
b_0 ~ Gaussian(0, sigma_0^2I_(sp))
b_0 ~ Gaussian(0, sigma_0^2V_(sp))
where beta_0 and beta_1 are fixed effects, and V_(sp) is a variancecovariance matrix derived from a phylogeny (typically under the assumption of Brownian motion evolution).
z < pglmm(Y ~ X + (1sp__), data = data, family = "binomial", cov_ranef = list(sp = phy))
As with the linear mixed model, it is a very good idea to standardize the predictor (independent) variables to have mean 0 and variance 1. This will make the function more robust and improve the interpretation of the regression coefficients.
An object (list) of class communityPGLMM
with the following elements:
formula 
the formula for fixed effects 
formula_original 
the formula for both fixed effects and random effects 
data 
the dataset 
family 

random.effects 
the list of random effects 
B 
estimates of the regression coefficients 
B.se 
approximate standard errors of the fixed effects regression coefficients.
This is set to NULL if 
B.ci 
approximate Bayesian credible interval of the fixed effects regression coefficients.
This is set to NULL if 
B.cov 
approximate covariance matrix for the fixed effects regression coefficients 
B.zscore 
approximate Z scores for the fixed effects regression coefficients. This is set to NULL if 
B.pvalue 
approximate tests for the fixed effects regression coefficients being different from zero. This is set to NULL if 
ss 
standard deviations of the random effects for the covariance matrix sigma^2 V for each random effect in order. For the linear mixed model, the residual variance is listed last. 
s2r 
random effects variances for nonnested random effects 
s2n 
random effects variances for nested random effects 
s2resid 
for linear mixed models, the residual variance 
s2r.ci 
Bayesian credible interval for random effects variances for nonnested random effects.
This is set to NULL if 
s2n.ci 
Bayesian credible interval for random effects variances for nested random effects.
This is set to NULL if 
s2resid.ci 
Bayesian credible interval for linear mixed models, the residual variance.
This is set to NULL if 
logLik 
for linear mixed models, the loglikelihood for either the restricted likelihood ( 
AIC 
for linear mixed models, the AIC for either the restricted likelihood ( 
BIC 
for linear mixed models, the BIC for either the restricted likelihood ( 
DIC 
for Bayesian PGLMM, this is the Deviance Information Criterion metric of model fit. This is set to NULL if 
REML 
whether or not REML is used ( 
bayes 
whether or not a Bayesian model was fit. 
marginal.summ 
The specified summary statistic used to summarize the Bayesian marginal distributions.
Only present if 
s2.init 
the userprovided initial estimates of 
B.init 
the userprovided initial estimates of 
Y 
the response (dependent) variable returned in matrix form 
X 
the predictor (independent) variables returned in matrix form (including 1s in the first column) 
H 
the residuals. For linear mixed models, this does not account for random terms,
To get residuals after accounting for both fixed and random terms, use 
iV 
the inverse of the covariance matrix for the entire system (of dimension ( 
mu 
predicted mean values for the generalized linear mixed model (i.e., similar to 
nested 
matrices used to construct the nested design matrix. This is set to NULL if 
Zt 
the design matrix for random effects. This is set to NULL if 
St 
diagonal matrix that maps the random effects variances onto the design matrix 
convcode 
the convergence code provided by 
niter 
number of iterations performed by 
inla.model 
Model object fit by underlying 
Anthony R. Ives, Daijiang Li, Russell Dinnage
Ives, A. R. and M. R. Helmus. 2011. Generalized linear mixed models for phylogenetic analyses of community structure. Ecological Monographs 81:511525.
Ives A. R. 2018. Mixed and phylogenetic models: a conceptual introduction to correlated data. https://leanpub.com/correlateddata.
Rafferty, N. E., and A. R. Ives. 2013. Phylogenetic traitbased analyses of ecological networks. Ecology 94:23212333.
Simpson, Daniel, et al. 2017. Penalising model component complexity: A principled, practical approach to constructing priors. Statistical science 32(1): 128.
Li, D., Ives, A. R., & Waller, D. M. 2017. Can functional traits account for phylogenetic signal in community composition? New Phytologist, 214(2), 607618.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196  ## Structure of examples:
# First, a (brief) description of model types, and how they are specified
#  these are *not* to be run 'asis'; they show how models should be organised
# Second, a runthrough of how to simulate, and then analyse, data
#  these *are* to be run 'asis'; they show how to format and work with data
#############################################
### Brief summary of models and their use ###
#############################################
## Model structures from Ives & Helmus (2011)
if(FALSE){
# dat = data set for regression (note: must have a column "sp" and a column "site")
# phy = phylogeny of class "phylo"
# repulsion = to test phylogenetic repulsion or not
# Model 1 (Eq. 1)
z < pglmm(freq ~ sp + (1site) + (1sp__@site), data = dat, family = "binomial",
cov_ranef = list(sp = phy), REML = TRUE, verbose = TRUE, s2.init = .1)
# Model 2 (Eq. 2)
z < pglmm(freq ~ sp + X + (1site) + (Xsp__), data = dat, family = "binomial",
cov_ranef = list(sp = phy), REML = TRUE, verbose = TRUE, s2.init = .1)
# Model 3 (Eq. 3)
z < pglmm(freq ~ sp*X + (1site) + (1sp__@site), data = dat, family = "binomial",
cov_ranef = list(sp = phy), REML = TRUE, verbose = TRUE, s2.init = .1)
## Model structure from Rafferty & Ives (2013) (Eq. 3)
# dat = data set
# phyPol = phylogeny for pollinators (pol)
# phyPlt = phylogeny for plants (plt)
z < pglmm(freq ~ pol * X + (1pol__) + (1plt__) + (1pol__@plt) +
(1pol@plt__) + (1pol__@plt__),
data = dat, family = "binomial",
cov_ranef = list(pol = phyPol, plt = phyPlt),
REML = TRUE, verbose = TRUE, s2.init = .1)
}
#####################################################
### Detailed analysis showing covariance matrices ###
#####################################################
# This is the example from section 4.3 in Ives, A. R. (2018) Mixed
# and phylogenetic models: a conceptual introduction to correlated data.
library(ape)
library(mvtnorm)
# Investigating covariance matrices for different types of model structure
nspp < 6
nsite < 4
# Simulate a phylogeny that has a lot of phylogenetic signal (power = 1.3)
phy < compute.brlen(rtree(n = nspp), method = "Grafen", power = 1.3)
# Simulate species means
sd.sp < 1
mean.sp < rTraitCont(phy, model = "BM", sigma=sd.sp^2)
# Replicate values of mean.sp over sites
Y.sp < rep(mean.sp, times=nsite)
# Simulate site means
sd.site < 1
mean.site < rnorm(nsite, sd=sd.site)
# Replicate values of mean.site over sp
Y.site < rep(mean.site, each=nspp)
# Compute a covariance matrix for phylogenetic attraction
sd.attract < 1
Vphy < vcv(phy)
# Standardize the phylogenetic covariance matrix to have determinant = 1.
# (For an explanation of this standardization, see subsection 4.3.1 in Ives (2018))
Vphy < Vphy/(det(Vphy)^(1/nspp))
# Construct the overall covariance matrix for phylogenetic attraction.
# (For an explanation of Kronecker products, see subsection 4.3.1 in the book)
V < kronecker(diag(nrow = nsite, ncol = nsite), Vphy)
Y.attract < array(t(rmvnorm(n = 1, sigma = sd.attract^2*V)))
# Simulate residual errors
sd.e < 1
Y.e < rnorm(nspp*nsite, sd = sd.e)
# Construct the dataset
d < data.frame(sp = rep(phy$tip.label, times = nsite),
site = rep(1:nsite, each = nspp))
# Simulate abundance data
d$Y < Y.sp + Y.site + Y.attract + Y.e
# Analyze the model
pglmm(Y ~ 1 + (1sp__) + (1site) + (1sp__@site), data = d, cov_ranef = list(sp = phy))
# Display random effects: the function `pglmm_plot_ranef()` does what
# the name implies. You can set `show.image = TRUE` and `show.sim.image = TRUE`
# to see the matrices and simulations.
re < pglmm_plot_ranef(Y ~ 1 + (1sp__) + (1site) + (1sp__@site), data = d,
cov_ranef = list(sp = phy), show.image = FALSE,
show.sim.image = FALSE)
#################################################
### Example of a bipartite phylogenetic model ###
#################################################
# Investigating covariance matrices for different types of model structure
nspp < 20
nsite < 15
# Simulate a phylogeny that has a lot of phylogenetic signal (power = 1.3)
phy.sp < compute.brlen(rtree(n = nspp), method = "Grafen", power = 1.3)
phy.site < compute.brlen(rtree(n = nsite), method = "Grafen", power = 1.3)
# Simulate species means
mean.sp < rTraitCont(phy.sp, model = "BM", sigma = 1)
# Replicate values of mean.sp over sites
Y.sp < rep(mean.sp, times = nsite)
# Simulate site means
mean.site < rTraitCont(phy.site, model = "BM", sigma = 1)
# Replicate values of mean.site over sp
Y.site < rep(mean.site, each = nspp)
# Generate covariance matrix for phylogenetic attraction among species
sd.sp.attract < 1
Vphy.sp < vcv(phy.sp)
Vphy.sp < Vphy.sp/(det(Vphy.sp)^(1/nspp))
V.sp < kronecker(diag(nrow = nsite, ncol = nsite), Vphy.sp)
Y.sp.attract < array(t(rmvnorm(n = 1, sigma = sd.sp.attract^2*V.sp)))
# Generate covariance matrix for phylogenetic attraction among sites
sd.site.attract < 1
Vphy.site < vcv(phy.site)
Vphy.site < Vphy.site/(det(Vphy.site)^(1/nsite))
V.site < kronecker(Vphy.site, diag(nrow = nspp, ncol = nspp))
Y.site.attract < array(t(rmvnorm(n = 1, sigma = sd.site.attract^2*V.site)))
# Generate covariance matrix for phylogenetic attraction of species:site interaction
sd.sp.site.attract < 1
V.sp.site < kronecker(Vphy.site, Vphy.sp)
Y.sp.site.attract < array(t(rmvnorm(n = 1, sigma = sd.sp.site.attract^2*V.sp.site)))
# Simulate residual error
sd.e < 0.5
Y.e < rnorm(nspp*nsite, sd = sd.e)
# Construct the dataset
d < data.frame(sp = rep(phy.sp$tip.label, times = nsite),
site = rep(phy.site$tip.label, each = nspp))
# Simulate abundance data
d$Y < Y.sp + Y.site + Y.sp.attract + Y.site.attract + Y.sp.site.attract + Y.e
# Plot random effects covariance matrices and then add phylogenies
# Note that, if show.image and show.sim are not specified, pglmm_plot_ranef() shows
# the covariance matrices if nspp * nsite < 200 and shows simulations
# if nspp * nsite > 100
re < pglmm_plot_ranef(Y ~ 1 + (1sp__) + (1site__) + (1sp__@site) +
(1sp@site__) + (1sp__@site__),
data=d, cov_ranef = list(sp = phy.sp, site = phy.site))
# This flips the phylogeny to match to covariance matrices
rot.phy.site < phy.site
for(i in (nsite+1):(nsite+Nnode(phy.site)))
rot.phy.site < rotate(rot.phy.site, node = i)
plot(phy.sp, main = "Species", direction = "upward")
plot(rot.phy.site, main = "Site")
# Analyze the simulated data and compute a Pvalue for the (1sp__@site__)
# random effect using a LRT. It is often better to fit the reduced model before
# the full model, because it s numerically easier to fit the reduced model,
# and then the parameter estimates from the reduced model can be given to the
# full model. In this case, I have used the estimates of the random effects
# from the reduce model, mod.r$ss, as the initial estimates for the same
# parameters in the full model in the statement s2.init=c(mod.r$ss, 0.01)^2.
# The final 0.01 is for the last random effect in the full model, (1sp__@site__).
# Note also that the output of the random effects from communityPGLMM(), mod.r$ss,
# are the standard deviations, so they have to be squared for use as initial
# values of variances in mod.f.
mod.r < pglmm(Y ~ 1 + (1sp__) + (1site__) + (1sp__@site) + (1sp@site__),
data = d, cov_ranef = list(sp = phy.sp, site = phy.site))
mod.f < pglmm(Y ~ 1 + (1sp__) + (1site__) + (1sp__@site) + (1sp@site__) +
(1sp__@site__), data = d,
cov_ranef = list(sp = phy.sp, site = phy.site),
s2.init = c(mod.r$ss, 0.01)^2)
mod.f
pvalue < pchisq(2*(mod.f$logLik  mod.r$logLik), df = 1, lower.tail = FALSE)
pvalue

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.