calc.varpart: Variance partitioning for a latent variable model



Description

Lifecycle: stable

For each response (species), partition the variance of the linear predictor into components associated with (groups of) the covariates, the latent variables, and any row effects and response-specific random intercepts. If traits are also included in the model, then it also calculates an R-squared value for the proportion of the variance in the environmental response (due to the covariates) which can be explained by traits.

Usage

calc.varpart(object, groupX = NULL)

Arguments

object

An object of class "boral".

groupX

A vector of group indicator variables, which allows the variance partitioning to be done for groups of covariates (including the intercept), i.e., how much of the total variation a certain subset of the covariates explains. Defaults to NULL, in which case all the covariates are treated as a single group.

Details

As an alternative to looking at differences in the trace of the residual covariance matrix (Hui et al., 2014; Warton et al., 2015), another way to quantify the amount of variance explained by covariates, traits, row effects, and response-specific random intercepts is to perform a variance decomposition of the linear predictor of a latent variable model (Ovaskainen et al., 2017). In particular, for a general model the linear predictor for response j = 1,…,p at row i = 1,…,n is given by

η_{ij} = α_i + β_{0j} + \bm{x}^\top_i\bm{β}_j + \bm{z}^\top_i\bm{b}_j + \bm{u}^\top_i\bm{θ}_j,

where β_{0j} + \bm{x}^\top_i\bm{β}_j is the component of the linear predictor due to the covariates \bm{X} plus an intercept, \bm{z}^\top_i\bm{b}_j is the component due to the response-specific random intercepts, \bm{u}^\top_i\bm{θ}_j is the component due to the latent variables, and α_i is the component due to one or more fixed or random row effects. Not all of these components need be included in the model; the above simply represents the general case. The regression coefficients \bm{β}_j may be further treated as random effects and regressed against traits; please see about.traits for further information on this.

For each response, a variation partitioning of the linear predictor is performed by calculating the variance due to each of the components in η_{ij} and then rescaling them so that they sum to one. The general details of this type of variation partitioning are given in Ovaskainen et al. (2017); see also Nakagawa and Schielzeth (2013) for R-squared and proportion of variance explained in the case of generalized linear mixed models. In brief, for response j = 1,…,p:

After scaling, we obtain the proportion of variance for each response which is explained by each of the variance components. These proportions are calculated for each MCMC sample and then averaged across the samples to give a posterior mean variance partitioning.
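The calculate-then-rescale step described above can be sketched in base R for a single (simulated) MCMC sample. This is only an illustrative sketch, not the package's internal code: all object names and parameter values are hypothetical, the model has covariates, two latent variables, and one random row effect, and it assumes that the latent variables are standard normal so that the variance of the latent variable component is the sum of squared loadings.

```r
set.seed(1)
n <- 30; p <- 4; K <- 3; d <- 2                  # rows, responses, covariates, latent variables

## Hypothetical covariates and one hypothetical posterior draw of the parameters
X          <- scale(matrix(rnorm(n * K), n, K))
beta0      <- rnorm(p)                           # response-specific intercepts
beta       <- matrix(rnorm(p * K), p, K)         # response-specific slopes
theta      <- matrix(rnorm(p * d, sd = 0.5), p, d)  # latent variable loadings
sigma2_row <- 0.3                                # assumed variance of a random row effect

## Variance due to the covariates plus intercept, taken across the n rows
eta_X <- cbind(1, X) %*% t(cbind(beta0, beta))   # n x p matrix
var_X <- apply(eta_X, 2, var)

## Variance due to the latent variables (ASSUMPTION: standard normal latent variables)
var_lv <- rowSums(theta^2)

## Variance due to the random row effect, shared by all responses
var_row <- rep(sigma2_row, p)

## Rescale so that the components sum to one for each response
total   <- var_X + var_lv + var_row
varpart <- cbind(X = var_X / total, lv = var_lv / total, row = var_row / total)
round(varpart, 3)
```

In practice the same calculation would be repeated for every MCMC sample, and the resulting proportions averaged to give the posterior mean partitioning.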

If groupX is supplied, the variance due to the covariates is partitioned based on subsets of the covariates (including the intercept) as identified by groupX, and then rescaled correspondingly. This is useful if one wanted to, for example, quantify the proportion of variation in each response which is explained by each covariate.
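As a rough illustration of how a grouping vector splits the covariate component, the following base R sketch computes the variance of the partial linear predictor for each group and rescales within the covariate component. It is a sketch under stated assumptions only: the data and coefficients are simulated, the object names are hypothetical, and how the package itself treats covariances between groups is not described here.

```r
set.seed(2)
n <- 28; K <- 6
X       <- scale(matrix(rnorm(n * K), n, K))   # hypothetical covariates
coefs_j <- rnorm(K + 1)                        # intercept and K slopes for one response j

## Intercept and first two covariates form group 1, remaining four form group 2
groupX <- c(1, 1, 1, 2, 2, 2, 2)

Xfull <- cbind(1, X)                           # prepend the intercept column
var_by_group <- sapply(sort(unique(groupX)), function(g) {
  cols <- which(groupX == g)
  ## Variance, across rows, of the part of the linear predictor due to group g
  var(as.vector(Xfull[, cols, drop = FALSE] %*% coefs_j[cols]))
})

## Rescale so the group contributions sum to one
prop_by_group <- var_by_group / sum(var_by_group)
prop_by_group
```

Note that groupX has length K + 1 because the intercept counts as the first element.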

If the fitted model also contains traits, which are included to help explain/mediate differences in species environmental responses, then the function calculates an R^2 value for the proportion of variance due to the covariates which is explained by the traits. In brief, this is calculated based on the correlation between β_{0j} + \bm{x}^\top_i\bm{β}_j and τ_{0j} + \bm{x}^\top_i\bm{τ}_j, where τ_{0j} and \bm{τ}_j are the "predicted" values of the species coefficients based on the trait values, i.e., τ_{0j} = κ_{01} + \bm{traits}^\top_j\bm{κ}_1 and τ_{jk} = κ_{0k} + \bm{traits}^\top_j\bm{κ}_k for element k in \bm{τ}_j.
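A minimal base R sketch of this idea follows. It assumes, as an illustration only, that the R^2 for each response is the squared correlation between the two linear predictors across rows; the text above states only that the calculation is based on this correlation. All data, coefficients, and object names are simulated or hypothetical.

```r
set.seed(3)
n <- 30; p <- 5; K <- 2; n_traits <- 3

X      <- scale(matrix(rnorm(n * K), n, K))       # hypothetical covariates
traits <- matrix(rnorm(p * n_traits), p, n_traits) # hypothetical species traits

## Hypothetical trait regression: one intercept and one coefficient vector
## per element of (intercept, slopes)
kappa0 <- rnorm(K + 1)
kappa  <- matrix(rnorm((K + 1) * n_traits), K + 1, n_traits)

## "Predicted" species coefficients from the traits: p x (K + 1)
tau <- sweep(traits %*% t(kappa), 2, kappa0, "+")

## Actual species coefficients deviate from the trait-based predictions
coefs <- tau + matrix(rnorm(p * (K + 1), sd = 0.5), p, K + 1)

## Covariate component of the linear predictor under each set of coefficients
eta_beta <- cbind(1, X) %*% t(coefs)               # n x p
eta_tau  <- cbind(1, X) %*% t(tau)                 # n x p

## ASSUMPTION: R^2 taken as the squared correlation, per response
R2_traits <- sapply(seq_len(p), function(j) cor(eta_beta[, j], eta_tau[, j])^2)
round(R2_traits, 3)
```

Values near one indicate that the trait-based coefficients reproduce the response's environmental predictor closely; values near zero indicate that the traits explain little of it.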

Value

A list containing the following components, if applicable:

varpart.X

Vector containing the proportion of variance (in the linear predictor) for each response, which is explained by the covariate matrix.

varpart.lv

Vector containing the proportion of variance (in the linear predictor) for each response, which is explained by the latent variables.

varpart.row

Vector containing the proportion of variance (in the linear predictor) for each response, which is explained by the row effects.

varpart.ranef

Vector containing the proportion of variance (in the linear predictor) for each response, which is explained by the response-specific random intercepts.

R2.traits

Vector containing, for each response, the proportion of variance due to the covariates which can be explained by traits.
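Because the components are returned as separate vectors, it can be convenient to bind them into a single responses-by-components table. The sketch below uses a made-up stand-in for the returned list (the species names and numbers are hypothetical, and only two components are present) rather than output from a fitted model.

```r
## Hypothetical stand-in for the list returned by calc.varpart();
## the numbers are invented purely for illustration.
dovar <- list(
  varpart.X  = c(sp1 = 0.55, sp2 = 0.20, sp3 = 0.70),
  varpart.lv = c(sp1 = 0.45, sp2 = 0.80, sp3 = 0.30)
)

## Keep only the variance partitioning components and bind them column-wise
parts <- do.call(cbind, dovar[grepl("^varpart", names(dovar))])
round(parts, 2)

## Within each response, the proportions should sum to one
rowSums(parts)
```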

Warnings

There is considerable controversy over exactly what quantities such as R-squared and proportion of variance explained mean in the case of mixed models and latent variable models, and how they can be interpreted, e.g., what is considered a high value for the proportion of variance explained by the covariates, and is it consistent with whether the coefficients are significantly different from zero or not; see for instance R2 controversy.

When reporting these values, researchers should at least be aware of this, and that there are multiple ways of manufacturing such quantities with no single best approach; e.g., relative changes in the trace of the residual covariance matrix, and relative changes in marginal and conditional log-likelihoods, are other possible approaches.

Author(s)

Francis K.C. Hui [aut, cre], Wade Blanchard [aut]

Maintainer: Francis K.C. Hui <fhui28@gmail.com>

References

Examples

## Not run: 
library(boral)
library(mvabund) ## Load a dataset from the mvabund package
data(spider)
y <- spider$abun
X <- scale(spider$x)
n <- nrow(y)
p <- ncol(y)

## NOTE: The values below MUST NOT be used in a real application;
## they are only used here to make the examples run quickly!
example_mcmc_control <- list(n.burnin = 10, n.iteration = 100, 
     n.thin = 1)

testpath <- file.path(tempdir(), "jagsboralmodel.txt")


## Example 1 - model with X variables, two latent variables, and no row effects
spiderfit_nb <- boral(y, X = X, family = "negative.binomial", 
     lv.control = list(num.lv = 2), 
     save.model = TRUE, mcmc.control = example_mcmc_control,
     model.name = testpath)

## Partition variance for each species into that explained by covariates 
## and by the latent variables
dovar <- calc.varpart(spiderfit_nb)

## Consider the intercept and first two covariates in X as one group, 
## and remaining four covariates in X as another group, 
## then partition variance for each species based on these groups.
dovar <- calc.varpart(spiderfit_nb, groupX = c(1,1,1,2,2,2,2))


## Example 1b - model with X variables, two latent variables, and 
## species-specific random intercepts at a so-called region level
spiderfit_nb <- boral(y, X = X, family = "negative.binomial", 
    lv.control = list(num.lv = 2),
    ranef.ids = data.frame(subregion = rep(1:7,each=4)), 
    save.model = TRUE, mcmc.control = example_mcmc_control, 
    model.name = testpath) 

## Partition variance for each species into that explained by covariates, 
## by the latent variables, and by the species-specific random intercepts
dovar <- calc.varpart(spiderfit_nb)

## Consider the intercept and first two covariates in X as one group, 
## and remaining four covariates in X as another group, 
## then partition variance for each species based on these groups.
dovar <- calc.varpart(spiderfit_nb, groupX = c(1,1,1,2,2,2,2))


## Example 2 - model fitted to count data, no site effects, and
## two latent variables, plus traits included to explain environmental responses
data(antTraits)
y <- antTraits$abun
X <- as.matrix(scale(antTraits$env))
## Include only traits 1, 2, and 5
traits <- as.matrix(antTraits$traits[,c(1,2,5)])
example_which_traits <- vector("list",ncol(X)+1)
for(i in seq_along(example_which_traits)) 
     example_which_traits[[i]] <- 1:ncol(traits)
## Just for fun, the regression coefficients for the second column of X,
## corresponding to the third element in the list example_which_traits,
## will be estimated separately and not regressed against traits.
example_which_traits[[3]] <- 0

fit_traits <- boral(y, X = X, traits = traits, which.traits = example_which_traits, 
    family = "negative.binomial", mcmc.control = example_mcmc_control, 
    save.model = TRUE, model.name = testpath)

## Partition variance for each species due to covariates in X 
## and latent variables. Also calculate proportion of variance 
## due to the covariates which can be explained by traits 
dovar <- calc.varpart(fit_traits)

## End(Not run)

boral documentation built on March 12, 2021, 5:07 p.m.