# Fits a fourth corner model for abundance as a function of environmental variables and species traits.

### Description

Fits a fourth corner model: a model to study how variation in environmental response across taxa can be explained by their traits. Almost any predictive model can be used as the fitting function. The default is a generalised linear model; another good option is to add a LASSO penalty via `glm1path`. Overdispersed counts are handled via `family="negative.binomial"`, which is the default `family` argument.

### Usage


### Arguments

| Argument | Description |
|---|---|
| `L` | A data frame (or matrix) containing the abundances for each taxon (columns) across all sites (rows). |
| `R` | A data frame (or matrix) of environmental variables (columns) across all sites (rows). |
| `Q` | A data frame (or matrix) of traits (columns) across all taxa (rows). If not specified, a different environmental response will be fitted for each taxon. |
| `family` | The family of the response variable, see `family`. |
| `formula` | A one-sided formula specifying exactly how to model abundance as a function of environmental and trait variables (as found in `R` and `Q`). |
| `method` | The function to use to fit the model. Default is `manyglm`. |
| `composition` | Logical. `TRUE` includes a row effect in the model, adjusting for different sampling intensities across different samples. This can be understood as a compositional term, in the sense that all other terms then model relative abundance at a site. `FALSE` (default) does not include a row effect, hence the model is of absolute abundance. |
| `col.intercepts` | Logical. `TRUE` (default) includes a column effect in the model, to adjust for different levels of abundance of different response (column) variables. `FALSE` removes this column effect. |
| `...` | Arguments passed to the function specified at `method`. |

### Details

This function fits a fourth corner model, that is, a model to predict abundance across several taxa (stored in `L`) as a function of environmental variables (`R`) and traits (`Q`). The environment-trait interaction can be understood as the fourth corner, giving the set of coefficients that describe how environmental response across taxa varies as traits vary. A species effect is included in the model (i.e. a different intercept term for each species), so that traits are used to explain patterns in relative abundance across taxa, not patterns in absolute abundance.

The actual function used to fit the model is determined by the user through the `method` argument. The default is to use `manyglm` to fit a GLM, although for predictive modelling it might be better to use a LASSO penalty, as in `glm1path` and `cv.glm1path`. In `glm1path`, the penalty used for BIC calculation is `log(dim(L)[1])`, i.e. log(number of sites).
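As a small illustration of the BIC penalty described above, the following base R sketch computes `log(dim(L)[1])` for a made-up abundance matrix and contrasts it with the log of the total number of cells. The toy matrix and the variable names are assumptions for illustration only; nothing here calls the package itself.

```r
set.seed(1)
# Hypothetical abundance matrix: 30 sites (rows) by 4 taxa (columns).
L <- matrix(rpois(30 * 4, lambda = 2), nrow = 30, ncol = 4)

# Penalty as described above: log(number of sites).
bic_penalty <- log(dim(L)[1])

# For comparison only: log(sites x taxa), i.e. the length of the vectorised
# response. This is NOT the penalty described above.
cell_penalty <- log(length(L))

c(sites = bic_penalty, all_cells = cell_penalty)
```

Because the penalty is based on the number of sites rather than the number of site-by-taxon cells, it is smaller than one computed from the full vectorised response.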

The model is fitted by vectorising `L`, then constructing a big matrix from repeated values of `R`, `Q`, their quadratic terms (if required) and interactions. Hence this function will hit memory issues if any of these matrices are large, and fitting can be slow (especially if using `cv.glm1path`). If `formula` is left unspecified, the design matrix is constructed using all environmental variables and traits specified in `R` and `Q`, quadratic terms for any of these variables that are quantitative, and all environment-trait interactions, after standardising these variables. Specifying a one-sided `formula` as a function of the variables in `R` and `Q` instead gives the user control over the precise model that is fitted, and drops the internal standardisations. The arguments `composition` and `col.intercepts` optionally add terms to the model for row and column total abundance, irrespective of whether a `formula` has been specified.
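The vectorisation step can be sketched in base R as follows. This is only an illustration of the idea under simple assumptions (toy data, column-by-column stacking, no standardisation); it is not the package's internal code, and the variable names `temp` and `size` are hypothetical.

```r
# Toy data: 3 sites, 2 taxa (all names and values are made up for illustration).
L <- matrix(c(5, 0, 2,
              1, 3, 0), nrow = 3,
            dimnames = list(paste0("site", 1:3), c("spA", "spB")))
R <- data.frame(temp = c(10, 15, 20))   # one environmental variable, one row per site
Q <- data.frame(size = c(1.2, 3.4))     # one trait, one row per taxon

n_sites <- nrow(L)
n_taxa  <- ncol(L)

# Vectorise L column by column: all sites for taxon 1, then all sites for taxon 2.
y <- as.vector(L)

# Repeat R once per taxon, and each row of Q once per site, so rows line up with y.
R_long <- R[rep(seq_len(n_sites), times = n_taxa), , drop = FALSE]
Q_long <- Q[rep(seq_len(n_taxa), each = n_sites), , drop = FALSE]

long <- data.frame(abund = y, R_long, Q_long, row.names = NULL)

# Main effects, quadratic terms and the environment-trait interaction.
X <- model.matrix(~ temp + I(temp^2) + size + I(size^2) + temp:size, data = long)
dim(X)
```

The long-format design has `nrow(L) * ncol(L)` rows, which is why memory use grows quickly with the number of sites, taxa, and predictors.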

Note: when specifying a `formula`, if there are no penalties on coefficients (as for `manyglm`), then main effects for `R` can be excluded if including row effects (via `composition=TRUE`), and main effects for `Q` can be excluded if including column effects (via `col.intercepts=TRUE`), because those terms are redundant (trying to explain main effects for row/column when these main effects are already in the model). If using penalised likelihood (as in `glm1path` and `cv.glm1path`) or a random effects model, by all means include main effects as well as row/column effects, and the penalties will sort out which terms to use.
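The redundancy mentioned in the note can be checked numerically. The base R sketch below uses a hypothetical long-format layout (4 sites, 3 taxa, one made-up trait `size`) and compares the rank of a design with taxon intercepts only against one that also adds a trait main effect. It is an illustration of the argument, not a reproduction of how the package builds its design matrix.

```r
# Hypothetical long-format layout: 4 sites x 3 taxa, one trait value per taxon.
n_sites <- 4
taxa <- c("spA", "spB", "spC")
long <- data.frame(
  taxon = factor(rep(taxa, each = n_sites)),
  size  = rep(c(1.0, 2.5, 4.0), each = n_sites)  # constant within each taxon
)

# Column (taxon) intercepts only, in the spirit of col.intercepts = TRUE.
X_col <- model.matrix(~ taxon, data = long)

# Column intercepts plus a main effect for the trait.
X_both <- model.matrix(~ taxon + size, data = long)

# The trait adds a column but no new information: the rank does not increase.
c(cols_col = ncol(X_col),  rank_col = qr(X_col)$rank,
  cols_both = ncol(X_both), rank_both = qr(X_both)$rank)
```

Since the trait is constant within a taxon, it is a linear combination of the taxon intercept columns, so an unpenalised fit cannot estimate its main effect separately. The same reasoning applies to main effects for `R` when row effects are included.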

If trait matrix `Q` is not specified, default behaviour will fit a different environmental response for each taxon (and the outcome will be very similar to `manyglm(L~R)`). This can be understood as a fourth corner model where species identities are used as the species traits (i.e. no attempt is made to explain differences across species).
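To see what "species identities as traits" means for the design matrix, here is a minimal base R sketch with hypothetical toy data (3 sites, 2 taxa, one environmental variable `temp`). Using the taxon factor in place of a trait yields a separate intercept and a separate environmental slope for each taxon. This is an assumption-laden illustration of the idea rather than the package's own construction.

```r
# Hypothetical long-format data: 3 sites x 2 taxa.
long <- data.frame(
  taxon = factor(rep(c("spA", "spB"), each = 3)),
  temp  = rep(c(10, 15, 20), times = 2)
)

# Taxon intercepts plus a taxon-specific slope for the environmental variable.
X <- model.matrix(~ taxon + taxon:temp, data = long)

# One slope column per taxon appears among the column names.
colnames(X)
```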

These functions inherit default behaviour from their fitting functions, e.g. use `plot` for a Dunn-Smyth residual plot from a traits model fitted using `manyglm` or `glm1path`.

### Value

Returns a `traitglm` object, a list that contains at least the following components:

- ...
  Exactly what is included in the output depends on the fitting function. By default, a `manyglm` object is returned, so all the usual `manyglm` output is included (coefficients, residuals, deviance, etc).

- family
  A `family` object matching the final model.

- fourth.corner
  A matrix of fourth corner coefficients. If `formula` has been manually entered, this will be a vector, not a matrix.

- R.des
  The reduced-size design matrix for environmental variables, including further components:

  - X
    Data frame of (possibly standardised) environmental variables.

  - X.squ
    A data frame containing the leading term in a quadratic expression (where appropriate) for environmental variables.

  - var.type
    A vector with the same dimension as the number of columns of X, listing the type of each environmental variable (`"quantitative"` or `"factor"`).

  - coefs
    Coefficients used in transforming variables to orthogonality. These are used later to make predictions.

- Q.des
  The reduced-size design matrix for traits, set up as for `R.des`.

- spp.penalty
  For LASSO fits: a vector of the same length as the final design matrix, indicating which variables had a penalty imposed on them in model fitting.

- L
  The data frame of abundances specified as input.

- any.penalty
  Logical: whether any penalty is applied to parameters at all (not the case when using a `manyglm` fit).

- scaling
  A list of coefficients describing the standardisations of variables used in analyses. Stored for later use when making predictions.

- call
  The original `traitglm` call.

### Author(s)

David I. Warton <David.Warton@unsw.edu.au>

### References

Brown AM, Warton DI, Andrew NR, Binns M, Cassis G and Gibb H (2014) The fourth-corner solution - using predictive models to understand how species traits interact with the environment, Methods in Ecology and Evolution 5, 344-352.

Warton DI, Shipley B & Hastie T (2015) CATS regression - a model-based approach to studying trait-based community assembly, Methods in Ecology and Evolution 6, 389-398.

### See Also

`glm1path`, `glm1`, `manyglm`, `family`, `residuals.manyglm`, `plot.manyany`

### Examples

```
library(mvabund)
data(antTraits)
ft = traitglm(antTraits$abund, antTraits$env, antTraits$traits, method="manyglm")
ft$fourth.corner # print fourth corner terms
# for a pretty picture of fourth corner coefficients, uncomment the following lines:
# library(lattice)
# a = max( abs(ft$fourth.corner) )
# colort = colorRampPalette(c("blue","white","red"))
# plot.4th = levelplot(t(as.matrix(ft$fourth.corner)), xlab="Environmental Variables",
#   ylab="Species traits", col.regions=colort(100), at=seq(-a, a, length=100),
#   scales = list( x= list(rot = 45)))
# print(plot.4th)
plot(ft) # for a Dunn-Smyth residual plot
qqnorm(residuals(ft)); abline(0, 1, col="red") # for a normal quantile plot
# predict to the first five sites
predict(ft, newR=antTraits$env[1:5,])
# refit using LASSO and fewer variables, including row effects and only two interaction terms:
ft1 = traitglm(antTraits$abund, antTraits$env[,3:4], antTraits$traits[,c(1,3)],
  formula=~Shrub.cover:Femur.length+Shrub.cover:Pilosity, composition=TRUE, method="glm1path")
ft1$fourth.corner # notice the LASSO penalty has shrunk one interaction to zero
```