Fit Co-Correspondence Analysis Ordination Models


Description

coca is used to fit Co-Correspondence Analysis (CoCA) models. It can fit predictive or symmetric models to two community data matrices containing species abundance data.

Usage

coca(y, ...)

## Default S3 method:
coca(y, x, method = c("predictive", "symmetric"),
     reg.method = c("simpls", "eigen"), weights = NULL,
     n.axes = NULL, symmetric = FALSE, ...)

## S3 method for class 'formula'
coca(formula, data, method = c("predictive", "symmetric"),
     reg.method = c("simpls", "eigen"), weights = NULL,
     n.axes = NULL, symmetric = FALSE, ...)

Arguments

y

a data frame containing the response community data matrix.

x

a data frame containing the predictor community data matrix.

formula

a symbolic description of the model to be fit. The details of model specification are given below.

data

an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which coca is called.

method

a character string indicating which co-correspondence analysis method to use. One of "predictive" (default) or "symmetric"; can be abbreviated.

reg.method

One of "simpls" (default) or "eigen". If method is "predictive" then reg.method controls whether the co-correspondence analysis should be fitted using the SIMPLS algorithm or via an eigen analysis.

weights

a vector of length nrow(y) of user-supplied weights for R_0. If weights = NULL (default), the weights are determined from y, or from x and y (symmetric = TRUE only).

n.axes

the number of CoCA axes to extract. If NULL (default), n.axes is min(ncol(y), ncol(x), nrow(y), nrow(x)) - 1.

symmetric

if method is "symmetric" then symmetric determines whether weights for R_0 are symmetric and taken as the average of the row sums of x and y (symmetric = TRUE). If symmetric = FALSE (default) then the weights R_0 are taken as the row sums of y unless user defined weights are provided via argument weights. Ignored if method is "predictive".

...

additional arguments to be passed to lower level methods.
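
The defaults described for n.axes and for the row weights R_0 can be sketched in base R. This is only an illustration of the rules as stated above, not the code coca itself runs; the toy matrices y and x and their values are made up, and coca may rescale the weights internally.

```r
## Toy abundance data: 4 sites (rows), species in columns (made-up numbers)
y <- data.frame(sp1 = c(2, 0, 1, 3), sp2 = c(0, 4, 1, 1), sp3 = c(1, 1, 0, 2))
x <- data.frame(pl1 = c(5, 1, 0, 2), pl2 = c(0, 3, 2, 2))

## Default number of axes, as documented for n.axes
n.axes <- min(ncol(y), ncol(x), nrow(y), nrow(x)) - 1

## Row weights R_0 as documented:
## symmetric = FALSE (default): row sums of the response y
R0_default <- rowSums(y)
## symmetric = TRUE: average of the row sums of x and y
R0_symmetric <- (rowSums(x) + rowSums(y)) / 2

n.axes
R0_default
R0_symmetric
```

A user-supplied weights vector would take the place of R0_default and must have one entry per row of y.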

Details

coca is the main user-callable function.

A typical model has the form response ~ terms where response is the (numeric) response data frame and terms is a series of terms which specifies a linear predictor for response. A typical form for terms is ., which is shorthand for "all variables" in data. If . is used, data must also be provided. If specific species (variables) are required then terms should take the form spp1 + spp2 + spp3.
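
How . and an explicit list of species expand can be illustrated with base R's terms(). This is a sketch using the standard formula machinery; coca uses its own modified parser, so details may differ. The data frames resp and pred and their column names are hypothetical.

```r
## Hypothetical response and predictor community data (made-up numbers)
resp <- data.frame(b1 = c(1, 0, 2), b2 = c(0, 3, 1))
pred <- data.frame(spp1 = c(4, 1, 0), spp2 = c(0, 2, 2), spp3 = c(1, 1, 5))

## "." expands to every variable in `data`
all_terms <- attr(terms(resp ~ ., data = pred), "term.labels")

## Naming species explicitly restricts the predictor set
some_terms <- attr(terms(resp ~ spp1 + spp3), "term.labels")

## Without `data`, "." cannot be expanded and terms() signals an error
no_data <- inherits(try(terms(resp ~ .), silent = TRUE), "try-error")

all_terms
some_terms
no_data
```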

The default is to fit a predictive CoCA model using SIMPLS via a modified version of simpls.fit from package pls. Alternatively, reg.method = "eigen" fits the model using an older eigen analysis version of the SIMPLS algorithm, which is about 100% slower (roughly twice the run time) than reg.method = "simpls".
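
The way the vector-valued defaults for method and reg.method resolve, including abbreviation, follows the usual match.arg() idiom. The sketch below assumes coca resolves its arguments this way; resolve_methods is a hypothetical stand-in written for illustration, not a function from any package.

```r
## Hypothetical stand-in mirroring the method/reg.method defaults in Usage
resolve_methods <- function(method = c("predictive", "symmetric"),
                            reg.method = c("simpls", "eigen")) {
  ## match.arg() picks the first choice when the argument is left at its
  ## default and allows unambiguous abbreviations otherwise
  method <- match.arg(method)
  reg.method <- match.arg(reg.method)
  c(method = method, reg.method = reg.method)
}

resolve_methods()                     # both arguments left at their defaults
resolve_methods(method = "sym")       # abbreviated "symmetric"
resolve_methods(reg.method = "eigen") # predictive CoCA via eigen analysis
```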

Value

coca returns a list; which of the components below it contains depends on method and reg.method.

nam.dat

list with components namY and namX containing the names of the response and the predictor(s) respectively.

call

the matched call.

method

the CoCA method used, one of "predictive" or "symmetric".

scores

the species and site scores of the fitted model.

loadings

the site loadings of the fitted model for the response and the predictor. (Predictive CoCA via SIMPLS only.)

fitted

the fitted values for the response. A list with 2 components: Yhat (the fitted values) and Yhat1 (the transformed fitted values). (Predictive CoCA via SIMPLS only.)

varianceExp

list with components Yblock and Xblock containing the variances in the response and the predictor respectively, explained by each fitted PLS axis. (Predictive CoCA via SIMPLS only.)

totalVar

list with components Yblock and Xblock containing the total variance in the response and the predictor respectively. (Predictive CoCA via SIMPLS only.)

lambda

the eigenvalues of the analysis.

n.axes

the number of fitted axes.

Ychi

a list containing the mean-centered chi-square matrices for the response (Ychi1) and the predictor (Ychi2). (Predictive CoCA only.)

R0

the (possibly user-supplied) row weights used in the analysis.

X

X-Matrix (symmetric CoCA only).

residuals

Residuals of a symmetric model (symmetric CoCA only).

inertia

list with components total and residual containing the total and residual inertia for the response and the predictor (symmetric CoCA only).

rowsum

a list with the row sums for the response (rsum1) and the predictor (rsum2) (symmetric CoCA only).

colsum

a list with the column sums for the response (csum1) and the predictor (csum2) (symmetric CoCA only).
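
Because the components returned depend on method and reg.method, it can help to check which of the documented components a given fit actually carries. The helper below is a hypothetical convenience written for this page, not part of any package, and mock_fit is a made-up stand-in list rather than real coca output.

```r
## Hypothetical helper: split the documented component names into those
## present in, and those absent from, a fitted object
coca_components <- function(fit) {
  documented <- c("nam.dat", "call", "method", "scores", "loadings", "fitted",
                  "varianceExp", "totalVar", "lambda", "n.axes", "Ychi", "R0",
                  "X", "residuals", "inertia", "rowsum", "colsum")
  list(present = intersect(documented, names(fit)),
       absent  = setdiff(documented, names(fit)))
}

## Stand-in object with made-up values, used only to show the helper
mock_fit <- list(call = quote(coca(y, x)), method = "symmetric",
                 n.axes = 2, lambda = c(0.25, 0.11))
str(coca_components(mock_fit))
```

With a real model, for instance bp.sym from the Examples, coca_components(bp.sym) would be called the same way.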

Author(s)

Original Matlab code by C.J.F. ter Braak and A.P. Schaffers. R port by Gavin L. Simpson. Formula method for coca uses a modified version of ordiParseFormula by Jari Oksanen to handle formulae.

References

ter Braak, C.J.F. and Schaffers, A.P. (2004) Co-Correspondence Analysis: a new ordination method to relate two community compositions. Ecology 85(3), 834–846.

See Also

crossval for cross-validation and permutest.coca for permutation tests to determine the number of PLS axes to retain for predictive CoCA.

summary.predcoca and summary.symcoca for summary methods.

Examples

## symmetric CoCA
library(cocorresp)
data(beetles)
## log transform the beetle data
beetles <- log(beetles + 1)
data(plants)
## fit the model
bp.sym <- coca(beetles ~ ., data = plants, method = "symmetric")
bp.sym
summary(bp.sym)
plot(bp.sym)

## predictive CoCA using SIMPLS and formula interface
bp.pred <- coca(beetles ~ ., data = plants)
## should retain only the useful PLS components for a parsimonious model

## Leave-one-out crossvalidation - this takes a while

## Not run: 
crossval(beetles, plants)
## so 2 axes are sufficient
## permutation test to assess significant PLS components - takes a while
bp.perm <- permutest(bp.pred, permutations = 99)
bp.perm
## End(Not run)

## agrees with the Leave-one-out cross-validation
## refit the model with only 2 PLS components
bp.pred <- coca(beetles ~ ., data = plants, n.axes = 2)
bp.pred
summary(bp.pred)
plot(bp.pred)

## predictive CoCA using Eigen-analysis
data(bryophyte)
data(vascular)
carp.pred <- coca(y = bryophyte, x = vascular, reg.method = "eigen")
carp.pred

## determine important PLS components - takes a while
## Not run: 
crossval(bryophyte, vascular)
(carp.perm <- permutest(carp.pred, permutations = 99))
## End(Not run)

## 2 components again, refit
carp.pred <- coca(y = bryophyte, x = vascular,
                  reg.method = "eigen", n.axes = 2)
carp.pred
## plot
plot(carp.pred)
