| dc_CA | R Documentation |
Double constrained correspondence analysis (dc-CA) for analyzing
(multi-)trait (multi-)environment ecological data using library vegan
and native R code. It has a formula interface which allows one to assess,
for example, the importance of trait interactions in shaping ecological
communities. The function dc_CA has an option to divide the abundance
data of a site by the site total, giving equal site weights. This division
has the advantage that the multivariate analysis corresponds with an
unweighted (multi-trait) community-level analysis, instead of a weighted one
(Kleyer et al. 2012, ter Braak and van Rossum, 2025).
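As a minimal base-R sketch of what dividing by site totals does (the toy abundance values and the object names below are made up for illustration and are not part of the package):

```r
# Toy abundance table: 3 sites (rows) x 4 species (columns); values are made up.
Y <- matrix(c(10,  0,  5,  5,
               1,  1,  0,  2,
               0, 30, 30, 40),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))

# Site weights implied by the raw table: proportional to the site totals.
w_raw <- rowSums(Y) / sum(Y)

# Closure: divide the abundances of each site (row) by the site total.
Y_closed <- Y / rowSums(Y)

# After closure every site total is 1, so the implied site weights are equal.
w_closed <- rowSums(Y_closed) / sum(Y_closed)

print(round(w_raw, 3))
print(w_closed)
```

With the raw table the third site dominates because it has the largest total; after closure all three sites count equally.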
dc_CA(
formulaEnv = NULL,
formulaTraits = NULL,
response = NULL,
dataEnv = NULL,
dataTraits = NULL,
divideBySiteTotals = NULL,
dc_CA_object = NULL,
env_explain = TRUE,
use_vegan_cca = FALSE,
verbose = TRUE
)
formulaEnv |
two-sided or one-sided formula for the rows (samples) with
row predictors in |
formulaTraits |
formula or one-sided formula for the columns (species)
with column predictors in |
response |
matrix, data frame of the abundance data
(dimension n x m) or list with community weighted means (CWMs)
from |
dataEnv |
matrix or data frame of the row predictors, with rows
corresponding to those in |
dataTraits |
matrix or data frame of the column predictors, with rows
corresponding to the columns in |
divideBySiteTotals |
logical; default |
dc_CA_object |
optional object from an earlier run of this function.
Useful if the same formula for the columns ( |
env_explain |
logical (default |
use_vegan_cca |
default |
verbose |
logical for printing a simple summary (default: TRUE) |
Empty (all zero) rows and columns in response are removed from the
response, together with the corresponding rows of dataEnv and
dataTraits. Subsequently, any columns with missing values are
removed from dataEnv and dataTraits. An error
('name_of_variable' not found) results if variables with missing entries are
specified in formulaEnv or formulaTraits.
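The cleaning order described above can be imitated in base R. This is only a sketch under assumptions: the data values are made up, the object names are hypothetical, and the function's internal code may differ.

```r
# Made-up data: site2 and species sp2 and sp4 are empty (all zero).
Y <- matrix(c(3, 0, 2, 0,
              0, 0, 0, 0,
              1, 0, 4, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))
env    <- data.frame(Moist = c(1, 2, 5), Use = c(NA, 1, 2),
                     row.names = rownames(Y))
traits <- data.frame(SLA = c(20, 25, 18, 30), Height = c(10, NA, 40, 5),
                     row.names = colnames(Y))

# Step 1: drop empty rows and columns, and the matching predictor rows.
keep_rows <- rowSums(Y) > 0
keep_cols <- colSums(Y) > 0
Y_clean      <- Y[keep_rows, keep_cols, drop = FALSE]
env_clean    <- env[keep_rows, , drop = FALSE]
traits_clean <- traits[keep_cols, , drop = FALSE]

# Step 2: only then drop predictor columns that still contain missing values.
env_clean    <- env_clean[, !vapply(env_clean, anyNA, logical(1)), drop = FALSE]
traits_clean <- traits_clean[, !vapply(traits_clean, anyNA, logical(1)),
                             drop = FALSE]

print(names(env_clean))
print(names(traits_clean))
```

In this toy case Use is dropped because its missing value belongs to a retained site, whereas Height survives because its only missing value belonged to the empty species sp2. A formula that mentions Use would then refer to a variable that no longer exists.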
Computationally, dc-CA can be carried out by a single singular value
decomposition (ter Braak et al. 2018), but it is here computed in two steps.
In the first step, the transpose of the response is regressed on to
the traits (the column predictors) using cca with
formulaTraits. The column scores of this analysis (in scaling 1) are
community weighted means (CWM) of the orthonormalized traits. These are then
regressed on the environmental (row) predictors using wrda
with formulaEnv or using rda, if site weights
are equal.
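A deliberately simplified, single-trait analogue of the two steps can be written in base R. It is a sketch, not the function's algorithm: it uses one raw (not orthonormalized) trait, one environmental variable and lm in place of cca and wrda, and all values and names are made up.

```r
# Made-up abundance table: 4 sites (rows) x 4 species (columns).
Y <- matrix(c(10,  0,  5,  5,
               1,  1,  0,  2,
               0, 30, 30, 40,
               4,  4,  4,  4),
            nrow = 4, byrow = TRUE)
trait <- c(20, 25, 18, 30)  # one made-up trait value per species
moist <- c(1, 2, 5, 3)      # one made-up environmental value per site

# Site weights proportional to site totals (the unequal-weights case).
site_w <- rowSums(Y) / sum(Y)

# Step 1 analogue: community weighted mean (CWM) of the trait per site.
CWM <- as.vector((Y / rowSums(Y)) %*% trait)

# Step 2 analogue: weighted regression of the CWMs on the environment.
fit <- lm(CWM ~ moist, weights = site_w)

print(CWM)
print(coef(fit))
```

If the abundance data had first been divided by the site totals, site_w would be constant and the weighted regression would reduce to an ordinary one, which mirrors the switch from wrda to rda.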
A dc-CA can be carried out on what statisticians call the sufficient
statistics of the method. This is useful when the abundance data are not
available or cannot be made public in a paper attempting reproducible
research. In this case, response should be a list
with as first element the community weighted means
(e.g. list(CWM = CWMs)) with respect to the
traits, and the trait data, and, optionally, further list elements, for functions
related to dc_CA. The minimum is a
list(CWM = CWMs, weight = list(columns = species_weights)) with CWM a matrix
or data.frame, but then formulaEnv, formulaTraits,
dataEnv, dataTraits must be specified in the call to
dc_CA. The function fCWM_SNC and its example
show how to set the
response for this and help to create the response from
abundance data in these non-standard applications of dc-CA. Species and site
weights, if not set in response$weights, can be set by a variable
weight in the data frames dataTraits and dataEnv,
respectively, but the formulas should then not be ~.
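The minimal list mentioned above could be assembled by hand roughly as follows. This is a hedged sketch: the data are made up, the CWMs are taken with respect to the raw traits, and the species weights are assumed to be column totals scaled to sum to one. The function fCWM_SNC remains the documented way to build such a response.

```r
# Made-up abundance and trait data, used only to derive the summaries.
Y <- matrix(c(10,  0,  5,  5,
               1,  1,  0,  2,
               0, 30, 30, 40),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))
traits <- data.frame(SLA = c(20, 25, 18, 30), Height = c(10, 60, 40, 5),
                     row.names = colnames(Y))

# Community weighted means: abundance-weighted trait averages per site.
CWMs <- (Y / rowSums(Y)) %*% as.matrix(traits)

# Assumption: species weights are column totals scaled to sum to one.
species_weights <- colSums(Y) / sum(Y)

# Minimal response list, with element names as given in the text above.
response <- list(CWM = CWMs, weight = list(columns = species_weights))
str(response)
```

Such a list would be passed as response, together with formulaEnv, formulaTraits, dataEnv and dataTraits, so that the abundance table itself need not be shared.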
The statistics and scores in the example dune_dcCA.r have been
checked against the results in Canoco 5.15 (ter Braak & Šmilauer, 2018).
A list of class dcca; that is a list with elements
a cca.object from the
cca analysis of the transpose of the closed
response using formula formulaTraits.
the argument formulaTraits. If the formula was
~., it was changed to explicit trait names.
a list of Y, dataEnv and dataTraits,
after removing empty rows and columns in response and after closure if
divideBySiteTotals = TRUE and with the corresponding rows in
dataEnv and dataTraits removed.
a list of unit-sum weights of columns and rows. The names of
the list are c("columns", "rows"), in that order.
number of sites (rows).
Community weighted means w.r.t. orthonormalized traits.
a wrda object or
cca.object from the
wrda or, if with equal row weights,
rda analysis, respectively of the column scores of the
cca, which are the CWMs of orthonormalized traits, using formula
formulaEnv.
the argument formulaEnv. If the formula was
~., it was changed to explicit environmental variable names.
the dc-CA eigenvalues (same as those of the
rda analysis).
mean, sd, VIF and (regression) coefficients of
the traits that define the dc-CA axes in terms of the
traits with t-ratios missing indicated by NAs for 'tval1'.
a one-column matrix with, at most, six inertias (weighted variances):
total: the total inertia.
conditionT: the inertia explained by the condition in
formulaTraits if present (neglecting row constraints).
traits_explain: the trait-structured variation, i.e.
the inertia explained by the traits (without constraints on the rows and
conditional on the Condition in formulaTraits).
This is the maximum that the row predictors could explain in dc-CA
(the sum of the last two items is thus less than this value).
env_explain: the environmentally structured variation, i.e.
the inertia explained by the environment (without constraints on the
columns but conditional on the Condition formulaEnv).
This is the maximum that the column predictors could explain in
dc-CA (the item constraintsTE is thus less than this value).
The value is NA, if there is collinearity in the environmental data.
conditionTE: the trait-constrained variation explained by the
condition in formulaEnv.
constraintsTE: the trait-constrained variation explained by the predictors (without the row covariates).
If verbose is TRUE (or after out <- print(out) is
invoked) there are three more items.
c_traits_normed: mean, sd, VIF and (regression) coefficients of
the traits that define the dc-CA trait axes (composite traits), and their
optimistic t-ratio.
c_env_normed: mean, sd, VIF and (regression) coefficients of
the environmental variables that define the dc-CA axes in terms of the
environmental variables (composite gradients), and their optimistic t-ratio.
species_axes: a list with four items
species_scores: a list with names
c("species_scores_unconstrained", "lc_traits_scores") with the
matrix with species niche centroids along the dc-CA axes (composite
gradients) and the matrix with linear combinations of traits.
correlation: a matrix with inter-set correlations of the
traits with their SNCs.
b_se: a matrix with (unstandardized) regression coefficients
for traits and their optimistic standard errors.
R2_traits: a vector with coefficient of determination (R2)
of the SNCs on to the traits. The square-root thereof could be called
the species-trait correlation in analogy with the species-environment
correlation in CCA.
sites_axes: a list with four items
site_scores: a list with names
c("site_scores_unconstrained", "lc_env_scores") with the matrix
with community weighted means (CWMs) along the dc-CA axes (composite
gradients) and the matrix with linear combinations of environmental
variables.
correlation: a matrix with inter-set correlations of the
environmental variables with their CWMs.
b_se: a matrix with (unstandardized) regression coefficients
for environmental variables and their optimistic standard errors.
R2_env: a vector with coefficient of determination (R2) of
the CWMs on to the environmental variables. The square-root thereof
has been called the species-environmental correlation in CCA.
All scores in the dcca object are in scaling "sites" (1):
the scaling with Focus on Case distances.
Kleyer, M., Dray, S., Bello, F., Lepš, J., Pakeman, R.J., Strauss, B., Thuiller, W. & Lavorel, S. (2012) Assessing species and community functional responses to environmental gradients: which multivariate methods? Journal of Vegetation Science, 23, 805-821. doi:10.1111/j.1654-1103.2012.01402.x
ter Braak, CJF, Šmilauer P, and Dray S. (2018). Algorithms and biplots for double constrained correspondence analysis. Environmental and Ecological Statistics, 25(2), 171-197. doi:10.1007/s10651-017-0395-x
ter Braak C.J.F. and P. Šmilauer (2018). Canoco reference manual and user's guide: software for ordination (version 5.1x). Microcomputer Power, Ithaca, USA, 536 pp.
ter Braak, C.J.F. and van Rossum, B. (2025). Linking Multivariate Trait Variation to the Environment: Advantages of Double Constrained Correspondence Analysis with the R Package Douconca. Ecological Informatics, 88. doi:10.1016/j.ecoinf.2025.103143
Oksanen, J., et al. (2024). vegan: Community Ecology Package. R package version 2.6-6.1. https://CRAN.R-project.org/package=vegan.
plot.dcca, scores.dcca,
print.dcca and anova.dcca
data("dune_trait_env")
# rownames are carried forward in results
rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites
abun <- dune_trait_env$comm[, -1] # must delete "Sites"
mod <- dc_CA(formulaEnv = abun ~ A1 + Moist + Mag + Use + Manure,
formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan,
dataEnv = dune_trait_env$envir,
dataTraits = dune_trait_env$traits,
verbose = FALSE)
print(mod) # same output as with verbose = TRUE (the default of verbose).
anova(mod, by = "axis")
# For more demo on testing, see demo dune_test.r
mod_scores <- scores(mod)
# correlation of axes with a variable that is not in the model
scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot"))
cat("head of unconstrained site scores, with meaning\n")
print(head(mod_scores$sites))
mod_scores_tidy <- scores(mod, tidy = TRUE)
print("names of the tidy scores")
print(names(mod_scores_tidy))
cat("\nThe levels of the tidy scores\n")
print(levels(mod_scores_tidy$score))
cat("\nFor illustration: a dc-CA model with a trait covariate\n")
mod2 <- dc_CA(formulaEnv = abun ~ A1 + Moist + Mag + Use + Manure,
formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass),
dataEnv = dune_trait_env$envir,
dataTraits = dune_trait_env$traits)
cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n")
mod3 <- dc_CA(formulaEnv = abun ~ A1 + Moist + Use + Manure + Condition(Mag),
formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass),
dataEnv = dune_trait_env$envir,
dataTraits = dune_trait_env$traits,
verbose = FALSE)
cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ",
"as the trait model and data did not change\n")
mod3B <- dc_CA(formulaEnv = abun ~ A1 + Moist + Use + Manure + Condition(Mag),
dataEnv = dune_trait_env$envir,
dc_CA_object = mod2,
verbose= FALSE)
cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n",
"the expected difference is in the component 'call'\n ")
print(all.equal(mod3[-c(5,12)], mod3B[-c(5,12)])) # only the component call differs
print(mod3$inertia[-c(3,5),]/mod3B$inertia) # and mod3 has two more inertia items