Nothing
#' Morphological disparity for one or more groups of specimens
#'
#' Function estimates morphological disparity and performs pairwise comparisons among groups
#'
#' The function estimates morphological disparity and performs pairwise comparisons to identify differences
#' among groups. Morphological disparity is estimated as the Procrustes variance, overall or for groups,
#' using residuals of a linear model fit. Procrustes variance is the same sum of the diagonal elements
#' of the group covariance matrix divided by the number of observations in the group (e.g., Zelditch et al. 2012).
#' The function takes as input a formula to describe the linear model fit,
#' plus a formulaic indication of groups (e.g., ~ groups). It is assumed that the formula describes shape data that
#' have been GPA-aligned [e.g., \code{\link{gpagen}}], although the function can work with any multivariate data.
#'
#' Partial disparities (Foote 1993) can also be calculated, but only for model formulas containing only an intercept
#' (e.g., coords ~ 1). Partial disparity has the same numerator as Procrustes variance (with respect to an overall mean)
#' but the denominator is N - 1 for all N observations, rather than n, the group size. (The sum of all group n equals N.)
#' Partial disparities have the appeal that the sum of group partial disparities it the total disparity.
#'
#' Absolute differences in Procrustes variances are test statistics that can be used to test differences
#' in morphological disparity among groups. These differences are statistically evaluated through permutation,
#' where the vectors of residuals are randomized among groups. The function can be used to obtain disparity for the whole
#' dataset by using "a dummy group factor "~ 1" as the right-hand portion of the formula, in which case only Procrustes
#' variance is returned. Additionally, if the right-hand portion of the formula only contains (continuous) covariates,
#' e.g., "~ Csize", Procrustes variance will be calculated for the whole data set or groups, after accounting for the
#' linear regression described. Finally, different factors can be indicated in the formula and for groups, if one wishes to
#' compare morphological disparities for groups comprising only a portion of or collapsing of the groups in a more complex model
#' (see examples).
#'
#' This function can be used with an object of class "procD.lm" or "lm.rrpp", if such analyses have already been performed.
#' This is especially useful for analyses performed with \code{\link{procD.pgls}}. In this case, residuals obtained from PGLS estimation
#' of coefficients, rather than OLS estimation, will be used in the analysis. Thus, one can account for phylogeny when comparing
#' morphological disparity among groups. However, one should be aware that PGLS finds the PGLS residuals and transforms them via the
#' phylogenetic covariance matrix in order to calculate variance. Thus, disparity (dispersion in tangent space) and variance (dispersion
#' in transformed space, after accounting for phylogeny) can be considered different things. This function uses an argument,
#' \code{transform}, to allow users to use either GLS residuals in tangent space or transformed residuals in the transformed space to
#' calculate disparity. The former characterizes the dispersion of actual shapes around GLS means; the latter estimates GLS variances,
#' by group.
#'
#' \subsection{Notes for geomorph 3.1.0 and subsequent versions}{
#' The function \code{\link[RRPP]{pairwise}} in the \code{RRPP} package can also be used to evaluate morphological
#' disparity, and will yield results identical to those of the current function. A simple example is shown below
#' }
#'
#' @param f1 A formula describing the linear model used. The left-hand portion of the formula should be
#' a 3D array (p x k x n) containing Procrustes shape variables for a set of specimens, or a matrix (n x variables).
#' The right-hand portion of the formula should be " ~1" to use the overall mean, or "~ x1 + x2 + x3 +...", where each x is a
#' covariate or factor. (Interactions and nested terms also work.) Alternatively, one can use an object of class "procD.lm" or
#' "lm.rrpp", which already has a formula defined. This is especially helpful for analyses performed with \code{\link{procD.pgls}},
#' as residuals from PGLS will be used for analysis (see examples).
#' @param groups Either a formula designating groups, e.g., groups = ~ groups, or a factor, or
#' a vector coercible to factor. If NULL and if f1 is a procD.lm or lm.rrpp model fit, morphol.disparity
#' will attempt to define groups based on the terms of the model. If there are no groups inherently
#' indicated in f1 and groups is NULL, a single Procrustes variance will be returned for the entire data set.
#' @param partial A logical value to indicate whether partial disparities should be calculated, sensu
#' Foote (1993). If TRUE, the model formula should have only an intercept (e.g., coords ~ 1); otherwise an error
#' will be returned.
#' @param transform A logical value to indicate whether disparity should be measured in the transformed
#' space rather than the tangent space. This is only applicable if a PGLS model is used, in which case
#' the transformation would be necessary for a variance that is independent of phylogeny. If transform = FALSE,
#' disparity tracks the dispersion one would observe in a phylo-PC plot. If transform = TRUE, disparity tracks
#' the dispersion in a PC plot of the transformed space (lacking phylogenetic signal).
#' See Collyer and Adams 2021, and examples below.
#' @param iter Number of iterations for permutation test
#' @param seed An optional argument for setting the seed for random permutations of the resampling procedure.
#' If left NULL (the default), the exact same P-values will be found for repeated runs of the analysis (with the same number of iterations).
#' If seed = "random", a random seed will be used, and P-values will vary. One can also specify an integer for specific seed values,
#' which might be of interest for advanced users.
#' @param data A data frame for the function environment, see \code{\link{geomorph.data.frame}}
#' @param print.progress A logical value to indicate whether a progress bar should be printed to the screen.
#' This is helpful for long-running analyses.
#' @param ... Arguments passed on to procD.lm (typically associated with the lm function,
#' such as weights or offset, plus SS.type).
#' @keywords analysis
#' @export
#' @author Emma Sherratt and Michael Collyer
#' @return Objects of class "morphol.disparity" return a list with the following components
#' (if groups are specified):
#' \item{Procrustes.var}{Observed Procrustes variances.}
#' \item{PV.dist}{Observed pairwise absolute differences (as distances)
#' among group Procrustes variances.}
#' \item{PV.dist.Pval}{P-values associated with pairwise differences.}
#' \item{random.PV.dist}{Pairwise distance matrices produced in the resampling procedure.}
#' \item{permutations}{Number of random permutations in resampling procedure.}
#' \item{partial}{Logical value to indicate if partial disparities were calculated.}
#' \item{call}{The match call}
#'
#' @references Zelditch, M. L., D. L. Swiderski, H. D. Sheets, and W. L. Fink. 2012. Geometric morphometrics
#' for biologists: a primer. 2nd edition. Elsevier/Academic Press, Amsterdam.
#' @references Foote, M. 1993. Contributions of individual taxa to overall morphological disparity.
#' Paleobiology, 19: 403-419.
#' @references Collyer, M.L and D.C. Adams, 2021. Phylogenetically Aligned component analysis.
#' Methods in Ecology and Evolution, 12: 369-372.
#'
#' @examples
#' \dontrun{
#'
#' data(plethodon)
#' Y.gpa <- gpagen(plethodon$land, print.progress = FALSE) #GPA-alignment
#' gdf <- geomorph.data.frame(Y.gpa, species = plethodon$species,
#' site = plethodon$site)
#'
#' # Morphological disparity for entire data set
#' morphol.disparity(coords ~ 1, groups = NULL, data = gdf,
#' print.progress = FALSE)
#'
#' # Morphological disparity for entire data set, accounting for allometry
#' morphol.disparity(coords ~ Csize, groups= NULL, data = gdf,
#' print.progress = FALSE)
#'
#' # Morphological disparity without covariates, using overall mean
#' morphol.disparity(coords ~ 1, groups= ~ species * site, data = gdf,
#' print.progress = FALSE)
#'
#' # Morphological partial disparities for overal mean
#' morphol.disparity(coords ~ 1, groups= ~ species * site, partial = TRUE,
#' data = gdf, print.progress = FALSE)
#'
#' # Morphological disparity without covariates, using group means
#' morphol.disparity(coords ~ species * site, groups= ~species * site,
#' data = gdf, print.progress = FALSE)
#'
#' # Morphological disparity of different groups than those
#' # described by the linear model
#' morphol.disparity(coords ~ Csize + species * site, groups= ~ species,
#' data = gdf, print.progress = FALSE)
#'
#' # Extracting components
#' MD <- morphol.disparity(coords ~ Csize + species * site, groups= ~ species,
#' data = gdf, print.progress = FALSE)
#' MD$Procrustes.var # just the Procrustes variances
#'
#'
#' ### Morphol.disparity can be used with previously-defined
#' ### procD.lm or lm.rrpp class objects
#'
#' data(plethspecies)
#' Y.gpa <- gpagen(plethspecies$land) #GPA-alignment
#' gp.end <- factor(c(0,0,1,0,0,1,1,0,0)) #endangered species vs. rest
#' names(gp.end) <- plethspecies$phy$tip
#'
#' gdf <- geomorph.data.frame(Y.gpa, phy = plethspecies$phy,
#' gp.end = gp.end)
#'
#' pleth.ols <- procD.lm(coords ~ Csize + gp.end,
#' data = gdf) # ordinary least squares
#' pleth.pgls <- procD.pgls(coords ~ Csize + gp.end, phy = phy,
#' data = gdf) # phylogenetic generalized least squares
#'
#' summary(pleth.ols)
#' summary(pleth.pgls)
#'
#' morphol.disparity(f1 = pleth.ols, groups = ~ gp.end, data = gdf,
#' print.progress = FALSE)
#' morphol.disparity(f1 = pleth.pgls, groups = ~ gp.end,
#' transform = FALSE, data = gdf,
#' print.progress = FALSE) # disparity in tangent space
#' morphol.disparity(f1 = pleth.pgls, groups = ~ gp.end,
#' transform = TRUE, data = gdf,
#' print.progress = FALSE) # disparity in transformed space
#'
#' # Three plots that correspond to the three tests
#' PCA <- gm.prcomp(Y.gpa$coords, phy = plethspecies$phy)
#' pPCA <- gm.prcomp(Y.gpa$coords, phy = plethspecies$phy,
#' GLS = TRUE, transform = FALSE)
#' tpPCA <- gm.prcomp(Y.gpa$coords, phy = plethspecies$phy,
#' GLS = TRUE, transform = TRUE)
#'
#' par(mfrow = c(1,3))
#'
#' # Phylomorphospace
#' PC.plot <- plot(PCA, pch = 19, phylo = TRUE, main = "PCA-OLS")
#' shapeHulls(PC.plot, groups = gp.end)
#'
#' # Phylo-PCA
#' pPC.plot <- plot(pPCA, pch = 19, phylo = TRUE, main = "pPCA - GLS, not transformed")
#' shapeHulls(pPC.plot, groups = gp.end)
#'
#' # Transformed phylo-PCA
#' tpPC.plot <- plot(tpPCA, pch = 19, phylo = TRUE, main = "tpPCA - GLS, transformed")
#' shapeHulls(tpPC.plot, groups = gp.end)
#'
#' par(mfrow = c(1,1))
#'
#' ### Variance using RRPP (not necessarily the same as disparity)
#' PW <- pairwise(pleth.ols, groups = gp.end)
#' summary(PW, test.type = 'var')
#' PW2 <- pairwise(pleth.pgls, groups = gp.end)
#' summary(PW2, test.type = 'var')
#' }
morphol.disparity <- function(f1, groups = NULL, partial = FALSE, transform = FALSE,
iter = 999, seed = NULL,
data = NULL, print.progress = TRUE, ...){
if(inherits(f1, "lm")) {
R <- as.matrix(lm$residuals)
form <- formula(f1)
Terms <- terms(form)
df <- f1$model
k <- length(attr(Terms, "term.labels"))
if(k == 0) int.model <- TRUE else
int.model = FALSE
}
if(inherits(f1, "lm.rrpp")) {
if(f1$LM$gls) R <- as.matrix(f1$LM$gls.residuals) else
R <- as.matrix(f1$LM$residuals)
df <- f1$LM$data
form <- f1$LM$form
Terms <- f1$LM$Terms
k <- length(attr(Terms, "term.labels"))
if(k == 0) int.model <- TRUE else
int.model = FALSE
Pcov <- if(f1$LM$gls) f1$LM$Pcov else NULL
if(f1$LM$gls && is.null(f1$LM$Pcov) &&!is.null(f1$LM$Cov))
Pcov <- Cov.proj(f1$LM$Cov)
}
if(inherits(f1, "formula")) {
form <- f1
Terms <- terms(form)
k <- length(attr(Terms, "term.labels"))
if(k == 0) int.model <- TRUE else
int.model = FALSE
fit <- procD.lm(form, data = data, seed = seed, print.progress = print.progress, iter = 0, ...)
if(fit$LM$gls) R <- as.matrix(fit$LM$gls.residuals) else
R <- as.matrix(fit$LM$residuals)
df <- fit$LM$data
}
if(!inherits(f1, c("lm", "lm.rrpp", "formula")))
stop("f1 is neither a formula, lm object, nor lm.rrpp object.\n", call. = FALSE)
d <- rowSums(R^2)
N <- length(d)
if(inherits(f1, "lm") || inherits(f1, "lm.rrpp")){
if(is.null(groups) && !int.model) {
if(length(attr(Terms, "term.labels") > 0)) {
cat("\n *** Attempting to define groups from terms in the model fit.",
"If results are peculiar, define groups precisely and check model formula.)\n\n")
fac.check <- which(sapply(df, is.factor))
if(length(fac.check) == 0) groups <- NULL else {
fac.dat <- as.matrix(df[fac.check])
groups <- factor(apply(fac.dat, 1, function(x) paste(x, collapse = ".")))
}
}
}
}
if(!is.null(groups) && inherits(groups, "formula")){
if(length(groups) > 2) groups <- groups[-2]
gp.var <- all.vars(groups)
gps <- try(mget(gp.var, envir = as.environment(df)), silent = TRUE)
if(inherits(gps, "try-error"))
gps <- try(mget(gp.var, envir = as.environment(data)), silent = TRUE)
if(inherits(gps, "try-error"))
gps <- try(mget(gp.var, envir = parent.frame()), silent = TRUE)
if(inherits(gps, "try-error")) stop("Cannot find groups either in data frame or global environment.\n",
call. = FALSE)
groups <- factor(apply(as.data.frame(gps), 1, function(x) paste(x, collapse = ".")))
}
if(!is.null(groups)) {
if(!is.factor(groups) && !is.vector(groups))
stop("Groups must be a factor, a vector coercible to factor, or a formula.\n", call. = FALSE)
}
if(!is.null(groups)) groups <- as.factor(groups)
if(!int.model && partial)
stop("\n It is not possible to calculate partial disparities unless the model formula contains only an intercept; e.g., coords ~ 1\n")
if(is.null(groups)) {
pv <- sum(d)/N
if(partial) pv <- pv * N / (N - 1)
} else {
if(print.progress) {
cat("\n\nPerformimg pairwise comparisons of disparity\n")
pb <- txtProgressBar(min = 0, max = iter+1, initial = 0, style=3)
}
newDf <- data.frame(d = d, groups = groups)
fit <- lm.rrpp(d ~ groups + 0, iter = 0, data = newDf, print.progress = FALSE,
seed = seed, ...)
Q <- getModels(fit, "qr")
Q <- Q$full[[length(Q$full)]]
if(transform){
if(!is.null(Pcov)) {
Q <- QRforX(Pcov %*% fit$LM$X, reduce = FALSE)
}
}
H <- tcrossprod(solve(Q$R), Q$Q)
ind <- perm.index(N, iter, seed)
pv <- sapply(1:(iter + 1), function(j){
step <- j
if(print.progress) setTxtProgressBar(pb,step)
as.matrix(H %*% d[ind[[j]]])
})
if(print.progress) close(pb)
if(partial){
gpn <- as.vector(by(groups, groups, length))
part.mat <- matrix(gpn, length(gpn), NCOL(pv))/(N - 1)
pv <- pv * part.mat
}
}
if(!is.null(groups)) {
pvd <- lapply(1:(iter+1), function(j) as.matrix(dist(matrix(pv[, j]))))
for(i in 1:(iter + 1)) dimnames(pvd[[i]]) <- list(rownames(pv), rownames(pv))
p.val <- Reduce("+", lapply(1:(iter+1), function(j){
x <- pvd[[1]]
y <- pvd[[j]]
ifelse(y >= x, 1, 0)
}))/(iter + 1)
} else {
pvd <- p.val <- permutations <- NULL
}
if(is.null(groups)) {
pv.obs <- pv
cat("No factor in formula or model terms from which to define groups.\n")
cat("Procrustes variance")
if(partial) cat("(Foote's disparity)")
cat(":\n")
cat(pv, "\n")
return(pv.obs)
} else {
PV.dist = pvd[[1]]
dimnames(PV.dist) <- dimnames(p.val) <- list(levels(groups), levels(groups))
pv.obs <- pv[,1]
names(pv.obs) <- levels(groups)
out <- list(Procrustes.var = pv.obs, PV.dist = PV.dist, PV.dist.Pval = p.val,
random.PV.dist = pvd, permutations = iter+1,
partial = partial, call = match.call())
class(out) <-"morphol.disparity"
out
}
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.