ordinate: Ordination tool for data aligned to another matrix

ordinate R Documentation

Description

Performs a singular value decomposition of ordinary least squares (OLS) or generalized least squares (GLS) residuals, aligned to an alternative matrix, and projects the data onto the resulting vectors.

Usage

ordinate(
  Y,
  A = NULL,
  Cov = NULL,
  transform. = TRUE,
  scale. = FALSE,
  tol = NULL,
  rank. = NULL,
  newdata = NULL
)

Arguments

Y

An n x p data matrix.

A

An optional n x n symmetric matrix or an n x k data matrix, where k is the number of variables that could be associated with the p variables of Y. If NULL, an n x n identity matrix will be used.

Cov

An optional n x n covariance matrix to describe the non-independence among observations in Y, and provide a GLS-centering of data. Note that Cov and A can be the same, if one wishes to align GLS residuals to the same matrix used to obtain them. Note also that no explicit GLS-centering is performed on A. If this is desired, A should be GLS-centered beforehand.

transform.

An optional logical argument, used only if a covariance matrix is provided. If TRUE, GLS-centered residuals are transformed; if FALSE, only GLS-centering is performed. Only if transform. = TRUE (the default) can one expect the variances of ordinate scores in a principal component analysis to match eigenvalues.

scale.

A logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is FALSE.

tol

A value indicating the magnitude below which components should be omitted. (Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component.) With the default NULL setting, no components are omitted (unless rank. is provided). Other settings for tol could be tol = sqrt(.Machine$double.eps), which would omit essentially constant components, or tol = 0, to retain all components, even if redundant. This argument is exactly the same as in prcomp.

rank.

Optionally, a number specifying the maximal rank, i.e., the maximal number of aligned components to be used. This argument can be set as an alternative to, or in addition to, tol. It is notably useful when the desired rank is considerably smaller than the dimensions of the matrix. This argument is exactly the same as in prcomp.
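
The interplay of tol and rank. described above can be sketched in base R. This is a minimal illustration of the prcomp-style rule, not the code used inside ordinate; the helper name retain and the example standard deviations are hypothetical.

```r
# Hypothetical helper: which component standard deviations are retained
# under the prcomp-style tol and rank. rules (an illustration only).
retain <- function(sdev, tol = NULL, rank. = NULL) {
  # rank. caps the number of components
  k <- if (is.null(rank.)) length(sdev) else min(rank., length(sdev))
  # tol drops components with sdev <= tol * (sdev of the first component)
  if (!is.null(tol)) k <- min(k, sum(sdev > tol * sdev[1]))
  sdev[seq_len(k)]
}

# Made-up standard deviations; the last component is essentially constant
sdev <- c(2, 1, 0.5, 1e-9)

all_kept   <- retain(sdev)                                  # NULL tol: keep all
no_const   <- retain(sdev, tol = sqrt(.Machine$double.eps)) # drop constant one
zero_tol   <- retain(sdev, tol = 0)                         # keep all, even redundant
two_only   <- retain(sdev, rank. = 2)                       # cap at two components
both_rules <- retain(sdev, tol = 0.3, rank. = 3)            # tol is the tighter rule here
```

Setting rank. alone avoids computing a threshold at all, which is why it is convenient when only a few leading components are wanted.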

newdata

An optional data frame of values for the same variables as Y, to be projected onto aligned components. This is only possible with OLS (transform. = FALSE).
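
Projection of new observations in the OLS case amounts to centering them with the means of the original data and multiplying by the loadings. The sketch below shows this with prcomp from base R as a stand-in, on simulated data; it assumes ordinate follows the same logic for OLS and does not reproduce its internals.

```r
set.seed(5)

# Simulated original data and new observations for the same three variables
Y <- matrix(rnorm(30 * 3), 30, 3,
            dimnames = list(NULL, c("v1", "v2", "v3")))
new_Y <- matrix(rnorm(5 * 3), 5, 3,
                dimnames = list(NULL, colnames(Y)))

pc <- prcomp(Y)

# Center the new data with the ORIGINAL means, then project onto the loadings
xn <- sweep(new_Y, 2, pc$center) %*% pc$rotation

# Reference projection from predict.prcomp
ref <- predict(pc, newdata = new_Y)

max_diff <- max(abs(xn - ref))
```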

Details

The function performs a singular value decomposition, t(A)Z = UDt(V), where Z is a matrix of residuals (obtained from Y; see below) and A is an alignment matrix with the same number of rows as Z (t indicates matrix transposition). U and V are the matrices of left and right singular vectors, respectively, and D is a diagonal matrix of singular values. The columns of V are the vectors that describe maximized covariation between Y and A. If A = I, an n x n identity matrix, the columns of V are the eigenvectors (principal components) of the covariance matrix of Y.
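
The special case A = I can be checked with base R alone. The following sketch uses simulated data and OLS centering; it illustrates the decomposition described above and is not the function's own code.

```r
set.seed(1)
n <- 20; p <- 4
Y <- matrix(rnorm(n * p), n, p)

# Z: OLS-centered data (column means removed)
Z <- scale(Y, center = TRUE, scale = FALSE)

# A: n x n identity alignment matrix
A <- diag(n)

# Singular value decomposition of t(A) %*% Z = U D t(V)
s <- svd(crossprod(A, Z))
V <- s$v

# Reference principal components
pc <- prcomp(Y)

# Singular vectors match the prcomp loadings up to sign
max_diff <- max(abs(abs(V) - abs(pc$rotation)))

# Squared singular values divided by (n - 1) match the component variances
var_diff <- max(abs(s$d^2 / (n - 1) - pc$sdev^2))
```

Replacing A with any other matrix that has n rows changes which directions of Z are emphasized, which is the sense in which the components are "aligned".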

Z represents a centered and potentially standardized form of Y. This function can center data via OLS or GLS means (the latter if a covariance matrix to describe the non-independence among observations is provided). If standardizing variables is preferred, then Z both centers and scales the vectors of Y by their standard deviations.
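
The difference between OLS and GLS centering can be sketched as follows. The GLS mean used here is the standard estimator (1' C^-1 1)^-1 1' C^-1 Y; the covariance matrix is simulated, and the standardization step is shown for the OLS case only, since how the function scales GLS residuals is not specified here.

```r
set.seed(2)
n <- 10; p <- 3
Y <- matrix(rnorm(n * p), n, p)

# Simulated covariance among observations (symmetric, positive definite)
B <- matrix(rnorm(n * n), n, n)
Cov <- crossprod(B) + diag(n)

ones <- matrix(1, n, 1)
Cinv <- solve(Cov)

# OLS means: ordinary column means
mu_ols <- colMeans(Y)

# GLS means: (1' C^-1 1)^-1 1' C^-1 Y
mu_gls <- drop(solve(t(ones) %*% Cinv %*% ones) %*% t(ones) %*% Cinv %*% Y)

# Centered residuals under each estimator
Z_ols <- sweep(Y, 2, mu_ols)
Z_gls <- sweep(Y, 2, mu_gls)

# Optional standardization (OLS case): divide columns by their standard deviations
Z_std <- sweep(Z_ols, 2, apply(Z_ols, 2, sd), "/")
```

OLS residuals sum to zero within each column, whereas GLS residuals sum to zero only after weighting by the inverse of the covariance matrix.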

Data are projected onto aligned vectors, ZV. If a GLS computation is made, the option to transform centered values (residuals) before projection is available. Transformation is required for an orthogonal projection, but the projection is then made from a transformed data space. Not transforming residuals maintains the Euclidean distances among observations and the OLS multivariate variance, but the projection is oblique (scores can be correlated).
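
The trade-off between transformed and untransformed residuals can be illustrated with one common construction, in which the transformation matrix is the inverse square root of the covariance matrix. This is an assumption made for illustration with simulated data and A = I; the function's internal transformation may differ in detail.

```r
set.seed(3)
n <- 12; p <- 3
Y <- matrix(rnorm(n * p), n, p)
B <- matrix(rnorm(n * n), n, n)
Cov <- crossprod(B) + diag(n)   # simulated covariance among observations

# GLS-centered residuals
ones <- matrix(1, n, 1)
Cinv <- solve(Cov)
mu <- solve(t(ones) %*% Cinv %*% ones) %*% t(ones) %*% Cinv %*% Y  # 1 x p
Z <- Y - ones %*% mu

# Assumed transformation matrix: P = Cov^(-1/2), via eigendecomposition
e <- eigen(Cov, symmetric = TRUE)
P <- e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
TZ <- P %*% Z

# Vectors obtained from the transformed residuals
V <- svd(TZ)$v

scores_t <- TZ %*% V   # transformed residuals projected (orthogonal)
scores_u <- Z %*% V    # untransformed residuals projected (oblique)

# Off-diagonal cross-products of the scores
offdiag <- function(M) M[upper.tri(M)]
cross_t <- offdiag(crossprod(scores_t))
cross_u <- offdiag(crossprod(scores_u))

# Distances among observations are unchanged by projecting untransformed residuals
dist_diff <- max(abs(as.vector(dist(scores_u)) - as.vector(dist(Z))))
```

In this sketch the cross-products among transformed scores vanish, while the untransformed scores keep the original inter-observation distances at the cost of non-zero cross-products.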

The versatility of using an alignment approach is that alternative data space rotations are possible. Principal components are thus the vectors that maximize variance with respect to the data themselves, but "components" of (co)variation can be described for any inter-matrix relationship, including phylogenetic signal, ecological signal, ontogenetic signal, size allometry, etc. More details are provided in Collyer and Adams (2021).

Much of this function is consistent with the prcomp function, except that centering data is not an option (it is required).

SUMMARY STATISTICS: For principal component plots, the traditional statistics to summarize the analysis include eigenvalues (variance by component), proportion of variance by component, and cumulative proportion of variance. When data are aligned to an alternative matrix, the statistics are less straightforward. A summary of such an analysis (performed with summary.ordinate) will produce these additional statistics:

  • Singular Value: Rather than eigenvalues, the singular values from the singular value decomposition of the cross-product of the scaled alignment matrix and the data.

  • Proportion of Covariance: Each component's singular value divided by the sum of singular values. The cumulative proportion is also returned. Note that these values do not measure the amount of covariance between the alignment matrix and data; they describe how that covariance is distributed among components. Large proportions can be misleading.

  • RV by Component: The partial RV statistic by component. Cumulative values are also returned. The sum of partial RVs is Escoufier's RV statistic, which measures the amount of covariation between the alignment matrix and data. Caution should be used in interpreting these values, which can vary with the number of observations and number of variables. However, the RV is more reliable than the proportion of singular value for interpreting the strength of linear association for aligned components. (It is most analogous to proportion of variance for principal components.)
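
The relationship among these statistics can be sketched in base R, assuming A is a centered n x k data matrix and using the standard definition of Escoufier's RV. The data are simulated, and the partition of RV by component shown here (squared singular values over the RV denominator) is an assumption about how partial RVs are formed, not a copy of summary.ordinate.

```r
set.seed(4)
n <- 15
Z <- scale(matrix(rnorm(n * 4), n, 4), scale = FALSE)  # centered data
A <- scale(matrix(rnorm(n * 3), n, 3), scale = FALSE)  # centered alignment matrix

# Covariance blocks
S12 <- crossprod(A, Z) / (n - 1)   # between A and Z
S11 <- crossprod(A) / (n - 1)      # within A
S22 <- crossprod(Z) / (n - 1)      # within Z

# Singular values of the cross-covariance matrix
d <- svd(S12)$d

# Proportion of covariance: distribution of singular values among components
prop_cov <- d / sum(d)
cum_prop_cov <- cumsum(prop_cov)

# Escoufier's RV = tr(S12 S21) / sqrt(tr(S11^2) tr(S22^2));
# for a symmetric S, sum(S^2) equals tr(S %*% S)
denom <- sqrt(sum(S11^2) * sum(S22^2))
rv_total <- sum(diag(S12 %*% t(S12))) / denom

# Assumed partition by component: squared singular values over the same denominator
rv_by_comp <- d^2 / denom
cum_rv <- cumsum(rv_by_comp)
```

Because the proportions of covariance always sum to 1, they say nothing about how strong the association is, whereas the cumulative RV only reaches the total RV, which can be small.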

Value

An object of class ordinate is a list containing the following:

x

Aligned component scores for all observations

xn

Optional projection of new data onto components.

d

The portion of the squared singular values attributed to the aligned components.

sdev

Standard deviations of d; i.e., the scale of the components.

rot

The matrix of variable loadings, i.e. the singular vectors, V.

center

The OLS or GLS means vector used for centering.

transform

Whether GLS transformation was used in projection of residuals (only possible in conjunction with GLS-centering).

scale

The scaling used, or FALSE.

alignment

Whether data were aligned to principal axes or the name of another matrix.

GLS

A logical value to indicate whether GLS-centering and projection were used.

Author(s)

Michael Collyer

References

Collyer, M.L. and D.C. Adams. 2021. Phylogenetically-aligned Component Analysis. Methods in Ecology and Evolution. In press.

Revell, L. J. 2009. Size-correction and principal components for interspecific comparative studies. Evolution, 63:3258-3268.

See Also

plot.ordinate, prcomp, plot.default, gm.prcomp within geomorph

Examples


# Examples use residuals from a regression of salamander 
# morphological traits against body size (snout to vent length, SVL).
# Observations are species means and a phylogenetic covariance matrix
# describes the relatedness among observations.

library(RRPP)
data("PlethMorph")
Y <- as.data.frame(PlethMorph[c("TailLength", "HeadLength",
                                "Snout.eye", "BodyWidth",
                                "Forelimb", "Hindlimb")])
Y <- as.matrix(Y)
R <- lm.rrpp(Y ~ SVL, data = PlethMorph,
             iter = 0, print.progress = FALSE)$LM$residuals

# PCA (on correlation matrix)

PCA.ols <- ordinate(R, scale. = TRUE)
PCA.ols$rot
prcomp(R, scale. = TRUE)$rotation # should be the same

# phyPCA (sensu Revell, 2009)
# with projection of untransformed residuals (Collyer & Adams 2021)

PCA.gls <- ordinate(R, scale. = TRUE,
                    transform. = FALSE,
                    Cov = PlethMorph$PhyCov)

# phyPCA with transformed residuals (orthogonal projection,
# Collyer & Adams 2021)

PCA.t.gls <- ordinate(R, scale. = TRUE,
                      transform. = TRUE,
                      Cov = PlethMorph$PhyCov)
 
# Align to phylogenetic signal (in each case)

PaCA.ols <- ordinate(R, A = PlethMorph$PhyCov, scale. = TRUE)

PaCA.gls <- ordinate(R, A = PlethMorph$PhyCov,
                     scale. = TRUE,
                     transform. = FALSE,
                     Cov = PlethMorph$PhyCov)

PaCA.t.gls <- ordinate(R, A = PlethMorph$PhyCov,
                       scale. = TRUE,
                       transform. = TRUE,
                       Cov = PlethMorph$PhyCov)

# Summaries

summary(PCA.ols)
summary(PCA.gls)
summary(PCA.t.gls)
summary(PaCA.ols)
summary(PaCA.gls)
summary(PaCA.t.gls)

# Plots

par(mfrow = c(2,3))
plot(PCA.ols, main = "PCA OLS")
plot(PCA.gls, main = "PCA GLS")
plot(PCA.t.gls, main = "PCA t-GLS")
plot(PaCA.ols, main = "PaCA OLS")
plot(PaCA.gls, main = "PaCA GLS")
plot(PaCA.t.gls, main = "PaCA t-GLS")
par(mfrow = c(1,1))

# Changing some plot aesthetics (the arguments in plot.ordinate and
# plot.default are important for changing plot parameters)

P1 <- plot(PaCA.gls, main = "PaCA GLS", include.axes = TRUE)

P2 <- plot(PaCA.gls, main = "PaCA GLS", include.axes = TRUE,
           frame.plot = FALSE, col = 4, pch = 21, bg = PlethMorph$group)
add.tree(P2, PlethMorph$tree, edge.col = 4)

P3 <- plot(PaCA.gls, main = "PaCA GLS", include.axes = TRUE,
           frame.plot = FALSE, col = 4, pch = 21, bg = PlethMorph$group,
           flip = 1)
add.tree(P3, PlethMorph$tree, edge.col = 4)
 
 

RRPP documentation built on Aug. 16, 2023, 1:06 a.m.