betaTest | R Documentation |
For any lm.rrpp object, a vector of coefficients can be used for a specific test of a vector of betas (specific population parameters).
This test follows the form of (b - beta) in the numerator of a t-statistic, where
beta can be 0 or any other value. However, for this test, a vector (Beta) of length p
is used for the p variables in the lm.rrpp fit. If Beta is a vector of 0s, this
test is essentially the same as the test performed by coef.lm.rrpp. However,
it is possible to test null hypotheses for beta values other than 0, sensu Tejero-Cicuéndez et al. (2023).
This function can use either the square root of the inner product of vectors of coefficients (distance, d) or a generalized inner product based on the inverse of the residual covariance matrix (Mahalanobis distance, md) as the test statistic. In most cases, either will likely yield similar (or identical) P-values. However, Mahalanobis distance might be preferred for generalized least squares fits, which do not have consistent residual covariance matrices for null (intercept only) models over RRPP permutations (the distances are thus standardized by the residual covariances). If high-dimensional data are analyzed, a generalized inverse of the residual covariance matrix will be used because the covariance matrices are singular. Results based on Mahalanobis distances are less trustworthy in these cases.
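As a rough illustration of the two statistics described above, the following base R sketch computes d and md for a toy ordinary least squares fit. The data, the object names (fit_lm, delta, S), and the particular scaling of the residual covariance matrix are assumptions made for illustration; betaTest performs its own calculations over RRPP permutations and may differ in detail.

```r
# Toy multivariate regression: n observations, p = 2 response variables.
set.seed(1)
n <- 30
x <- rnorm(n)
Y <- cbind(y1 = 1.0 * x + rnorm(n), y2 = 0.5 * x + rnorm(n))

fit_lm <- lm(Y ~ x)          # ordinary least squares fit in base R
b <- coef(fit_lm)["x", ]     # observed slope vector (length p)
Beta <- c(1, 0.5)            # hypothesized values, one per response variable

delta <- b - Beta            # the (b - Beta) vector

# Euclidean distance: square root of the inner product of (b - Beta)
d <- sqrt(sum(delta^2))

# Mahalanobis-type distance: inner product generalized by the inverse of
# the residual covariance matrix (generic scaling, assumed for this sketch)
S <- crossprod(resid(fit_lm)) / fit_lm$df.residual
md <- sqrt(drop(t(delta) %*% solve(S) %*% delta))
```

If Beta were a vector of 0s, d would simply be the length of the coefficient vector itself.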
The coefficient number should be provided for specific tests. One can determine this with, e.g.,
coef(fit). If it is not provided (NULL), tests will be performed on all possible vectors of coefficients
(rows of coefficients matrix). These tests will be performed sequentially. If a null model is not specified,
then for each vector of coefficients, the corresponding parameter is dropped from the linear model
design matrix to make a null model.
This process is analogous in some ways to a leave-one-out
cross-validation (LOOCV) analysis, testing each coefficient against models containing parameters for all other
coefficients. For example, for a linear model fit, y ~ x1 + x2 + 1, where x1 and x2 are single-parameter
covariates,
the analysis would first drop the intercept, then x1, then x2, performing three sequential analyses. This
option could require large amounts of computation time for large models, high-dimensional data, many RRPP
permutations, or any combination of these.
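The parameter-dropping scheme described above can be sketched in base R as follows. This is only an illustration of how a null design matrix might be formed for each coefficient of y ~ x1 + x2 + 1; the data frame dat and the object names X_full and X_nulls are hypothetical and are not part of the RRPP interface.

```r
# Hypothetical data for the model y ~ x1 + x2 + 1
set.seed(2)
dat <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))

# Full design matrix with columns (Intercept), x1, x2
X_full <- model.matrix(y ~ x1 + x2, data = dat)

# One null design matrix per coefficient: drop that column, keep the rest
X_nulls <- lapply(seq_len(ncol(X_full)), function(j) {
  X_full[, -j, drop = FALSE]
})
names(X_nulls) <- colnames(X_full)

# Columns retained in each of the three null models
lapply(X_nulls, colnames)
```

Each element of X_nulls is named for the parameter that was dropped, so three sequential analyses would be needed for this model.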
The test results previously reported by coef.lm.rrpp can be reproduced by supplying X.null.
One has to be cognizant of the null model used for each coefficient, based on
which term it represents. The function reveal.model.designs could help determine
which terms to include in a null model. Such tests now have to be performed iteratively,
but they do not require verbose results from the initial lm.rrpp fit.
The test for coef.lm.rrpp uses the square-root of inner-products of vectors (d) as a
test statistic and only tests the null hypothesis that the length of the vector is 0.
The significance of the test is based on random values produced by RRPP, based on the
matrices of coefficients that are produced in all permutations. The null models for generating
RRPP distributions are consistent with those used for ANOVA, as specified in the
lm.rrpp fit by choice of SS type. Therefore, the random coefficients are
consistent with those produced by RRPP for generating random estimates used in ANOVA.
The betaTest analysis allows different null hypotheses to be used (vector length is not necessarily 0). Unless otherwise specified, it uses a null model that lacks one vector of parameters and a full model that contains all vectors of parameters, for the parameter for which coefficients are estimated. This is closest to a type III SS method of estimation, but each parameter is dropped from the model, rather than terms potentially comprising several parameters. Additionally, betaTest can calculate Mahalanobis distance, in addition to Euclidean distance, for vectors of coefficients. This statistic is probably better suited to more types of models (such as generalized least squares fits).
If data are high-dimensional (more variables than observations), or even highly multivariate,
using Mahalanobis distance can require significant computation time and will require
using a generalized inverse. One might wish to consider first whether using principal component
scores or other ordinate scores could achieve the same goal. (See ordinate.)
For example, one could use the first few principal components as a surrogate for a high-dimensional
trait, and test whether the surrogate trait is different from Beta. This would require that
the PC scores make sense compared to the original variables, but it would be more
computationally tractable.
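A minimal sketch of the surrogate-trait idea, using base R's prcomp rather than any RRPP function, might look like the following. The simulated matrix Y_high, the choice of k = 3 components, and the object names are assumptions for illustration only.

```r
set.seed(3)
n <- 15
p <- 40                                # more variables than observations
Y_high <- matrix(rnorm(n * p), n, p)   # hypothetical high-dimensional trait

pca <- prcomp(Y_high)                  # prcomp centers the data by default
k <- 3                                 # number of PCs kept (arbitrary here)
Y_pc <- pca$x[, seq_len(k), drop = FALSE]

# Proportion of total variance captured by the retained PCs
var_kept <- sum(pca$sdev[seq_len(k)]^2) / sum(pca$sdev^2)
```

A matrix like Y_pc could then serve as the response in a model fit, in which case Beta would need k values expressed in the PC space rather than in the original variables.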
To the extent that is possible, tests for GLS estimated coefficients should use Mahalanobis distance. The reason is that the covariance matrix for the data (not to be confused with the residual covariance matrix of a linear model) might not be consistent across RRPP permutations. To assure that random distances are comparable in terms of scale, a generalized (Mahalanobis) distance is safer. However, this can impose a computational burden for high-dimensional data (see above).
betaTest(
  fit,
  X.null = NULL,
  include.md = FALSE,
  coef.no = NULL,
  Beta = NULL,
  print.progress = FALSE
)
fit | Object from lm.rrpp.
X.null | Optional object that is either a linear model design matrix or a model fit from lm.rrpp.
include.md | A logical value for whether to include Mahalanobis distances in the results. For highly multivariate data, this will slow down computations significantly.
coef.no | The row or rows of a matrix of coefficients for which to perform the test. This can be learned by performing coef(fit), prior to the test. If left NULL, the analysis will cycle through every possible vector of coefficients (rows of a coefficients matrix).
Beta | A single value (for univariate data) or a numeric vector with length equal to the number of variables used in the fit object. If left NULL, 0 is used for each parameter. This should not be a matrix. If one wishes to use different Beta vectors for different coefficients, then multiple tests should be performed. (Because tests are performed sequentially, multiple tests using the same Beta vector produce results that are the same as for multiple rows of coefficients, using the same Beta vector.)
print.progress | A logical value for whether to print test progress to the screen. This might be useful if a large number of coefficient vectors are tested, so that one can track completion.
Function returns a list with the following components:
obs.d | Length of the observed b - Beta vector.
obs.md | The observed b - Beta vector length, after accounting for the residual covariance matrix; the Mahalanobis distance.
Beta | Hypothesized beta values in the Beta vector.
obs.B.mat | The observed matrix of coefficients (before subtracting Beta).
coef.no | The rows of the observed matrix of coefficients, for which to subtract Beta.
random.stats | Random distances produced with RRPP.
Michael Collyer
Tejero-Cicuéndez, H., I. Menéndez, A. Talavera, G. Riaño, B. Burriel-Carranza, M. Simó-Riudalbas, S. Carranza, and D.C. Adams. 2023. Evolution along allometric lines of least resistance: Morphological differentiation in Pristurus geckos. Evolution. 77:2547–2560.
coef.lm.rrpp
## Not run:
data(PlethMorph)
fit <- lm.rrpp(TailLength ~ SVL,
               data = PlethMorph,
               verbose = TRUE)
## Allometry test (Beta = 0)
T1 <- betaTest(fit, coef.no = 2, Beta = 0)
summary(T1)
# Including Mahalanobis distance
T1 <- betaTest(fit, coef.no = 2,
               Beta = 0, include.md = TRUE)
summary(T1)
# compare to
coef(fit, test = TRUE)
# Note that if Beta is not provided
T1 <- betaTest(fit, coef.no = 2)
summary(T1)
# Note that if coef.no is not provided
T1 <- betaTest(fit)
summary(T1)
# Note that if X.null is provided
# drop = FALSE keeps the intercept column as a one-column design matrix
T1 <- betaTest(fit, X.null = model.matrix(fit)[, 1, drop = FALSE],
               coef.no = 2)
summary(T1)
## Isometry test (Beta = 1)
# Failure to reject H0 suggests isometric-like association.
T2 <- betaTest(fit, coef.no = 2, Beta = 1)
summary(T2)
## More complex tests
# Multiple covariates
fit2 <- lm.rrpp(HeadLength ~ SVL + TailLength,
                data = PlethMorph,
                SS.type = "II",
                verbose = TRUE)
fit.null1 <- lm.rrpp(HeadLength ~ SVL,
                     data = PlethMorph,
                     verbose = TRUE)
fit.null2 <- lm.rrpp(HeadLength ~ TailLength,
                     data = PlethMorph,
                     verbose = TRUE)
## allometries
T3 <- betaTest(fit2, fit.null2, coef.no = 2, Beta = 0)
T4 <- betaTest(fit2, fit.null1, coef.no = 3, Beta = 0)
summary(T3)
summary(T4)
# compare to
coef(fit2, test = TRUE)
## isometries
T5 <- betaTest(fit2, fit.null2, coef.no = 2, Beta = 1)
T6 <- betaTest(fit2, fit.null1, coef.no = 3, Beta = 1)
summary(T5)
summary(T6)
# Intercept test
T7 <- betaTest(fit2, fit.null1, coef.no = 1)
summary(T7)
# multivariate data
PlethMorph$Y <- cbind(PlethMorph$HeadLength, PlethMorph$TailLength)
fit3 <- lm.rrpp(Y ~ SVL,
                data = PlethMorph,
                verbose = TRUE)
T8 <- betaTest(fit3, coef.no = 2, Beta = c(0, 0))
T9 <- betaTest(fit3, coef.no = 2, Beta = c(1, 1))
summary(T8)
summary(T9)
## GLS example
fit4 <- lm.rrpp(TailLength ~ SVL,
                data = PlethMorph,
                Cov = PlethMorph$PhyCov,
                verbose = TRUE)
T10 <- betaTest(fit4, include.md = TRUE)
summary(T10)
# compare to
coef(fit4, test = TRUE)
anova(fit4)
## End(Not run)