colindiag | R Documentation |
Perform a (multi)collinearity diagnostic of a correlation matrix of predictor variables using several indicators, as shown by Olivoto et al. (2017).
colindiag(.data, ..., by = NULL, n = NULL)
.data |
The data to be analyzed. It must be a symmetric correlation
matrix, or a data frame, possibly with grouped data passed from
|
... |
Variables to use in the correlation. If |
by |
One variable (factor) to compute the function by. It is a shortcut
to |
n |
If a correlation matrix is provided, then |
If .data is grouped data passed from dplyr::group_by(), the results are returned in a list-column of data frames. Otherwise, a list with the following components is returned:
cormat A symmetric matrix of Pearson's correlation coefficients between the variables.
corlist A hypothesis test for each of the correlation coefficients.
evalevet The eigenvalues and associated eigenvectors of the correlation matrix.
indicators A data.frame with the following indicators:
VIF
The Variance Inflation Factors, being the diagonal elements of
the inverse of the correlation matrix.
cn
The Condition Number of the correlation matrix, given by the
ratio between the largest and smallest eigenvalue.
det
The determinant of the correlation matrix.
ncorhigh
Number of correlations greater than 0.8 in absolute value.
largest_corr
The largest correlation (in absolute value) observed.
smallest_corr
The smallest correlation (in absolute value) observed.
weight_var
The variables with the largest eigenvector elements (largest weights)
for the smallest eigenvalue, sorted in decreasing order of weight.
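The indicators listed above can be reproduced approximately with base R alone. The following is a minimal sketch that follows the definitions given on this page; it is not metan's internal code, and the object names (`vif`, `cn`, `det_cor`, `w`) are illustrative only. The format in which `colindiag()` reports `largest_corr`, `smallest_corr`, and `weight_var` may differ from the plain values computed here.

```r
# Base-R sketch of the indicators, assuming the definitions in this page.
cor_iris <- cor(iris[, 1:4])

# VIF: diagonal elements of the inverse of the correlation matrix
vif <- diag(solve(cor_iris))

# Condition number: ratio between the largest and smallest eigenvalue
eig <- eigen(cor_iris, symmetric = TRUE)
cn <- max(eig$values) / min(eig$values)

# Determinant of the correlation matrix
det_cor <- det(cor_iris)

# Absolute pairwise correlations (lower triangle, diagonal excluded)
r <- abs(cor_iris[lower.tri(cor_iris)])
ncorhigh <- sum(r > 0.8)
largest_corr <- max(r)
smallest_corr <- min(r)

# Absolute weights of each variable in the eigenvector associated with
# the smallest eigenvalue, sorted in decreasing order
w <- abs(eig$vectors[, which.min(eig$values)])
names(w) <- colnames(cor_iris)
weight_var <- names(sort(w, decreasing = TRUE))

print(round(vif, 2))
print(c(cn = cn, det = det_cor, ncorhigh = ncorhigh))
print(weight_var)
```

A determinant close to zero, a large condition number, or large VIF values all point to the same problem: at least one variable is nearly a linear combination of the others.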
Tiago Olivoto tiagoolivoto@gmail.com
Olivoto, T., V.Q. Souza, M. Nardino, I.R. Carvalho, M. Ferrari, A.J. Pelegrin, V.J. Szareski, and D. Schmidt. 2017. Multicollinearity in path analysis: a simple method to reduce its effects. Agron. J. 109:131-142. doi:10.2134/agronj2016.04.0196
# Using the correlation matrix
library(metan)

cor_iris <- cor(iris[, 1:4])
n <- nrow(iris)

col_diag <- colindiag(cor_iris, n = n)

# Using a data frame
col_diag_gen <- data_ge2 %>%
  group_by(GEN) %>%
  colindiag()

# Diagnostic by levels of a factor
# For variables with "N" in variable name
col_diag_gen <- data_ge2 %>%
  group_by(GEN) %>%
  colindiag(contains("N"))