cor_cramer_v | R Documentation |
Computes bias-corrected Cramer's V (extension of the chi-squared test), a measure of association between two categorical variables. Results are in the range 0-1, where 0 indicates no association, and 1 indicates a perfect association.
In essence, Cramer's V assesses the co-occurrence of the categories of two variables to quantify how strongly these variables are related.
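To make the computation concrete, here is a minimal base R sketch of one widely used bias correction for Cramer's V (the one proposed by Bergsma in 2013). It is an illustration only: the helper name cramer_v_sketch is hypothetical, and it is an assumption, not something stated on this page, that cor_cramer_v() applies exactly this correction.

```r
# Hypothetical helper for illustration; NOT the implementation of cor_cramer_v().
# Assumes x and y are character vectors of equal length with no missing values.
cramer_v_sketch <- function(x, y) {
  stopifnot(length(x) == length(y))

  # contingency table of co-occurrences between the categories of x and y
  tab <- table(x, y)
  n <- sum(tab)
  r <- nrow(tab)
  k <- ncol(tab)

  # expected counts under independence, and the chi-squared statistic
  expected <- outer(rowSums(tab), colSums(tab)) / n
  chi2 <- sum((tab - expected)^2 / expected)

  # bias-corrected phi-squared and bias-corrected table dimensions
  phi2 <- chi2 / n
  phi2_corrected <- max(0, phi2 - (k - 1) * (r - 1) / (n - 1))
  k_corrected <- k - (k - 1)^2 / (n - 1)
  r_corrected <- r - (r - 1)^2 / (n - 1)
  denominator <- min(k_corrected - 1, r_corrected - 1)

  # a variable with a single category carries no association information
  if (denominator <= 0) {
    return(0)
  }

  sqrt(phi2_corrected / denominator)
}

# toy data: y_same repeats x exactly, y_indep is balanced across x
x <- rep(c("a", "b"), each = 50)
y_same <- x
y_indep <- rep(c("u", "v"), times = 50)

cramer_v_sketch(x, y_same)
cramer_v_sketch(x, y_indep)
```

In this toy data the first call describes a perfect association and the second one no association, so the results should sit at the two ends of the 0-1 range (up to floating-point error).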
Although Cramer's V also ranges from 0 to 1, its values are not directly comparable to R-squared values, so a multicollinearity analysis that mixes both types of values must be interpreted with care. It is probably preferable to convert non-numeric variables to numeric with target encoding before running a multicollinearity analysis.
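As a rough illustration of what target encoding means, the base R sketch below replaces each category with the mean of a numeric response observed within that category. The helper name target_encode_sketch and the toy data are hypothetical, and the sketch assumes a numeric response variable is available; it is not the encoding routine of any particular package.

```r
# Hypothetical helper for illustration: mean-based target encoding.
# category: character vector; response: numeric vector of the same length.
target_encode_sketch <- function(category, response) {
  stopifnot(length(category) == length(response))
  # ave() returns, for every element, the mean of `response` within its category
  ave(response, category, FUN = mean)
}

# toy data (made up for this example)
soil <- c("clay", "clay", "sand", "sand")
resp <- c(1, 3, 10, 20)

encoded <- target_encode_sketch(soil, resp)
encoded
```

The encoded vector is numeric, so it can enter a multicollinearity analysis alongside the other numeric predictors.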
cor_cramer_v(x = NULL, y = NULL, check_input = TRUE)
x | (required; character vector) character vector representing a categorical variable. Default: NULL |
y | (required; character vector) character vector representing a categorical variable. Must have the same length as 'x'. Default: NULL |
check_input | (required; logical) If FALSE, disables data checking for a slightly faster execution. Default: TRUE |
numeric: Cramer's V
Blas M. Benito, PhD
Cramér, H. (1946). Mathematical Methods of Statistics. Princeton: Princeton University Press, page 282 (Chapter 21. The two-dimensional case). ISBN 0-691-08004-6
Other pairwise_correlation: cor_clusters(), cor_df(), cor_matrix(), cor_select()
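#loading the collinear package and its example data
library(collinear)
data(vi)

#subset to limit example run time
vi <- vi[1:1000, ]

#computing Cramer's V for two categorical predictors
v <- cor_cramer_v(
  x = vi$soil_type,
  y = vi$koppen_zone
)

#printing the result (a single numeric value between 0 and 1)
v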