| cor_cramer | R Documentation |
Cramer's V extends the chi-squared test to quantify how strongly the categories of two variables co-occur. The value ranges from 0 to 1, where 0 indicates no association and 1 indicates perfect association.
This function implements a bias-corrected version of Cramer's V, which adjusts for sample size and is more accurate for small samples. However, this bias correction means that, even for binary variables, Cramer's V will not equal the Pearson correlation (the standard, uncorrected Cramer's V does match the absolute value of the Pearson correlation for binary data).
As the number of categories increases, Cramer's V and Pearson correlation measure increasingly different aspects of association and should not be directly compared.
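The relationship between the uncorrected statistic, the bias-corrected one, and the Pearson correlation can be sketched in base R. The helper below, `cramer_v_sketch()`, is hypothetical and not part of the package. It assumes the correction proposed by Bergsma (2013), which may differ in detail from what `cor_cramer()` does internally.

```r
# Hypothetical helper, not part of the package: plain and bias-corrected
# Cramer's V computed from a contingency table in base R.
cramer_v_sketch <- function(x, y, correct = TRUE) {
  tab <- table(as.character(x), as.character(y))
  n <- sum(tab)
  r <- nrow(tab)
  k <- ncol(tab)

  # chi-squared statistic without continuity correction
  expected <- outer(rowSums(tab), colSums(tab)) / n
  chi2 <- sum((tab - expected)^2 / expected)
  phi2 <- chi2 / n

  if (!correct) {
    return(sqrt(phi2 / min(r - 1, k - 1)))
  }

  # assumed correction (Bergsma, 2013): shrink phi2 and the table dimensions
  phi2_c <- max(0, phi2 - ((r - 1) * (k - 1)) / (n - 1))
  r_c <- r - (r - 1)^2 / (n - 1)
  k_c <- k - (k - 1)^2 / (n - 1)
  sqrt(phi2_c / min(r_c - 1, k_c - 1))
}

# binary toy data
x <- c(0, 0, 0, 1, 1, 1, 1, 0)
y <- c(0, 0, 1, 1, 1, 0, 1, 0)

# uncorrected V and absolute Pearson correlation coincide (both 0.5 here)
v_plain <- cramer_v_sketch(x, y, correct = FALSE)
r_pearson <- abs(cor(x, y))

# the corrected value is smaller for this small sample
v_corrected <- cramer_v_sketch(x, y, correct = TRUE)
```

With only eight observations the correction is substantial, which illustrates why the corrected statistic and the Pearson correlation should not be expected to match.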
If you intend to combine these measures in a multicollinearity analysis, interpret them with care. It is often preferable to convert non-numeric variables to numeric form (for example, via target encoding) before assessing multicollinearity.
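As a rough illustration of the conversion mentioned above, the sketch below applies mean target encoding in base R. The data frame and its column names (`response`, `category`, `predictor`) are made up for the example, and mean encoding is only one of several possible encoding schemes.

```r
# toy data; all column names are hypothetical
df <- data.frame(
  response = c(10, 12, 20, 22, 30, 32),
  category = c("a", "a", "b", "b", "c", "c"),
  predictor = c(1.0, 1.5, 2.0, 2.5, 3.0, 3.5)
)

# mean target encoding: replace each category with the mean response of its group
df$category_encoded <- ave(df$response, df$category, FUN = mean)

# the encoded variable is numeric, so Pearson correlation applies directly
cor(df$category_encoded, df$predictor)
```

Once every variable is numeric, a single correlation measure can be used throughout the multicollinearity analysis.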
cor_cramer(x = NULL, y = NULL, check_input = TRUE, ...)
x |
(required; vector) Values of a categorical variable (character or factor). Converted to character if numeric or logical. Default: NULL |
y |
(required; vector) Values of a categorical variable (character or factor). Converted to character if numeric or logical. Default: NULL |
check_input |
(required; logical) If FALSE, disables input checking for slightly faster execution. Default: TRUE |
... |
(optional) Internal args (e.g. |
numeric: Cramer's V
Blas M. Benito, PhD
Cramér, H. (1946). Mathematical Methods of Statistics. Princeton: Princeton University Press, page 282 (Chapter 21. The two-dimensional case). ISBN 0-691-08004-6
Other multicollinearity_assessment:
collinear_stats(),
cor_clusters(),
cor_df(),
cor_matrix(),
cor_stats(),
vif(),
vif_df(),
vif_stats()
# perfect one-to-one association
cor_cramer(
x = c("a", "a", "b", "c"),
y = c("a", "a", "b", "c")
)
# still perfect: labels differ but mapping is unique
cor_cramer(
x = c("a", "a", "b", "c"),
y = c("a", "a", "b", "d")
)
# high but < 1: mostly aligned, one category of y repeats
cor_cramer(
x = c("a", "a", "b", "c"),
y = c("a", "a", "b", "b")
)
# appears similar by position, but no association by distribution
# (x = "a" mixes with y = "a" and "b")
cor_cramer(
x = c("a", "a", "a", "c"),
y = c("a", "a", "b", "b")
)
# numeric inputs are coerced to character internally
cor_cramer(
x = c(1, 1, 2, 3),
y = c(1, 1, 2, 2)
)
# logical inputs are also coerced to character
cor_cramer(
x = c(TRUE, TRUE, FALSE, FALSE),
y = c(TRUE, TRUE, FALSE, FALSE)
)