View source: R/model_selection.R
assoc_matrix | R Documentation |
Creates an association matrix based on pairwise measures of association between variables.
assoc_matrix(
  data,
  var_names = NULL,
  factor_vars = NULL,
  method = c("pearson", "spearman", "kendall", "eta_squared", "cramers_v"),
  use = c("pairwise.complete.obs", "everything", "all.obs", "complete.obs", "na.or.complete"),
  bias_correction = F
)
data |
A data frame with columns from which to retrieve variables to compute associations. |
var_names |
A vector of variable names from the columns of data. |
factor_vars |
A vector that includes the names of variables that should be converted to
factors. Must be in |
method |
The type of association to calculate. One of "pearson", "spearman", "kendall", "eta_squared", or "cramers_v". |
use |
A string giving a method for computing covariances in the presence of missing
values. This must be one of the strings "pairwise.complete.obs", "everything", "all.obs", "complete.obs", or "na.or.complete". |
bias_correction |
A logical value indicating whether bias correction for Cramer's V should be
applied. Only relevant when method = "cramers_v". |
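The values accepted by use match those of base R's stats::cor(). As a minimal sketch, assuming (this page does not confirm it) that the correlation methods handle missing values the way stats::cor() does, the following shows how the choice of use changes which rows enter each pairwise calculation. The data frame df is made up for illustration.

```r
# Hypothetical toy data with one missing value in each of two columns
df <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 4, 5, NA, 10),
  c = c(5, 3, 4, 1, 2)
)

# Pairwise deletion: each pair uses every row where both variables are observed
r_pairwise <- cor(df, method = "spearman", use = "pairwise.complete.obs")

# Listwise deletion: only rows with no missing value in any column are used
r_complete <- cor(df, method = "spearman", use = "complete.obs")

# "everything" propagates NA to any pair that involves a column with missing values
r_everything <- cor(df, method = "pearson", use = "everything")

print(r_pairwise)
print(r_complete)
print(r_everything)
```

Pairwise deletion keeps more data per pair, but each entry of the resulting matrix may be based on a different subset of rows.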
Calculates pairwise associations for a set of variables. Depending on the measure
of association specified, variables are excluded when their type
(e.g., nominal or continuous) is not appropriate for the calculation. Cramer's V
is calculated only between two nominal variables. Eta-squared is applied only to
nominal-continuous variable pairs. Pearson, Spearman, and Kendall correlations exclude nominal
variables with more than two values. Variable exclusions are based on the variable types as defined in
data, so these should be verified before calling the function. Categorical variables with values
coded as integers can be mistakenly treated as continuous variables. Nominal variables should be
of class factor (not character). Numeric variables should be of class integer or numeric.
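Because exclusions depend on column classes, it can help to inspect and fix them first. The following is a minimal base R sketch; the data frame df and its column names are hypothetical. The factor_vars argument described above is documented as performing the same kind of conversion.

```r
# Hypothetical data: `group` is a category coded as integers, `site` is character
df <- data.frame(
  group = c(1L, 2L, 3L, 1L, 2L, 3L),
  site  = c("north", "south", "north", "south", "north", "south"),
  score = c(4.1, 5.3, 6.8, 3.9, 5.0, 7.2),
  stringsAsFactors = FALSE
)

# Inspect the class of every column before computing associations
classes_before <- vapply(df, function(x) class(x)[1], character(1))
print(classes_before)

# Convert integer-coded and character categorical columns to factors
nominal_cols <- c("group", "site")
df[nominal_cols] <- lapply(df[nominal_cols], factor)

# Confirm that nominal columns are now factors and `score` is still numeric
classes_after <- vapply(df, function(x) class(x)[1], character(1))
print(classes_after)
```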
For Cramer's V calculations, the strength of association can be inflated when nominal variables have many categories or when the sample size is small. A bias correction can be applied as detailed in Bergsma (2013).
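For illustration, the correction in Bergsma (2013) can be sketched in base R as follows. The helper cramers_v_sketch is hypothetical and is not taken from this package's source, so its results may not match those of assoc_matrix exactly.

```r
# Hypothetical helper: Cramer's V with the optional Bergsma (2013) correction.
# This is an illustrative sketch, not the package's implementation.
cramers_v_sketch <- function(x, y, bias_correction = FALSE) {
  # factor() drops unused levels so the table has no empty rows or columns
  tab <- table(factor(x), factor(y))
  n <- sum(tab)
  r <- nrow(tab)
  k <- ncol(tab)
  # Pearson chi-squared statistic without continuity correction
  chi2 <- unname(suppressWarnings(chisq.test(tab, correct = FALSE)$statistic))
  phi2 <- chi2 / n
  if (!bias_correction) {
    return(sqrt(phi2 / min(r - 1, k - 1)))
  }
  # Bias-corrected phi-squared, truncated at zero
  phi2_adj <- max(0, phi2 - (r - 1) * (k - 1) / (n - 1))
  # Bias-corrected numbers of rows and columns
  r_adj <- r - (r - 1)^2 / (n - 1)
  k_adj <- k - (k - 1)^2 / (n - 1)
  sqrt(phi2_adj / min(r_adj - 1, k_adj - 1))
}

# Example with the built-in mtcars data, treating cyl and gear as nominal
v_raw <- cramers_v_sketch(mtcars$cyl, mtcars$gear)
v_adj <- cramers_v_sketch(mtcars$cyl, mtcars$gear, bias_correction = TRUE)
print(c(raw = v_raw, corrected = v_adj))
```

The correction subtracts the value of phi-squared expected under independence, so it matters most for tables with many cells relative to the sample size.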
Bergsma, W. (2013). A bias-correction for Cramer's V and Tschuprow's T. Journal of the Korean Statistical Society, 42(3), 323-328.
A data frame with association metric values.