knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

As the name suggests, get_var_corr provides a convenient way to get variable correlations. It computes the correlation between one variable and all other variables in the data set.

Previously, one would set get_all to TRUE if they wanted to get correlations between all variables. This argument has been dropped in favor of simply supplying an optional other_vars vector if one does not want to get all correlations.

library(manymodelr)
# get all correlations with mpg (the default method is pearson)
corrs <- get_var_corr(mtcars, comparison_var = "mpg")
head(corrs)
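For intuition, the same kind of one-against-all result can be sketched with base R alone. This is only an illustration, not the manymodelr implementation; the column names comparison_var, other_var and correlation are borrowed from the plot_corr calls later in this document, and the exact layout returned by get_var_corr may differ.

```r
# Base R sketch (NOT the manymodelr implementation): correlate mpg
# with every other column of mtcars using stats::cor() (Pearson).
others <- setdiff(names(mtcars), "mpg")

base_corrs <- data.frame(
  comparison_var = "mpg",
  other_var = others,
  correlation = vapply(
    others,
    function(v) cor(mtcars$mpg, mtcars[[v]]),
    numeric(1)
  ),
  row.names = NULL
)

head(base_corrs)
```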

Previously, one would also set drop_columns to TRUE to drop factor columns. Now, the user simply provides a character vector specifying which column types (classes) should be dropped. It defaults to c("character","factor").

data("yields", package="manymodelr")
# purely demonstrative
get_var_corr(yields,"height",other_vars="weight",
             drop_columns=c("factor","character"),method="spearman",
             exact=FALSE)
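The method and exact arguments used above have the same names as arguments of base R's cor.test. Assuming they carry the same meaning here (this document does not confirm how they are handled internally), the following sketch shows what they do in cor.test itself, using mtcars so that it runs without the package.

```r
# Spearman rank correlation with base R's cor.test().
# exact = FALSE requests the asymptotic p-value, which avoids the
# "cannot compute exact p-value with ties" warning on tied data.
res <- cor.test(mtcars$mpg, mtcars$wt, method = "spearman", exact = FALSE)

res$estimate  # Spearman's rho
res$p.value   # asymptotic p-value
```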

Similarly, get_var_corr_ (note the underscore at the end) provides a convenient way to get combination-wise correlations.

head(get_var_corr_(yields),6)

To use only a subset of the data, we can provide a list of columns to subset_cols. By default, the first vector in the list is mapped to comparison_var and the second to other_vars. The list is therefore of length 2.

head(get_var_corr_(mtcars,subset_cols=list(c("mpg","vs"),c("disp","wt")),
                   method="spearman",exact=FALSE))
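One plausible reading of the subset_cols mapping is that every column in the first vector is paired with every column in the second. The base R sketch below illustrates that reading only; it is an assumption about the pairing, not the package's code, and pair_grid is a name made up for this example.

```r
# Base R sketch (an assumed reading of subset_cols, not manymodelr code):
# pair each comparison variable with each "other" variable.
comparison_vars <- c("mpg", "vs")
other_vars <- c("disp", "wt")

pair_grid <- expand.grid(
  comparison_var = comparison_vars,
  other_var = other_vars,
  stringsAsFactors = FALSE
)

# Spearman correlation for each pair
pair_grid$correlation <- mapply(
  function(x, y) cor(mtcars[[x]], mtcars[[y]], method = "spearman"),
  pair_grid$comparison_var,
  pair_grid$other_var,
  USE.NAMES = FALSE
)

pair_grid
```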

Correlations are often easier to interpret when visualized, and plot_corr aims to do just that. There are currently two plot styles, squares and circles. The circles style has a shape argument that allows for more flexibility. Note that the correlation matrix supplied to this function is an object produced by get_var_corr_.

To modify the plot a bit, we can choose to switch the x and y values as shown below.

plot_corr(mtcars, show_which = "corr",
          round_which = "correlation", decimals = 2,
          x = "other_var", y = "comparison_var", plot_style = "squares",
          width = 1.1, custom_cols = c("green", "blue", "red"),
          colour_by = "correlation")

To show the significance of the results instead of the correlations themselves, we can set show_which to "signif" as shown below. By default, the significance cutoff is 0.05. You can override this by supplying a different signif_cutoff.

# colour by p value
# change the colours by supplying custom_cols
# the significance cutoff is left at its default
set.seed(233)
plot_corr(mtcars, x = "other_var", y = "comparison_var",
          plot_style = "circles", show_which = "signif",
          colour_by = "p.value", custom_cols = sample(colours(), 3))
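To make the idea of a significance cutoff concrete, here is a small base R sketch that flags which correlations with mpg fall below a chosen cutoff. It does not reproduce what plot_corr draws; signif_tbl and the significant column are names invented for this example.

```r
# Base R sketch: flag correlations with mpg whose p-value is below a cutoff.
signif_cutoff <- 0.05
others <- setdiff(names(mtcars), "mpg")

# p-value of the (default Pearson) correlation test for each variable
p_values <- vapply(
  others,
  function(v) cor.test(mtcars$mpg, mtcars[[v]])$p.value,
  numeric(1)
)

signif_tbl <- data.frame(
  other_var = others,
  p.value = p_values,
  significant = p_values < signif_cutoff,
  row.names = NULL
)

signif_tbl
```

Changing signif_cutoff to a stricter value such as 0.01 only changes the significant column; the p-values themselves stay the same.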

To explore more options, please take a look at the documentation.




manymodelr documentation built on April 4, 2025, 12:01 a.m.