cor_clusters | R Documentation |
Hierarchical clustering of predictors from their pairwise correlation matrix. Computes the correlation matrix with cor_df() and cor_matrix(), transforms it to a dist object, computes a clustering solution with stats::hclust(), and applies stats::cutree() to separate groups based on the value of the argument max_cor.
Returns a data frame with predictor names and their clusters, and optionally plots a dendrogram of the clustering solution.
Accepts a parallelization setup via future::plan() and a progress bar via progressr::handlers() (see examples).
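The pipeline described above can be sketched in base R. This is a minimal illustration, not the function's actual source: the toy data frame, the use of 1 - |r| as the dissimilarity, and the cut height of 1 - max_cor are assumptions made here for clarity, and the output column names (predictor, cluster) are hypothetical.

```r
# Toy data (hypothetical): x2 is nearly a copy of x1, x3 is independent noise.
set.seed(1)
n <- 200
x1 <- rnorm(n)
toy <- data.frame(
  x1 = x1,
  x2 = x1 + rnorm(n, sd = 0.1),
  x3 = rnorm(n)
)

max_cor <- 0.75

# 1. Pairwise correlation matrix, in absolute value so that strong
#    negative correlations also count as redundancy (assumption).
m <- abs(stats::cor(toy, method = "pearson"))

# 2. Convert similarity to dissimilarity and coerce to a dist object.
d <- stats::as.dist(1 - m)

# 3. Hierarchical clustering of the predictors.
hc <- stats::hclust(d, method = "complete")

# 4. Cut the tree at the height that corresponds to max_cor (assumption).
cl <- stats::cutree(hc, h = 1 - max_cor)

# Assemble a data frame of predictor names and cluster ids
# (column names are hypothetical).
clusters <- data.frame(
  predictor = names(cl),
  cluster = unname(cl)
)
```

With complete linkage, cutting at 1 - max_cor means every pair of predictors within a cluster has an absolute correlation of at least max_cor, so in this sketch x1 and x2 share a cluster while x3 sits alone. Other linkage methods do not give that pairwise guarantee.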
cor_clusters(
  df = NULL,
  predictors = NULL,
  max_cor = 0.75,
  method = "complete",
  plot = FALSE
)
df |
(required; data frame, tibble, or sf) A data frame with responses and predictors. Default: NULL. |
predictors |
(optional; character vector) Names of the predictors to select from |
max_cor |
(optional; numeric) Maximum correlation allowed between any pair of variables in |
method |
(optional; character string) Argument of |
plot |
(optional, logical) If TRUE, the clustering is plotted. Default: FALSE |
data frame: predictor names and their clusters
Other pairwise_correlation: cor_cramer_v(), cor_df(), cor_matrix(), cor_select()
# parallelization setup
future::plan(
  future::multisession,
  workers = 2 # set to parallelly::availableCores() - 1
)

# progress bar
# progressr::handlers(global = TRUE)

df_clusters <- cor_clusters(
  df = vi[1:1000, ],
  predictors = vi_predictors[1:15]
)

# disable parallelization
future::plan(future::sequential)