details_k_means_clustMixType: K-means via clustMixType
In tidyclust: A Common API to Clustering

details_k_means_clustMixType

R Documentation

K-means via clustMixType

Description

k_means() creates K-prototypes model. A K-prototypes is the middle ground between a K-means and K-modes model, in the sense that it can be used with data that contains both numeric and categorical predictors.

Details

Both numeric and categorical predictors are requires for this engine.

For this engine, there is a single mode: partition

Tuning Parameters

This model has 1 tuning parameters:

num_clusters: # Clusters (type: integer, default: no default)

Translation from tidyclust to the original package (partition)

k_means(num_clusters = integer(1)) %>% 
  set_engine("clustMixType") %>% 
  set_mode("partition") %>% 
  translate_tidyclust()

## K Means Cluster Specification (partition)
## 
## Main Arguments:
##   num_clusters = integer(1)
## 
## Computational engine: clustMixType 
## 
## Model fit template:
## tidyclust::.k_means_fit_clustMixType(x = missing_arg(), k = missing_arg(), 
##     keep.data = missing_arg(), k = integer(1), keep.data = TRUE, 
##     verbose = FALSE)

Preprocessing requirements

Both categorical and numeric predictors are required.

References

Szepannek, G. (2018): clustMixType: User-Friendly Clustering of Mixed-Type Data in R, The R Journal 10/2, 200-208, doi:10.32614/RJ-2018-048.
Aschenbruck, R., Szepannek, G., Wilhelm, A. (2022): Imputation Strategies for Clustering Mixed‑Type Data with Missing Values, Journal of Classification, doi:10.1007/s00357-022-09422-y.
Z.Huang (1998): Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Variables, Data Mining and Knowledge Discovery 2, 283-304.