details_k_means_stats | R Documentation |
k_means()
creates K-means model. This engine uses the classical definition
of a K-means model, which only takes numeric predictors.
For this engine, there is a single mode: partition
This model has 1 tuning parameters:
num_clusters
: # Clusters (type: integer, default: no default)
k_means(num_clusters = integer(1)) %>% set_engine("stats") %>% set_mode("partition") %>% translate_tidyclust()
## K Means Cluster Specification (partition) ## ## Main Arguments: ## num_clusters = integer(1) ## ## Computational engine: stats ## ## Model fit template: ## tidyclust::.k_means_fit_stats(x = missing_arg(), centers = missing_arg(), ## centers = integer(1))
Factor/categorical predictors need to be converted to numeric values
(e.g., dummy or indicator variables) for this engine. When using the
formula method via fit()
, tidyclust
will convert factor columns to indicators.
Predictors should have the same scale. One way to achieve this is to center and scale each so that each predictor has mean zero and a variance of one.
Forgy, E. W. (1965). Cluster analysis of multivariate data: efficiency vs interpretability of classifications. Biometrics, 21, 768–769.
Hartigan, J. A. and Wong, M. A. (1979). Algorithm AS 136: A K-means clustering algorithm. Applied Statistics, 28, 100–108. doi:10.2307/2346830.
Lloyd, S. P. (1957, 1982). Least squares quantization in PCM. Technical Note, Bell Laboratories. Published in 1982 in IEEE Transactions on Information Theory, 28, 128–137.
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, eds L. M. Le Cam & J. Neyman, 1, pp. 281–297. Berkeley, CA: University of California Press.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.