View source: R/kmeans_procedure.R
kmeans_procedure | R Documentation |
This function performs k-means clustering with constraints on the size of the clusters.
kmeans_procedure( data, columns, threshold_min, threshold_max, verbose = FALSE, seed = 42 )
data |
an R data frame. |
columns |
a vector of column names of the data frame on which the k-means algorithm is run. These features must be numeric. |
threshold_min |
an integer. The minimum size of a cluster. |
threshold_max |
an integer. The maximum size of a cluster. |
verbose |
a logical. If TRUE, prints the current state of the procedure (default: FALSE). |
seed |
an integer. The seed for the random number generator, set so that the output is reproducible. |
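The requirements listed for the arguments can be checked before calling the function. The following is a minimal sketch in base R; `check_kmeans_inputs` is a hypothetical helper, not part of the package, and the rule that `threshold_min` should not exceed `threshold_max` is an assumption rather than something stated in this page.

```r
# Hypothetical helper (not part of helda): validates inputs against the
# requirements listed in the argument descriptions.
check_kmeans_inputs <- function(data, columns, threshold_min, threshold_max) {
  stopifnot(is.data.frame(data))

  # All requested columns must exist in the data frame
  missing_cols <- setdiff(columns, colnames(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
  }

  # The clustering features have to be numeric
  non_numeric <- columns[!vapply(data[columns], is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    stop("Non-numeric columns: ", paste(non_numeric, collapse = ", "))
  }

  # Assumption: the minimum size should not exceed the maximum size
  if (threshold_min > threshold_max) {
    stop("threshold_min must not exceed threshold_max")
  }

  invisible(TRUE)
}

# Passes silently: both columns exist and are numeric
check_kmeans_inputs(iris, c("Sepal.Length", "Sepal.Width"), 2, 10)
```

Passing a factor column such as `Species` would stop with an error, since factors are not numeric.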
an R data frame. It contains the id of the original data frame and a column 'cluster' giving the cluster to which each observation belongs.
Simon CORDE
Link to the author's GitHub package repository: https://github.com/Redcart/helda
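library(dplyr)
library(helda)

# Keep only the four numeric features of iris
data <- iris %>% select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
features <- colnames(data)

# Cluster with every cluster size constrained to lie between 2 and 10
result <- kmeans_procedure(data = data, columns = features, threshold_min = 2, threshold_max = 10, verbose = FALSE, seed = 10)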
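Since the returned data frame carries a 'cluster' column, the size constraint can be checked by tabulating that column. The sketch below is illustrative only: `cluster_size_report` is a hypothetical helper, not part of the package, and the `stand_in` data frame is built from plain `stats::kmeans` labels so that the code runs without the package installed. It is not the output of `kmeans_procedure`.

```r
# Hypothetical helper (not part of helda): reports the size of each cluster
# and whether every size lies within [threshold_min, threshold_max].
cluster_size_report <- function(result, threshold_min, threshold_max) {
  sizes <- table(result$cluster)
  list(
    sizes = sizes,
    within_bounds = all(sizes >= threshold_min & sizes <= threshold_max)
  )
}

# Stand-in for the function's output: unconstrained stats::kmeans labels on
# iris, used only so that this sketch is self-contained.
set.seed(10)
features <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]
stand_in <- data.frame(
  id = seq_len(nrow(features)),
  cluster = kmeans(features, centers = 3)$cluster
)

report <- cluster_size_report(stand_in, threshold_min = 2, threshold_max = 10)
print(report$sizes)
print(report$within_bounds)
```

For the stand-in, `within_bounds` is FALSE: three unconstrained clusters over 150 rows cannot all hold 10 observations or fewer. Applied to the data frame returned by `kmeans_procedure`, the same call would show whether the requested thresholds were met.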