Creates a specification of a recipe step that will partition numeric variables according to k-medoids clustering and select the cluster medoids.
step_kmedoids(
recipe,
...,
k = 5,
center = TRUE,
scale = TRUE,
method = c("pam", "clara"),
metric = "euclidean",
optimize = FALSE,
num_samp = 50,
samp_size = 40 + 2 * k,
replace = TRUE,
prefix = "KMedoids",
role = "predictor",
skip = FALSE,
id = recipes::rand_id("kmedoids")
)
tunable.step_kmedoids(x, ...)

recipe 
recipe object to which the step will be added. 
... 
one or more selector functions to choose which variables will be
used to compute the components. See 
k 
number of k-medoids clusterings of the variables. The value of

center, scale 
logicals indicating whether to mean center and median absolute deviation scale the original variables prior to cluster partitioning, or functions or names of functions for the centering and scaling; not applied to selected variables. 
method 
character string specifying one of the clustering methods
provided by the cluster package. The 
metric 
character string specifying the distance metric for calculating
dissimilarities between observations as 
optimize 
logical indicator or 0:5 integer level specifying
optimization for the 
num_samp 
number of subdatasets to sample for the

samp_size 
number of cases to include in each subdataset. 
replace 
logical indicating whether to replace the original variables. 
prefix 
if the original variables are not replaced, the selected variables are added to the dataset with the character string prefix added to their names; otherwise, the original variable names are retained. 
role 
analysis role that added step variables should be assigned. By default, they are designated as model predictors. 
skip 
logical indicating whether to skip the step when the recipe is
baked. While all operations are baked when 
id 
unique character string to identify the step. 
x 

K-medoids clustering partitions variables into k groups such that the dissimilarity between the variables and their assigned cluster medoids is minimized. Cluster medoids are then returned as a set of k variables.
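The selection described above can be sketched directly with pam from the cluster package. This is an illustration under stated assumptions, not the step's actual implementation: it assumes that variables are clustered by transposing the data so that each variable becomes a row, and that centering and scaling use column means and median absolute deviations as the center and scale arguments describe. The object names (x, x_std, fit, medoid_vars, x_selected) are hypothetical.

```r
library(cluster)  # provides pam(); a recommended package distributed with R

# Predictors from the attitude data (rating is treated as the outcome)
x <- as.matrix(attitude[, names(attitude) != "rating"])

# Assumed preprocessing: mean center and scale by median absolute deviation
x_std <- scale(x, center = colMeans(x), scale = apply(x, 2, mad))

# Cluster the variables rather than the cases: transpose so that
# each row of the input to pam() is one variable
fit <- pam(t(x_std), k = 3, metric = "euclidean")

# Cluster membership of each variable
fit$clustering

# Names of the medoid variables, one per cluster
medoid_vars <- colnames(x)[fit$id.med]

# Keep only the medoid variables, taken from the unscaled data
x_selected <- x[, medoid_vars, drop = FALSE]
head(x_selected)
```

Taking the selected columns from the unscaled matrix mirrors the statement that centering and scaling are not applied to the selected variables.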
Function step_kmedoids creates a new step whose class is of the same name and inherits from step_sbf, adds it to the sequence of existing steps (if any) in the recipe, and returns the updated recipe. For the tidy method, a tibble with columns terms (selectors or variables selected), cluster assignments, selected (logical indicator of selected cluster medoids), silhouette (silhouette values), and name of the selected variable names.
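To make the per-variable columns of the tidy output more concrete, the following sketch assembles a comparable table by hand with pam and silhouette from the cluster package. It is an approximation for illustration only: the standardization via scale(), the data frame layout, and the name summary_tbl are assumptions, and the values need not match what the tidy method returns.

```r
library(cluster)  # provides pam() and silhouette()

# Variables as rows, standardized for illustration (an assumption)
x <- as.matrix(attitude[, names(attitude) != "rating"])
vars <- t(scale(x))

# Partition the variables into 3 clusters around medoids
fit <- pam(vars, k = 3)

# Silhouette values for each variable, in the same order as the rows of vars
sil <- silhouette(fit$clustering, dist(vars))

# One row per variable: its cluster, whether it is a medoid, and its silhouette
summary_tbl <- data.frame(
  terms = rownames(vars),
  cluster = fit$clustering,
  selected = seq_len(nrow(vars)) %in% fit$id.med,
  silhouette = sil[, "sil_width"],
  row.names = NULL
)
summary_tbl
```

Rows with selected equal to TRUE correspond to the cluster medoids, so exactly k rows are flagged.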
Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis. Wiley.
Reynolds, A., Richards, G., de la Iglesia, B., & Rayward-Smith, V. (1992). Clustering rules: A comparison of partitioning and hierarchical clustering algorithms. Journal of Mathematical Modelling and Algorithms, 5, 475-504.
pam, clara, recipe, prep, bake
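library(recipes)
library(MachineShop)  # assumed: the package that provides step_kmedoids()

# Recipe with rating as the outcome and all other variables as predictors
rec <- recipe(rating ~ ., data = attitude)
# Add a step that keeps the medoids of 3 clusters of predictors
kmedoids_rec <- rec %>%
  step_kmedoids(all_predictors(), k = 3)
# Estimate the step on the training data, then apply it
kmedoids_prep <- prep(kmedoids_rec, training = attitude)
kmedoids_data <- bake(kmedoids_prep, attitude)

pairs(kmedoids_data, lower.panel = NULL)

# Summarize the step before and after it is prepped
tidy(kmedoids_rec, number = 1)
tidy(kmedoids_prep, number = 1)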

