View source: R/step_kmedoids.R
step_kmedoids | R Documentation |
Creates a specification of a recipe step that will partition numeric variables according to k-medoids clustering and select the cluster medoids.
step_kmedoids(
  recipe,
  ...,
  k = 5,
  center = TRUE,
  scale = TRUE,
  method = c("pam", "clara"),
  metric = "euclidean",
  optimize = FALSE,
  num_samp = 50,
  samp_size = 40 + 2 * k,
  replace = TRUE,
  prefix = "KMedoids",
  role = "predictor",
  skip = FALSE,
  id = recipes::rand_id("kmedoids")
)
## S3 method for class 'step_kmedoids'
tunable(x, ...)
recipe |
recipe object to which the step will be added. |
... |
one or more selector functions to choose which variables will be
used to compute the components. See |
k |
number of k-medoids clusterings of the variables. The value of
|
center , scale |
logicals indicating whether to mean center and median absolute deviation scale the original variables prior to cluster partitioning, or functions or names of functions for the centering and scaling; not applied to selected variables. |
method |
character string specifying one of the clustering methods
provided by the cluster package. The |
metric |
character string specifying the distance metric for calculating
dissimilarities between observations as |
optimize |
logical indicator or 0:5 integer level specifying
optimization for the |
num_samp |
number of sub-datasets to sample for the
|
samp_size |
number of cases to include in each sub-dataset. |
replace |
logical indicating whether to replace the original variables. |
prefix |
if the original variables are not replaced, the selected variables are added to the dataset with the character string prefix added to their names; otherwise, the original variable names are retained. |
role |
analysis role that added step variables should be assigned. By default, they are designated as model predictors. |
skip |
logical indicating whether to skip the step when the recipe is
baked. While all operations are baked when |
id |
unique character string to identify the step. |
x |
step_kmedoids object. |
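The sampling arguments can be illustrated with a minimal sketch that calls cluster::clara directly. It assumes that num_samp and samp_size correspond to clara's samples and sampsize arguments; that mapping is an assumption and is not confirmed by this page. The simulated data and the names x, tx, and medoid_vars are hypothetical.

```r
library(cluster)

set.seed(123)
# Hypothetical data: 200 observations of 100 numeric variables
x <- matrix(rnorm(200 * 100), nrow = 200,
            dimnames = list(NULL, paste0("V", 1:100)))

k <- 5
num_samp <- 50            # default value of num_samp in the usage above
samp_size <- 40 + 2 * k   # default value of samp_size; 50 when k = 5

# Variables are the units being clustered, so they are passed as rows
tx <- t(x)

# Assumption: num_samp -> samples, samp_size -> sampsize
fit <- clara(tx, k = k, metric = "euclidean",
             samples = num_samp, sampsize = samp_size)

# Names of the variables chosen as cluster medoids
medoid_vars <- rownames(tx)[fit$i.med]
medoid_vars
```

Note that clara requires the sample size to be no larger than the number of rows being clustered, so with few variables a smaller samp_size would be needed.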
K-medoids clustering partitions variables into k groups such that the dissimilarity between the variables and their assigned cluster medoids is minimized. Cluster medoids are then returned as a set of k variables.
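The idea can be sketched outside of a recipe with cluster::pam. This is an illustration under stated assumptions, not the step's actual implementation: variables are mean centered and scaled by their median absolute deviation, the data are transposed so that variables are clustered rather than observations, and the medoid variables are then taken from the original, unscaled data. The object names are hypothetical.

```r
library(cluster)

# Predictors from the attitude dataset (outcome "rating" excluded)
x <- attitude[, names(attitude) != "rating"]

# Mean center and MAD scale each variable prior to clustering
x_scaled <- scale(x, center = colMeans(x), scale = apply(x, 2, mad))

# Transpose so that each row is a variable, then partition into k = 3 groups
tx <- t(x_scaled)
fit <- pam(tx, k = 3, metric = "euclidean")

# Cluster assignment of each variable
clusters <- fit$clustering

# The medoid of each cluster is itself one of the original variables
medoid_vars <- rownames(tx)[fit$id.med]

# Keep only the medoid variables, on their original scale
x_selected <- x[, medoid_vars, drop = FALSE]
head(x_selected)
```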
Function step_kmedoids creates a new step whose class is of the same name and inherits from step_sbf, adds it to the sequence of existing steps (if any) in the recipe, and returns the updated recipe. The tidy method returns a tibble with columns terms (selectors or variables selected), cluster (cluster assignments), selected (logical indicator of selected cluster medoids), silhouette (silhouette values), and name (names of the selected variables).
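To give a sense of what such a summary contains, the following sketch builds a comparable table by hand with the cluster package. It is not the output of the tidy method; the data frame summary_tbl and its construction are hypothetical, and the name column is omitted because it depends on the step's prefix and replace settings.

```r
library(cluster)

# Predictors from the attitude dataset, centered, MAD scaled, and transposed
x <- attitude[, names(attitude) != "rating"]
tx <- t(scale(x, center = colMeans(x), scale = apply(x, 2, mad)))

fit <- pam(tx, k = 3, metric = "euclidean")

# Silhouette widths of the variables, in the same order as the rows of tx
sil <- silhouette(fit$clustering, dist(tx))

summary_tbl <- data.frame(
  terms = rownames(tx),                            # variable names
  cluster = as.integer(fit$clustering),            # cluster assignments
  selected = seq_len(nrow(tx)) %in% fit$id.med,    # TRUE for cluster medoids
  silhouette = sil[, "sil_width"],                 # silhouette values
  row.names = NULL
)
summary_tbl
```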
Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis. Wiley.
Reynolds, A., Richards, G., de la Iglesia, B., & Rayward-Smith, V. (2006). Clustering rules: A comparison of partitioning and hierarchical clustering algorithms. Journal of Mathematical Modelling and Algorithms, 5, 475-504.
pam, clara, recipe, prep, bake
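library(recipes)
library(MachineShop)  # provides step_kmedoids()

rec <- recipe(rating ~ ., data = attitude)

# Cluster the predictors into 3 groups and keep the cluster medoids
kmedoids_rec <- rec %>%
  step_kmedoids(all_predictors(), k = 3)
kmedoids_prep <- prep(kmedoids_rec, training = attitude)
kmedoids_data <- bake(kmedoids_prep, attitude)

pairs(kmedoids_data, lower.panel = NULL)

# Step summaries before and after the recipe is prepped
tidy(kmedoids_rec, number = 1)
tidy(kmedoids_prep, number = 1)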