View source: R/k-means_clustering.R
kmeans_clustering | R Documentation |
Perform k-means clustering on Cell-ID data.
kmeans_clustering( x, k = 10, max_iter = 100, resume = FALSE, label_col = "k", var_cats = NULL, custom_vars = NULL, plot_progress = F, return_list = F )
x |
cell.data object or a cell.data data.frame |
k |
either a non-negative integer setting the desired number of clusters, or a data.frame with |
max_iter |
The maximum number of iterations allowed. |
resume |
logical. If |
label_col |
optional string specifying the column containing pre-defined clusters used when |
var_cats |
optional character vector specifying whether pre-defined sets of morphological ( |
custom_vars |
optional character vector specifying custom variables to be included for clustering. These are added to any variable sets specified by |
K-means clusters data by assigning each row to the nearest cluster based on its Euclidean distance to the center (centroid) of all clusters. After assigning all rows, centroid positions are updated by calculating the column means of all rows assigned to each cluster. Row assignment and centroid updates are performed iteratively until the algorithm converges, i.e., no rows are re-assigned after centroid positions have been updated.
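The iteration described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the function name `simple_kmeans` and its return fields are hypothetical, and it assumes the data contain only numeric columns and that starting centroids are passed in as a matrix.

```r
# Minimal Lloyd-style k-means sketch (illustrative only; not the package source).
# `data`: numeric matrix or data.frame; `centroids`: one starting centroid per row.
simple_kmeans <- function(data, centroids, max_iter = 100) {
  data <- as.matrix(data)
  centroids <- as.matrix(centroids)
  k <- nrow(centroids)
  assignment <- rep(NA_integer_, nrow(data))
  converged <- FALSE

  for (iter in seq_len(max_iter)) {
    # Squared Euclidean distance from every row to every centroid (n x k)
    d2 <- sapply(seq_len(k), function(j) {
      rowSums(sweep(data, 2, centroids[j, ], "-")^2)
    })
    d2 <- matrix(d2, nrow = nrow(data), ncol = k)

    # Assign each row to its nearest centroid
    new_assignment <- max.col(-d2, ties.method = "first")

    # Converged: no row changed cluster after the last centroid update
    if (identical(new_assignment, assignment)) {
      converged <- TRUE
      break
    }
    assignment <- new_assignment

    # Update each centroid to the column means of its assigned rows
    for (j in seq_len(k)) {
      members <- assignment == j
      if (any(members)) {
        centroids[j, ] <- colMeans(data[members, , drop = FALSE])
      }
    }
  }

  # Euclidean distance of each row to its own cluster centroid
  dist <- sqrt(rowSums((data - centroids[assignment, , drop = FALSE])^2))
  list(cluster = assignment, dist = dist,
       centroids = centroids, converged = converged)
}

# Toy data: two well-separated blobs of 20 points each
set.seed(1)
toy <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))

# Starting centroids picked by hand here; an unsupervised run would instead
# use something like toy[sample(nrow(toy), 2), ]
fit <- simple_kmeans(toy, toy[c(1, 21), ])
table(fit$cluster)
```

The sketch leaves a centroid where it is if its cluster becomes empty; how the package handles that case is not stated here.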
The number of clusters is defined by the parameter k, and clustering can be either completely unsupervised (k is a number only setting the desired number of clusters), or semi-supervised (k is a data.frame of ucid and t.frame pairs defining which rows/cells to choose as starting centroids). If unsupervised, starting centroids are chosen randomly by sampling k rows from data. Semi-supervised clustering can also be achieved by indicating a column of pre-defined labels assigned to a subset of rows, which will then be used to calculate the positions of the starting centroids.
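The two semi-supervised ways of obtaining starting centroids might look roughly as follows in base R. Both helper names are hypothetical, the toy column names `a.tot` and `fft.stat` are used only for illustration, and the sketch assumes that unlabelled rows carry NA in the label column, which the documentation does not confirm.

```r
# Hypothetical helper: use the rows matching given ucid/t.frame pairs
# directly as starting centroids.
centroids_from_pairs <- function(df, pairs, vars) {
  picked <- merge(pairs[c("ucid", "t.frame")], df, by = c("ucid", "t.frame"))
  as.matrix(picked[vars])
}

# Hypothetical helper: average the clustering variables within each
# pre-defined label (assumption: unlabelled rows are NA in label_col).
centroids_from_labels <- function(df, vars, label_col = "k") {
  labelled <- df[!is.na(df[[label_col]]), , drop = FALSE]
  agg <- aggregate(labelled[vars],
                   by = list(label = labelled[[label_col]]),
                   FUN = mean)
  centroids <- as.matrix(agg[vars])
  rownames(centroids) <- agg$label
  centroids
}

# Toy data: six cells, four of them pre-labelled
cells <- data.frame(
  ucid     = c(1, 2, 3, 4, 5, 6),
  t.frame  = 0,
  a.tot    = c(100, 110, 300, 310, 105, 305),
  fft.stat = c(0.1, 0.2, 0.8, 0.9, 0.15, 0.85),
  k        = c(1, 1, 2, 2, NA, NA)
)
vars <- c("a.tot", "fft.stat")

from_pairs  <- centroids_from_pairs(cells, data.frame(ucid = c(1, 3), t.frame = 0), vars)
from_labels <- centroids_from_labels(cells, vars)
```

Either matrix has one row per cluster and one column per clustering variable, so it could serve as the starting point of the assignment/update loop.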
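Note that this algorithm is not guaranteed to find the global optimum: it converges to a solution that depends on the starting centroids, so different starts can yield different clusterings.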
Depending on the data type provided by x, returns either a cell.data object or a cell.data data.frame with appended columns k and k.dist, indicating the assigned cluster and the Euclidean distance to the cluster centroid, respectively.