MbkmeansParam-class | R Documentation |
Run the mini-batch k-means mbkmeans function with the specified number of centers within clusterRows.
This sacrifices some accuracy for speed compared to the standard k-means algorithm.
Note that this requires installation of the mbkmeans package.
MbkmeansParam(
  centers,
  batch_size = NULL,
  max_iters = 100,
  num_init = 1,
  init_fraction = NULL,
  initializer = "kmeans++",
  calc_wcss = FALSE,
  early_stop_iter = 10,
  tol = 1e-04,
  BPPARAM = SerialParam()
)
## S4 method for signature 'ANY,MbkmeansParam'
clusterRows(x, BLUSPARAM, full = FALSE)
centers |
An integer scalar specifying the number of centers. Alternatively, a function that takes the number of observations and returns the number of centers. |
batch_size, max_iters, num_init, init_fraction, initializer, calc_wcss, early_stop_iter, tol, BPPARAM |
Further arguments to pass to mbkmeans. |
x |
A numeric matrix-like object where rows represent observations and columns represent variables. |
BLUSPARAM |
A MbkmeansParam object. |
full |
Logical scalar indicating whether the full mini-batch k-means statistics should be returned. |
This class usually requires the user to specify the number of clusters beforehand. However, we can also allow the number of clusters to vary as a function of the number of observations. The latter is occasionally useful, e.g., to allow the clustering to automatically become more granular for large datasets.
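As a minimal sketch of the second option, the number of centers can be supplied as a function of the number of observations. The rule used here (`centers_fun`, one center per `sqrt(n) / 4` observations) is a hypothetical choice for illustration only, not a recommendation from the package.

```r
library(bluster)

# Hypothetical rule: the number of centers grows with the square root
# of the number of observations.
centers_fun <- function(n) as.integer(ceiling(sqrt(n) / 4))

set.seed(100)  # mini-batch k-means is stochastic
x <- as.matrix(iris[, 1:4])

param <- MbkmeansParam(centers = centers_fun)
clusters <- clusterRows(x, param)

centers_fun(nrow(x))  # number of centers requested for this dataset
table(clusters)       # cluster sizes
```

The same `param` object can then be reused on datasets of different sizes, and the requested number of centers adjusts accordingly.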
To modify an existing MbkmeansParam object x, users can simply call x[[i]] or x[[i]] <- value where i is any argument used in the constructor.
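A short sketch of reading and updating parameters on an existing object. Integer literals (`50L`, `5L`) are used on the assumption that these parameters are stored as integers internally.

```r
library(bluster)

param <- MbkmeansParam(centers = 3)

# Read a parameter by the name of the constructor argument.
param[["max_iters"]]

# Update parameters in place; the integer literals assume integer storage.
param[["max_iters"]] <- 50L
param[["centers"]] <- 5L

param[["max_iters"]]
param[["centers"]]
```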
For batch_size and init_fraction, a value of NULL means that the default arguments in the mbkmeans function signature are used.
These defaults are data-dependent and so cannot be specified during construction of the MbkmeansParam object, but instead are defined within the clusterRows method.
The MbkmeansParam constructor will return a MbkmeansParam object with the specified parameters.
The clusterRows method will return a factor of length equal to nrow(x) containing the cluster assignments.
If full=TRUE, a list is returned with clusters (the factor, as above) and objects (a list containing mbkmeans, the direct output of mbkmeans).
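The following sketch shows how the return value with full=TRUE might be inspected. Only the elements named above (`clusters`, `objects`, `mbkmeans`) are accessed. The internal structure of the mbkmeans output is not described here, so it is examined with `str` rather than assumed.

```r
library(bluster)

set.seed(100)  # mini-batch k-means is stochastic
x <- as.matrix(iris[, 1:4])

out <- clusterRows(x, MbkmeansParam(centers = 3), full = TRUE)

names(out)            # elements of the returned list
head(out$clusters)    # factor of cluster assignments, one per row of x

# Direct output of mbkmeans; inspect its structure rather than assuming it.
str(out$objects$mbkmeans, max.level = 1)
```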
Stephanie Hicks
mbkmeans from the mbkmeans package, which actually does all the heavy lifting.
KmeansParam, for dispatch to the standard k-means algorithm.
clusterRows(iris[,1:4], MbkmeansParam(centers=3))
clusterRows(iris[,1:4], MbkmeansParam(centers=3, batch_size=10))
clusterRows(iris[,1:4], MbkmeansParam(centers=3, init_fraction=0.5))