initialize_clusters: Cluster Initialization using a Heuristic Method
In MixtureMissing: Robust and Flexible Model-Based Clustering for Data Sets with Missing Values at Random

Description

Initialize cluster memberships and component parameters to start the EM algorithm using a heuristic clustering method or user-defined labels.

Usage

initialize_clusters(
  X,
  G,
  init_method = c("kmedoids", "kmeans", "hierarchical", "mclust", "manual"),
  clusters = NULL
)

Arguments

`X`	An `n` x `d` matrix or data frame, where `n` is the number of observations and `d` is the number of variables (columns). Alternatively, `X` can be a vector of `n` observations.
`G`	The number of clusters, which must be at least 1. If `G = 1`, then user-defined `clusters` is ignored.
`init_method`	(optional) A string specifying the method used to initialize the EM algorithm. "kmedoids" clustering is the default. Alternatives are "kmeans", "hierarchical", "mclust", and "manual". When "manual" is chosen, a vector `clusters` of length `n` must be supplied. When `G = 1` and "kmedoids" clustering is used, the medoid is returned instead of the sample mean.
`clusters`	A numeric vector of length `n` giving the user-specified initial cluster memberships; used only when `init_method` is "manual". It is NULL by default and is ignored for all other initialization methods.
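
The difference between a medoid and a sample mean, mentioned under `init_method`, can be illustrated in base R. This is only a sketch: it assumes Euclidean distance and defines the medoid as the observation with the smallest total distance to all others, which may differ from the dissimilarity the package uses internally. The variable names are illustrative.

```r
# Sketch only: medoid vs. sample mean for a single cluster (G = 1).
# Assumption: Euclidean distance; the package may use a different dissimilarity.
X <- as.matrix(iris[1:4])

D <- as.matrix(stats::dist(X))        # n x n matrix of pairwise distances
medoid_idx <- which.min(rowSums(D))   # observation closest to all others in total
medoid <- X[medoid_idx, ]             # an actual row of the data
sample_mean <- colMeans(X)            # generally not an observed row

rbind(medoid = medoid, mean = sample_mean)
```

The medoid is always one of the observed rows, so it tends to be less affected by outliers than the sample mean.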

Details

Available heuristic methods include k-medoids clustering, k-means clustering, and hierarchical clustering. Alternatively, the user can supply pre-specified cluster memberships, which makes other initialization methods possible. If the data set contains missing values, only observations with complete records are used to initialize the clusters. In that case, unless G = 1, the returned cluster memberships are set to NULL, because they describe only the complete records rather than the full data set.
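
The complete-case behavior described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: it uses k-means as the heuristic, the helper name `sketch_init` is hypothetical, the covariance matrices are stored in a `d` x `d` x `G` array (a layout chosen for this sketch), and for `G = 1` every observation is simply assigned to cluster 1.

```r
# Hypothetical helper illustrating complete-case initialization with k-means.
# Not the MixtureMissing implementation.
sketch_init <- function(X, G) {
  X <- as.matrix(X)
  complete <- stats::complete.cases(X)
  Xc <- X[complete, , drop = FALSE]   # keep complete records only
  d <- ncol(Xc)

  if (G == 1) {
    labels <- rep(1L, nrow(Xc))
  } else {
    labels <- stats::kmeans(Xc, centers = G, nstart = 10)$cluster
  }

  # Mixing proportions, means, and covariances computed per cluster
  pi_hat <- as.numeric(table(factor(labels, levels = seq_len(G)))) / nrow(Xc)
  mu <- matrix(NA_real_, nrow = G, ncol = d)
  Sigma <- array(NA_real_, dim = c(d, d, G))
  for (g in seq_len(G)) {
    Xg <- Xc[labels == g, , drop = FALSE]
    mu[g, ] <- colMeans(Xg)
    Sigma[, , g] <- stats::cov(Xg)
  }

  # Labels describe the complete rows only, so they are dropped when rows
  # were removed, except when G = 1 (assumed: everything is in cluster 1).
  clusters <- if (all(complete)) {
    labels
  } else if (G == 1) {
    rep(1L, nrow(X))
  } else {
    NULL
  }

  list(pi = pi_hat, mu = mu, Sigma = Sigma, clusters = clusters)
}

set.seed(1234)
X_miss <- as.matrix(iris[1:4])
X_miss[c(3, 50, 120), 2] <- NA        # introduce a few missing values

init_sketch <- sketch_init(X_miss, G = 3)
is.null(init_sketch$clusters)         # memberships are dropped for incomplete data
```

With the complete `iris[1:4]` data the same helper returns a membership vector of length 150, mirroring the distinction drawn in the Value section.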

Value

A list with the following components:

`pi`	Component mixing proportions.
`mu`	A `G` by `d` matrix where each row is the component mean vector.
`Sigma`	An array containing `G` matrices, each a `d` by `d` component covariance matrix.
`clusters`	A numeric vector with values from 1 to `G` indicating initial cluster memberships if `X` is a complete data set or `G = 1`; NULL otherwise.

References

Everitt, B., Landau, S., Leese, M., and Stahl, D. (2011). Cluster Analysis. John Wiley & Sons.

Kaufman, L. and Rousseeuw, P. J. (2009). Finding Groups in Data: An Introduction to Cluster Analysis, volume 344. John Wiley & Sons.

Hartigan, J. A. and Wong, M. A. (1979). Algorithm AS 136: A K-means clustering algorithm. Applied Statistics, 28, 100-108. doi: 10.2307/2346830.

Examples


#++++ Initialization using a heuristic method ++++#

set.seed(1234)

init <- initialize_clusters(iris[1:4], G = 3)
init <- initialize_clusters(iris[1:4], G = 3, init_method = 'kmeans')
init <- initialize_clusters(iris[1:4], G = 3, init_method = 'hierarchical')

#++++ Initialization using user-defined labels ++++#

init <- initialize_clusters(iris[1:4], G = 3, init_method = 'manual',
                            clusters = as.numeric(iris$Species))

#++++ Initial parameters and pairwise scatterplot showing the mapping ++++#

init$pi
init$mu
init$Sigma
init$clusters

pairs(iris[1:4], col = init$clusters, pch = 16)

MixtureMissing documentation built on April 4, 2025, 3:38 a.m.
