kmeans: K-Means Clustering
In astamm/game: GAussian Mixture Embedding


This function performs k-means clustering of the data points in a data set.

kmeans(x, ...)

## Default S3 method:
kmeans(x, centers, iter.max = 10L, nstart = 1L,
  algorithm = c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen"),
  trace = FALSE)

## S3 method for class 'sgd'
kmeans(x, k = 2, iter.max = 50L)

## S3 method for class 'gmd'
kmeans(x, k = 2, iter.max = 50L, d2 = NULL,
  method = "ward.D", rule = 2, shift = FALSE,
  avoid_mean_computation = FALSE)

`x`	A numeric matrix where each row is a data point or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns), an `sgd` object or a `gmd` object.
`...`	not used.
`centers`	either the number of clusters, say k, or a set of initial (distinct) cluster centres. If a number, a random set of (distinct) rows in `x` is chosen as the initial centres.
`iter.max`	the maximum number of iterations allowed.
`nstart`	if `centers` is a number, how many random sets should be chosen?
`algorithm`	character: may be abbreviated. Note that `"Lloyd"` and `"Forgy"` are alternative names for one algorithm.
`trace`	logical or integer number, currently only used in the default method (`"Hartigan-Wong"`): if positive (or true), tracing information on the progress of the algorithm is produced. Higher values may produce more tracing information.
`k`	The number of clusters to look for (default: `2`).
`method`	character: may be abbreviated. `"centers"` causes `fitted` to return cluster centers (one for each input point) and `"classes"` causes `fitted` to return a vector of class assignments.
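
As a minimal sketch of how `centers`, `nstart`, `iter.max` and `algorithm` interact in the default method, the snippet below calls `stats::kmeans` directly on synthetic data. The data set `pts` and the initial centres in `init` are made up for illustration and are not part of the package.

```r
set.seed(42)

# Two well-separated Gaussian blobs in 2D (synthetic data, for illustration only).
pts <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2)
)

# `centers` given as a number: k distinct rows of `pts` are drawn at random as
# initial centres; `nstart = 5` repeats this five times and keeps the best fit.
fit_k <- stats::kmeans(pts, centers = 2, nstart = 5, iter.max = 10L)

# `centers` given as a matrix: its rows are used as the initial centres.
init <- rbind(c(0, 0), c(3, 3))
fit_init <- stats::kmeans(pts, centers = init, algorithm = "Lloyd")

table(fit_k$cluster)   # cluster sizes
fit_init$centers       # estimated cluster centres
```

Passing a matrix to `centers` makes the run reproducible without a seed, whereas a numeric `centers` depends on the random draw of starting rows.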

An object of class "kmeans", which has `print` and `fitted` methods. It is a list with at least the following components:

cluster: A vector of integers (among 1:k) indicating the cluster to which each point is allocated.
centers: A matrix of cluster centres.
totss: The total sum of squares.
withinss: Vector of within-cluster sum of squares, one component per cluster.
tot.withinss: Total within-cluster sum of squares.
betweenss: The between-cluster sum of squares.
size: The number of points in each cluster.
iter: The number of (outer) iterations.
ifault: integer: indicator of a possible algorithm problem – for experts.
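
The components listed above can be checked against one another. The following sketch, which assumes a fit produced by `stats::kmeans` on synthetic data (`pts` is illustrative), verifies the sum-of-squares decomposition and shows the two modes of the `fitted` method.

```r
set.seed(1)
pts <- matrix(rnorm(200), ncol = 2)   # 100 synthetic points in 2D
fit <- stats::kmeans(pts, centers = 3, nstart = 10, iter.max = 50L)

# The total sum of squares splits into within- and between-cluster parts.
decomposition_ok <- isTRUE(all.equal(fit$totss, fit$tot.withinss + fit$betweenss))

# tot.withinss is the sum of the per-cluster values in withinss.
within_ok <- isTRUE(all.equal(fit$tot.withinss, sum(fit$withinss)))

# size counts the points per cluster, so it sums to the number of rows.
size_ok <- sum(fit$size) == nrow(pts)

# fitted() returns either one centre per input point or the class assignments.
fitted_centers <- fitted(fit, method = "centers")
fitted_classes <- fitted(fit, method = "classes")
```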

default: This is the kmeans function of the stats package. We refer the user to the corresponding documentation for more details on the available algorithms and examples.
sgd: Implementation for Single Gaussian Data (stored in objects of class sgd).
gmd: Implementation for Gaussian Mixture Data (stored in objects of class gmd).
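
To illustrate how an S3 generic with a default method and class-specific methods fits together, here is a small sketch. All names in it (`cluster_pts`, the class `toy_gaussians`, and the feature construction) are hypothetical and chosen for illustration; this is not the package's implementation for `sgd` or `gmd` objects.

```r
# Hypothetical generic that mirrors the structure described above.
cluster_pts <- function(x, ...) UseMethod("cluster_pts")

# Default method: hand everything over to stats::kmeans().
cluster_pts.default <- function(x, ...) stats::kmeans(x, ...)

# Method for a hypothetical class `toy_gaussians`, a list of named vectors
# c(mean = , precision = ). As a placeholder, each Gaussian is mapped to the
# point (mean, standard deviation) and ordinary k-means is run on those points.
# The real sgd/gmd methods may work quite differently.
cluster_pts.toy_gaussians <- function(x, k = 2, ...) {
  feats <- t(vapply(
    x,
    function(g) c(g[["mean"]], 1 / sqrt(g[["precision"]])),
    numeric(2)
  ))
  stats::kmeans(feats, centers = k, ...)
}

toy <- structure(
  list(
    c(mean =  0, precision = 1),
    c(mean =  3, precision = 0.5),
    c(mean = -1, precision = 2),
    c(mean =  4, precision = 1)
  ),
  class = "toy_gaussians"
)

set.seed(7)
res_toy <- cluster_pts(toy)                                  # class-specific method
res_def <- cluster_pts(matrix(rnorm(40), ncol = 2), centers = 2)  # default method
```

Dispatch is decided by the class of `x` alone, which is why the numeric matrix falls through to the default method while the classed list does not.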

# Cluster three single Gaussian distributions with the default k = 2.
x <- sgd(
  c(mean =  0, precision = 1  ),
  c(mean =  3, precision = 0.5),
  c(mean = -1, precision = 2  )
)
kmeans(x)

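# Cluster N Gaussian mixtures that share the same M components.
library(magrittr)  # provides the %>% pipe used below
N <- 100
M <- 4
# Random mixing weights: one row per observation, rows normalised to sum to 1.
w <- matrix(runif(N * M), N, M)
w <- w / rowSums(w)
# Long-format table with one row per (observation, component) pair.
samp <- tidyr::crossing(
  observation = paste0("O", 1:N),
  component = paste0("C", 1:M)
) %>%
  dplyr::mutate(mixing = as.numeric(t(w)))
# Dictionary of the shared components: zero means, precisions 1 to M.
dict <- tibble::tibble(
  component = paste0("C", 1:M),
  mean = numeric(M),
  precision = 1:M
)
x <- gmd(samp, dict)
kx <- kmeans(x)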