cKmeans: Consensus K-means clustering
In TankredO/ckmeans: consensus K-means clustering


Consensus k-means clustering.

ckmeans(
  x,
  k,
  n_rep = 50,
  p_pred = 1,
  p_samp = 1,
  save_kms = TRUE,
  hclust_options = list(method = "average"),
  calc_bic = TRUE,
  ...
)

`x`	matrix with samples in rows and features in columns
`k`	number of clusters
`n_rep`	number of individual k-means runs
`p_pred`	proportion of predictors (columns of x) used in every k-means run
`p_samp`	proportion of samples (rows of x) used in every k-means run
`save_kms`	logical or 'minimize', determining whether the k-means objects should be saved. Saving them can be very memory demanding, depending on n_rep and the size of x. If 'minimize', the kmeans objects are saved without column names to reduce memory use.
`hclust_options`	list of options passed to hclust, which is used to generate the consensus clusters
`calc_bic`	logical, determining whether the BIC (Bayesian Information Criterion) should be calculated for the k-means runs
`...`	further arguments passed to kmeans

Runs several independent k-means clusterings and combines the information from the individual runs into consensus clusters using hierarchical clustering. The hierarchical clustering is based on the proportion of runs in which each pair of samples is placed in the same cluster, interpreted as a distance.
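
The procedure described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the function name consensus_sketch and everything inside it are hypothetical, and two details are assumptions made for illustration only, namely that the distance is taken as one minus the co-clustering proportion and that the proportion is computed over the runs in which both samples were drawn. It uses only kmeans, hclust and cutree from the stats package.

```r
# Minimal base-R sketch of consensus k-means; NOT the ckmeans implementation.
# consensus_sketch() and all of its internals are hypothetical names.
consensus_sketch <- function(x, k, n_rep = 50, p_pred = 1, p_samp = 1,
                             hclust_method = "average") {
  n <- nrow(x)
  together <- matrix(0, n, n)  # runs in which samples i and j shared a cluster
  sampled  <- matrix(0, n, n)  # runs in which samples i and j were both drawn

  for (r in seq_len(n_rep)) {
    # draw a random subset of samples (rows) and predictors (columns)
    rows <- sort(sample(n, max(k + 1, round(p_samp * n))))
    cols <- sort(sample(ncol(x), max(1, round(p_pred * ncol(x)))))

    km <- kmeans(x[rows, cols, drop = FALSE], centers = k)

    # TRUE where two of the drawn samples received the same cluster label
    same <- outer(km$cluster, km$cluster, "==")
    together[rows, rows] <- together[rows, rows] + same
    sampled[rows, rows]  <- sampled[rows, rows] + 1
  }

  # co-clustering proportion among the runs in which both samples were drawn
  consensus <- ifelse(sampled > 0, together / sampled, 0)
  diag(consensus) <- 1

  # assumption: distance = 1 - co-clustering proportion
  hc <- hclust(as.dist(1 - consensus), method = hclust_method)
  list(consensus = consensus, hc = hc, cluster = cutree(hc, k = k))
}

# usage on two simulated groups of 20 samples each
set.seed(1)
x_demo <- rbind(matrix(rnorm(40, 0), ncol = 2), matrix(rnorm(40, 10), ncol = 2))
res <- consensus_sketch(x_demo, k = 2, n_rep = 20, p_samp = 0.8)
table(res$cluster)
```

Setting p_samp or p_pred below 1 makes the individual runs differ from each other, so pairs of samples that are co-clustered only by chance end up with a low consensus value.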

Returns a ckmeans object.

Author: Tankred Ott

## generate data: 54 samples in three groups (10, 14 and 30 samples), three features
x1 = c(rnorm(10), rnorm(14, 5, 1) + 2, rnorm(30, -5, 1))
x2 = c(rnorm(10), rnorm(14, 5, 1) + 2, rnorm(30, -5, 1))
x3 = c(rnorm(10)-3, rnorm(14, 2, 1) + 2, rnorm(30, 1, 1))
x = matrix(c(x1, x2, x3), ncol = 3, dimnames = list(1:54, c('x1', 'x2', 'x3')))

pairs(x)

## run ckmeans for a single K
ckm = ckmeans(x, 3, n_rep = 100, p_samp = 0.5, p_pred = 0.5)

# plot consensus matrix with color coded clusters
plot(ckm, cex.axis = 0.75)

plotDist(ckm)

# optional scatter plot of the first two features, one symbol per simulated group
# plot(x, col = ckm$cc, pch = c(rep(1, 10), rep(2, 14), rep(3, 30)))

## run ckmeans for multiple K
ckms = multickmeans(x, 1:7, n_rep = 100, p_samp = 0.8, p_pred = 0.5)
plot(ckms$bics, type='l')
plot(ckms$aics, type='l')
plot(ckms$sils, type='l')
plot(ckms$dbs, type='l')

ckms$aics

# call plotDist for the ckmeans object of each K
for (i in seq_along(ckms$ckms)) {
  plotDist(ckms$ckms[[i]], ord = TRUE)
}