View source: R/kmeans_weight_tune.R
kmeans.weight.tune | R Documentation |
The tuning parameter controls the weight of noisy observations. A permutation approach is used to select the tuning parameter.
kmeans.weight.tune(x, K = NULL, noisy.lab = NULL, weight.seq = NULL, nperms = 20, centers = NULL, nstart = 20, algorithm = "Hartigan-Wong")

## S3 method for class 'kmeans.weight.tune'
plot(x, ...)
x |
An n by p numeric data matrix, where n is the number of observations and p is the number of features. |
K |
The number of clusters. Omitted if |
noisy.lab |
A vector indicating the row positions of noisy observations. Omitted if |
weight.seq |
A candidate weight matrix, each row indicating one candidate weight vector. If |
nperms |
Number of permutations. Default is |
centers |
A K by p matrix indicating initial (distinct) cluster centers. |
nstart |
The number of initial random sets chosen from (distinct) rows in |
algorithm |
Character; either " |
... |
unused. |
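As a rough illustration of the layout described for weight.seq (one candidate weight vector per row), the sketch below builds a small candidate matrix by hand. It assumes each weight vector has one entry per observation and that only the entries at the noisy.lab positions vary between candidates; neither assumption is confirmed by this page, and the names n, noisy_lab, u_values and weight_seq, as well as all numbers, are hypothetical.

```r
# Hypothetical sizes and positions, for illustration only.
n <- 6                       # number of observations (rows of x)
noisy_lab <- c(2, 5)         # row positions of the noisy observations
u_values <- c(1, 0.5, 0.25)  # arbitrary candidate values for the noisy weights

# Assumption: every observation gets weight 1 except the noisy ones,
# which receive the candidate value u. Each candidate becomes one row.
weight_seq <- t(sapply(u_values, function(u) {
  w <- rep(1, n)
  w[noisy_lab] <- u
  w
}))

dim(weight_seq)  # one row per candidate, one column per observation
```

Whether such a matrix matches what kmeans.weight.tune expects should be checked against the package's own documentation before use.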
The function returns a list of the following components:
gaps |
The gap statistics obtained (one for each of the candidate weights tried). If O(U) is the objective function evaluated at the tuning parameter U, and O*(U) is the same quantity computed for the permuted weights, then Gap(U) = mean(log(O*(U))) - log(O(U)). |
sdgaps |
The standard deviation of log(O*(U)). |
bestweight |
The best weight chosen by this method among all the candidate weights. |
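The gaps and sdgaps components follow directly from the formula given above. The sketch below reproduces that arithmetic on made-up objective values; it does not call the package. The names obj, perm_obj, gaps, sdgaps and best_idx and all numbers are hypothetical, and choosing the candidate with the largest gap is an assumption, since this page does not state the selection rule.

```r
# Hypothetical objective values O(U) for three candidate weights.
obj <- c(80, 60, 70)

# Hypothetical permuted objectives O*(U):
# one row per permutation, one column per candidate weight.
perm_obj <- rbind(c(100, 110, 105),
                  c(102, 108, 107),
                  c( 98, 112, 103),
                  c(101, 109, 106))

log_perm <- log(perm_obj)

# Gap(U) = mean(log(O*(U))) - log(O(U)), computed per candidate.
gaps <- colMeans(log_perm) - log(obj)

# Standard deviation of log(O*(U)) across permutations, per candidate.
sdgaps <- apply(log_perm, 2, sd)

# Assumption: select the candidate with the largest gap.
best_idx <- which.max(gaps)
```

With more permutations (a larger nperms) the mean and standard deviation of log(O*(U)) are estimated from more rows, at the cost of extra computation.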
plot: plot the Gap statistic of each candidate weight vector.
Wenyu Zhang
Other sparse weighted K-Means functions: ChooseK, KMeansSparseCluster.permute.weight, KMeansSparseCluster.weight, kmeans.weight
## Not run:
set.seed(1)
data("DMdata")

# data preprocessing
data <- t(DMdata$data)
data_rank <- apply(data, 2, rank)
data_rank_center <- t(t(data_rank) - colMeans(data_rank))
data_rank_center_scale <- t(t(data_rank_center) / apply(data_rank_center, 2, sd))
data_processed <- t(data_rank_center_scale)

# tune the number of cluster K
# nperms and nstart are set to be small in order to save computation time
cK <- ChooseK(data_processed[-DMdata$noisy.label, ], nClusters = 1:6, nperms = 10, nstart = 5)
plot(cK)
K <- cK$OptimalK

# tune weight
res.tuneU <- kmeans.weight.tune(x = data_processed, K = K, noisy.lab = DMdata$noisy.label, nperms = 10, nstart = 5)
plot(res.tuneU)

# perform weighted K-means
res <- kmeans.weight(x = data_processed, K = K, weight = res.tuneU$bestweight)

# check the result
table(res$cluster, DMdata$true.label)
## End(Not run)