View source: R/kmeans_weight_tune.R
The tuning parameter controls the weight of noisy observations. A permutation approach is used to select the tuning parameter.
x |
An n by p numeric data matrix, where n is the number of observations and p is the number of features. |
K |
The number of clusters. Omitted if |
noisy.lab |
A vector indicating the row positions of noisy observations. Omitted if |
weight.seq |
A candidate weight matrix, each row indicating one candidate weight vector. If |
nperms |
Number of permutations. Default is |
centers |
A K by p matrix indicating initial (distinct) cluster centers. |
nstart |
The number of initial random sets chosen from (distinct) rows in |
algorithm |
Character; either " |
... |
unused. |
The function returns a list of the following components:
gaps |
The gap statistics obtained (one for each of the candidate weights tried). If O(U) is the objective function evaluated at the tuning parameter U, and O*(U) is the same quantity computed for the permuted weights, then Gap(U) = mean(log(O*(U))) - log(O(U)). |
sdgaps |
The standard deviation of log(O*(U)) |
bestweight |
The best weight chosen by this method among all the candidate weights. |
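The gap computation described above can be sketched in base R. This is only an illustration of the formula Gap(U) = mean(log(O*(U))) - log(O(U)): the objective values below are made up, the names `obj`, `perm_obj` and `best_index` are hypothetical, and picking the candidate with the largest gap is an assumption, since this page does not state the rule used to choose `bestweight`.

```r
# Hypothetical objective values for 3 candidate weights (not package output).
# obj[j] plays the role of O(U_j) on the original data.
obj <- c(60, 40, 50)

# perm_obj[b, j] plays the role of O*(U_j) for permutation b.
# Rows are permutations, columns are candidate weights.
perm_obj <- rbind(
  c(100, 100, 100),
  c(110, 105,  95),
  c( 90,  95, 105)
)

# Gap(U) = mean(log(O*(U))) - log(O(U)), one value per candidate weight
gaps <- colMeans(log(perm_obj)) - log(obj)

# Standard deviation of log(O*(U)) over the permutations
sdgaps <- apply(log(perm_obj), 2, sd)

# Assumption for illustration: take the candidate with the largest gap
best_index <- which.max(gaps)

print(data.frame(candidate = seq_along(obj), gap = gaps, sdgap = sdgaps))
```

With real output from `kmeans.weight.tune`, the corresponding quantities would be read from the `gaps`, `sdgaps` and `bestweight` components of the returned list rather than computed by hand.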
plot: plot the gap statistic of each candidate weight vector.
Wenyu Zhang
Other sparse weighted K-Means functions: ChooseK, KMeansSparseCluster.permute.weight, KMeansSparseCluster.weight, kmeans.weight
## Not run:
set.seed(1)
data("DMdata")
# data preprocessing
data <- t(DMdata$data)
data_rank <- apply(data, 2, rank)
data_rank_center <- t(t(data_rank) - colMeans(data_rank))
data_rank_center_scale <- t(t(data_rank_center)/apply(data_rank_center, 2, sd))
data_processed <- t(data_rank_center_scale)
# tune the number of clusters K
# nperms and nstart are set to be small in order to save computation time
cK <- ChooseK(data_processed[-DMdata$noisy.label, ], nClusters = 1:6, nperms = 10, nstart = 5)
plot(cK)
K <- cK$OptimalK
# tune the weight
res.tuneU <- kmeans.weight.tune(x = data_processed, K = K,
                                noisy.lab = DMdata$noisy.label, nperms = 10, nstart = 5)
plot(res.tuneU)
# perform weighted K-means
res <- kmeans.weight(x = data_processed, K = K, weight = res.tuneU$bestweight)
# check the result
table(res$cluster, DMdata$true.label)
## End(Not run)