KMeansSparseCluster.weight

R Documentation

Sparse Weighted K-Means Clustering with Weights on Observations

Description

Performs the sparse weighted K-Means algorithm on observations with given weights.

Usage

KMeansSparseCluster.weight(x, K = NULL, weight = NULL,
  wbounds = NULL, nstart = 20, silent = TRUE, maxiter = 6,
  centers = NULL)

Arguments

`x`	An n by p numeric data matrix, where n is the number of observations and p is the number of features.
`K`	The number of clusters. Omitted if `centers` is provided.
`weight`	A vector of n positive elements representing weights on observations.
`wbounds`	A single L1 bound on w (the feature weights), or a vector of L1 bounds on w. If a bound is small, few features will have non-zero weights. If a bound is large, all features will have non-zero weights. Each bound should be greater than 1.
`nstart`	The number of initial random sets chosen from (distinct) rows in `x`. Omitted if `centers` is provided. Default is 20.
`silent`	Logical; whether to suppress progress output. Default is TRUE.
`maxiter`	The maximum number of iterations. Default is 6.
`centers`	A K by p matrix indicating initial (distinct) cluster centers.
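
The effect of `wbounds` on sparsity can be illustrated with a small base R sketch. It assumes the feature weights are obtained as in the standard sparse K-Means formulation: maximise `sum(w * a)` subject to an L2 norm of at most 1, an L1 norm of at most the bound, and non-negative `w`, which is solved by soft-thresholding with a binary search. This page does not confirm that the package uses exactly this update, and `soft_threshold`, `feature_weights`, and the vector `a` are hypothetical names introduced for illustration only.

```r
# Soft-thresholding operator: shrink each entry toward zero by delta.
soft_threshold <- function(a, delta) sign(a) * pmax(abs(a) - delta, 0)

# Hypothetical helper (not part of SWKM): maximise sum(w * a) subject to
# ||w||_2 <= 1, ||w||_1 <= s and w >= 0.
feature_weights <- function(a, s, tol = 1e-8) {
  # A vector with unit L2 norm has L1 norm of at least 1,
  # so bounds below 1 are infeasible in this sketch.
  stopifnot(s >= 1)
  a <- pmax(a, 0)
  w <- a / sqrt(sum(a^2))
  if (sum(w) <= s) return(w)          # L1 constraint is not active
  lo <- 0
  hi <- max(a)
  while (hi - lo > tol) {             # binary search for the threshold
    mid <- (lo + hi) / 2
    w <- soft_threshold(a, mid)
    w <- w / sqrt(sum(w^2))
    if (sum(w) > s) lo <- mid else hi <- mid
  }
  w <- soft_threshold(a, (lo + hi) / 2)
  w / sqrt(sum(w^2))
}

# Toy per-feature scores (assumed values, e.g. between-cluster sums of squares).
a <- c(5, 3, 1, 0.5, 0.1)

w_sparse <- feature_weights(a, s = 1.1)  # small bound: few non-zero weights
w_dense  <- feature_weights(a, s = 2)    # large bound: all weights non-zero

print(round(w_sparse, 3))
print(round(w_dense, 3))
```

In this sketch, lowering the bound toward 1 drives more weights to exactly zero, while a sufficiently large bound leaves the L1 constraint inactive.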

Value

If wbounds is a single numeric value, the function returns a list with the following elements:

`ws`	The p-vector of feature weights.
`Cs`	The clustering result.
`wcss`	A list with the following elements: `wcss.perfeature`, `wcss.ws`, `bcss.perfeature`, `bcss.ws`, where `wcss.ws` = `sum(wcss.perfeature * ws)` and `bcss.ws` = `sum(bcss.perfeature * ws)`. `bcss.ws` is the objective of the sparse weighted K-Means clustering algorithm.
`wbound`	The L1 bound in the current list.
`weight`	The weights on observations.
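
The relationship between the per-feature quantities and their `ws`-weighted sums can be sketched in base R. The per-feature definitions below (observation-weighted sums of squares around observation-weighted means) are an assumption made for illustration; this page does not spell out how the package computes them. The helper `weighted_ss` and the toy inputs are hypothetical.

```r
set.seed(1)
x      <- matrix(rnorm(40), nrow = 10, ncol = 4)  # toy data: n = 10, p = 4
Cs     <- rep(1:2, each = 5)                      # toy cluster labels
weight <- runif(10, 0.5, 1.5)                     # toy observation weights
ws     <- c(0.8, 0.6, 0, 0)                       # toy feature weights

# Hypothetical helper: per-feature weighted sum of squares around the
# weighted column means (assumed definition, not taken from the package).
weighted_ss <- function(x, u) {
  mu <- colSums(x * u) / sum(u)        # weighted mean of each feature
  colSums(u * sweep(x, 2, mu)^2)       # weighted squared deviations
}

# Total, within-cluster and between-cluster sums of squares per feature.
tss.perfeature  <- weighted_ss(x, weight)
wcss.perfeature <- Reduce(`+`, lapply(unique(Cs), function(k) {
  idx <- Cs == k
  weighted_ss(x[idx, , drop = FALSE], weight[idx])
}))
bcss.perfeature <- tss.perfeature - wcss.perfeature

# Sums weighted by the feature weights, as stated in the Value section.
wcss.ws <- sum(wcss.perfeature * ws)
bcss.ws <- sum(bcss.perfeature * ws)
```

Features with zero weight in `ws` contribute nothing to `wcss.ws` or `bcss.ws`, which is how sparsity in the feature weights carries over to the objective.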

If wbounds is a vector, the function returns a list of such lists, one per element of wbounds.
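
When several bounds are supplied, the per-bound results can be summarised by looping over the returned list. The sketch below uses a hand-built stand-in for the return value so that it runs without the package; the `ws` values are invented for illustration and are not output from `KMeansSparseCluster.weight`.

```r
# Hand-built stand-in for a fit with two bounds; only the `ws` and
# `wbound` elements described above are mimicked here.
res <- list(
  list(ws = c(1, 0, 0, 0),         wbound = 1.5),
  list(ws = c(0.5, 0.5, 0.5, 0.5), wbound = 3)
)

# One row per bound: the bound and the number of features with non-zero weight.
summary_tab <- data.frame(
  wbound    = sapply(res, function(r) r$wbound),
  n.nonzero = sapply(res, function(r) sum(r$ws != 0))
)
print(summary_tab)
```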

Author(s)

Wenyu Zhang

Examples

## Not run: 
library(SWKM)  # assumes the SWKM package is installed
set.seed(1)
data("NormalDisData")

# Choose the number of clusters, excluding the rows indexed by noisy.label
cK <- ChooseK(NormalDisData$data[-NormalDisData$noisy.label, ], nClusters = 1:6)
plot(cK)
K <- cK$OptimalK

# Tune the weights on observations
res.tuneU <- kmeans.weight.tune(x = NormalDisData$data, K = K,
  noisy.lab = NormalDisData$noisy.label, weight.seq = NULL)
plot(res.tuneU)

# Tune the L1 bound on the feature weights
res.tunes <- KMeansSparseCluster.permute.weight(x = NormalDisData$data, K = K,
  weight = res.tuneU$bestweight)

# Run sparse weighted K-Means with the tuned bound and observation weights
res <- KMeansSparseCluster.weight(x = NormalDisData$data, K = K,
  wbounds = res.tunes$bestw, weight = res.tuneU$bestweight)

# Check the clustering result, the number of features selected,
# and the 50 most important features
table(res[[1]]$Cs, NormalDisData$true.label)
sum(res[[1]]$ws != 0)
order(res[[1]]$ws, decreasing = TRUE)[1:50]

## End(Not run)

cuhklinlab/SWKM documentation built on Aug. 5, 2022, 2:27 a.m.