KMeansSparseCluster.weight: Sparse Weighted K-Means Clustering with Weights on...


Description

Performs the sparse weighted K-Means algorithm on observations with given weights.

Usage

KMeansSparseCluster.weight(x, K = NULL, weight = NULL,
  wbounds = NULL, nstart = 20, silent = TRUE, maxiter = 6,
  centers = NULL)

Arguments

x

An n by p numeric data matrix, where n is the number of observations and p is the number of features.

K

The number of clusters. Omitted if centers are provided.

weight

A vector of n positive elements representing weights on observations.

wbounds

A single L1 bound on w (the feature weights), or a vector of L1 bounds on w. If a bound is small, few features will have non-zero weights; if it is large, all features will have non-zero weights. Each bound should be greater than 1.

nstart

The number of initial random sets chosen from (distinct) rows in x. Omitted if centers is provided. Default is 20.

silent

Logical flag controlling whether progress is printed. Default is TRUE.

maxiter

The maximum number of iterations. Default is 6.

centers

A K by p matrix indicating initial (distinct) cluster centers.
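This page does not say how the L1 bound in wbounds is enforced. The base R sketch below assumes the soft-thresholding update used in standard sparse K-means (maximize sum(a * w) subject to an L2 norm of at most 1, an L1 norm of at most wbound, and w >= 0), which may differ from what this package does internally. The helpers soft, l2n, and update_ws and the vector a (a stand-in for per-feature between-cluster sums of squares) are hypothetical and are not part of the package.

```r
# Soft-thresholding operator and L2 norm (hypothetical helper names).
soft <- function(a, d) sign(a) * pmax(abs(a) - d, 0)
l2n <- function(v) sqrt(sum(v^2))

# Sketch of a feature-weight update under an L1 bound.
# Assumes wbound > 1 and that `a` has a unique, positive maximum.
update_ws <- function(a, wbound, iters = 60) {
  a <- pmax(a, 0)
  w <- a / l2n(a)
  if (sum(w) <= wbound) return(w)      # bound is loose: no thresholding needed
  lo <- 0
  hi <- max(a)
  for (i in seq_len(iters)) {          # binary search for the threshold
    mid <- (lo + hi) / 2
    w <- soft(a, mid)
    w <- w / l2n(w)
    if (sum(w) > wbound) lo <- mid else hi <- mid
  }
  w <- soft(a, hi)                     # hi always satisfies the L1 bound
  w / l2n(w)
}

a <- c(5, 3, 1, 0.5, 0.1)              # made-up per-feature scores
w_small <- update_ws(a, wbound = 1.1)
w_large <- update_ws(a, wbound = 3)
sum(w_small != 0)                      # features kept under the tight bound
sum(w_large != 0)                      # features kept under the loose bound
```

Under these assumptions the tighter bound zeroes out more entries of w than the looser one. Because w is scaled to unit L2 norm, its L1 norm cannot fall below 1, which is consistent with the requirement that bounds be greater than 1.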

Value

If wbounds is a single numeric value, the function returns a list with the following elements:

ws

The p-vector of feature weights.

Cs

The clustering result.

wcss

A list with elements wcss.perfeature, wcss.ws, bcss.perfeature, and bcss.ws, where wcss.ws = sum(wcss.perfeature * ws) and bcss.ws = sum(bcss.perfeature * ws). bcss.ws is the objective of the sparse weighted K-Means clustering algorithm.

wbound

The L1 bound used for this result.

weight

The weights on observations.

If wbounds is a vector, the function returns a list of such lists, one per element of wbounds.
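The relations given for wcss can be illustrated in base R for a fixed clustering. The sketch below assumes the usual observation-weighted definitions (weighted means, weighted squared deviations); the package's internal definitions are not shown on this page and may differ, for example in scaling. All data, labels, and weights are made up, and wmean is a hypothetical helper.

```r
set.seed(1)
x <- matrix(rnorm(40), nrow = 10, ncol = 4)  # 10 observations, 4 features
Cs <- rep(1:2, each = 5)                     # hypothetical cluster labels
weight <- rep(c(1, 2), times = 5)            # hypothetical observation weights
ws <- c(0.8, 0.6, 0, 0)                      # hypothetical feature weights

# Weighted column means: row i of m is scaled by w[i].
wmean <- function(m, w) colSums(m * w) / sum(w)

# Total weighted sum of squares per feature.
tss.perfeature <- colSums(weight * sweep(x, 2, wmean(x, weight))^2)

# Within-cluster weighted sum of squares per feature, summed over clusters.
wcss.perfeature <- Reduce(`+`, lapply(unique(Cs), function(k) {
  idx <- Cs == k
  xk <- x[idx, , drop = FALSE]
  colSums(weight[idx] * sweep(xk, 2, wmean(xk, weight[idx]))^2)
}))

# Between-cluster part is what remains of the total.
bcss.perfeature <- tss.perfeature - wcss.perfeature

# Feature-weighted totals, as in the description of wcss.
wcss.ws <- sum(wcss.perfeature * ws)
bcss.ws <- sum(bcss.perfeature * ws)
```

Features with a zero entry in ws contribute nothing to wcss.ws or bcss.ws, which is how sparse feature weights remove features from the objective.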

Author(s)

Wenyu Zhang

See Also

Other sparse weighted K-Means functions: ChooseK, KMeansSparseCluster.permute.weight, kmeans.weight.tune, kmeans.weight

Examples

## Not run: 
set.seed(1)
data("NormalDisData")
cK <- ChooseK(NormalDisData$data[-NormalDisData$noisy.label, ], nClusters = 1:6)
plot(cK)
K <- cK$OptimalK
res.tuneU <- kmeans.weight.tune(x = NormalDisData$data, K = K,
                                noisy.lab = NormalDisData$noisy.label,
                                weight.seq = NULL)
plot(res.tuneU)
res.tunes <- KMeansSparseCluster.permute.weight(x = NormalDisData$data, K = K,
                                                weight = res.tuneU$bestweight)
res <- KMeansSparseCluster.weight(x = NormalDisData$data, K = K,
                                  wbounds = res.tunes$bestw,
                                  weight = res.tuneU$bestweight)
# Check the clustering result, the number of features selected,
# and the 50 most important features
table(res[[1]]$Cs, NormalDisData$true.label)
sum(res[[1]]$ws != 0)
order(res[[1]]$ws, decreasing = TRUE)[1:50]

## End(Not run)
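The example above reads its result through res[[1]]. When several bounds are supplied, the same pattern extends to every element of the returned list. The sketch below uses a mock object that only imitates the documented structure (elements ws, Cs, and wbound), so it runs without the package; mock_fit and all values are hypothetical.

```r
# Mock of the documented return structure; values are made up for illustration.
mock_fit <- function(wbound, ws) {
  list(ws = ws, Cs = c(1, 1, 2, 2), wbound = wbound)
}
res <- list(
  mock_fit(1.5, c(0.9, 0.4, 0, 0)),
  mock_fit(3.0, c(0.6, 0.5, 0.5, 0.4))
)

# One row per bound: the bound and the number of features with non-zero weight.
n_selected <- sapply(res, function(r) sum(r$ws != 0))
bounds <- sapply(res, function(r) r$wbound)
summary_tab <- data.frame(wbound = bounds, n_selected = n_selected)
summary_tab
```

A table like this makes it easy to compare how many features each bound retains before picking one result to inspect in detail.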

Van1yu3/SWKM documentation built on Sept. 3, 2019, 7:50 a.m.