kmeans.weight: Weighted K-Means Clustering with Weights on Observations

Description

Perform K-Means algorithm on observations with given weights.

Usage

kmeans.weight(x, K = NULL, weight = NULL, centers = NULL,
  nstart = 20, algorithm = "Hartigan-Wong")

Arguments

x

An n by p numeric data matrix, where n is the number of observations and p is the number of features.

K

The number of clusters. Omitted if centers is provided.

weight

A vector of n positive elements representing weights on observations.

centers

A K by p matrix indicating initial (distinct) cluster centers.

nstart

The number of initial random sets chosen from (distinct) rows in x. Omitted if centers is provided. Default is 20.

algorithm

Character; either "Hartigan-Wong" or "Forgy". Default is "Hartigan-Wong".
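The argument shapes described above can be checked before calling the function. The following is a minimal sketch in base R using synthetic data; the objects n, p, and the random inputs are illustrative assumptions, not part of the package.

```r
set.seed(1)
n <- 30  # number of observations (illustrative)
p <- 2   # number of features (illustrative)
K <- 3   # number of clusters

# Synthetic n by p numeric data matrix (hypothetical data, not from SWKM)
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# One positive weight per observation
weight <- runif(n, min = 0.5, max = 1.5)

# K distinct rows of x, usable as initial centers (a K by p matrix)
ux <- unique(x)
centers <- ux[sample(nrow(ux), K), , drop = FALSE]

# Sanity checks mirroring the argument descriptions
stopifnot(
  is.matrix(x), is.numeric(x),
  length(weight) == nrow(x), all(weight > 0),
  nrow(centers) == K, ncol(centers) == ncol(x),
  anyDuplicated(centers) == 0
)
```

With the package loaded, these objects could then be passed as kmeans.weight(x = x, K = K, weight = weight), or with centers = centers in place of K.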

Value

The function returns a list of the following components:

centers

the centers of the clustering result.

cluster

a vector of integers (from 1:K) indicating the cluster to which each observation is allocated.

weight

a vector containing the non-zero weights from the input vector weight.

wcss

normalized within-cluster sum of squares, i.e. the objective divided by sum(weight).
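To illustrate what a normalized weighted within-cluster sum of squares can look like, here is a small base R sketch. It assumes the objective is the sum over observations of weight times squared Euclidean distance to the weighted mean of the assigned cluster; this form is an assumption for illustration and is not taken from the package source. The helper weighted_wcss is hypothetical and not part of SWKM.

```r
# Hypothetical helper (not part of SWKM): weighted cluster means and
# the weighted within-cluster sum of squares divided by sum(weight).
weighted_wcss <- function(x, cluster, weight) {
  ks <- sort(unique(cluster))
  # Weighted mean of the rows in each cluster
  centers <- do.call(rbind, lapply(ks, function(k) {
    idx <- cluster == k
    colSums(x[idx, , drop = FALSE] * weight[idx]) / sum(weight[idx])
  }))
  # Squared Euclidean distance from each observation to its cluster center
  d2 <- rowSums((x - centers[match(cluster, ks), , drop = FALSE])^2)
  list(centers = centers, wcss = sum(weight * d2) / sum(weight))
}

# Tiny worked example with four observations in two clusters
x <- rbind(c(0, 0), c(2, 0), c(10, 0), c(10, 4))
cluster <- c(1, 1, 2, 2)
weight <- c(1, 3, 1, 1)
out <- weighted_wcss(x, cluster, weight)
out$centers  # rows: (1.5, 0) and (10, 2)
out$wcss     # 11 / 6
```

In this example the heavier second observation pulls the first center toward it, to (1.5, 0) rather than the unweighted mean (1, 0).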

Author(s)

Wenyu Zhang

See Also

Other sparse weighted K-Means functions: ChooseK, KMeansSparseCluster.permute.weight, KMeansSparseCluster.weight, kmeans.weight.tune

Examples

## Not run: 
library(SWKM)
set.seed(1)
data("DMdata")
# data preprocessing: rank, center, and scale each column
data <- t(DMdata$data)
data_rank <- apply(data, 2, rank)
data_rank_center <- t(t(data_rank) - colMeans(data_rank))
data_rank_center_scale <- t(t(data_rank_center) / apply(data_rank_center, 2, sd))
data_processed <- t(data_rank_center_scale)
# tune the number of clusters K
# nperms and nstart are kept small to save computation time
cK <- ChooseK(data_processed[-DMdata$noisy.label, ], nClusters = 1:6,
  nperms = 10, nstart = 5)
plot(cK)
K <- cK$OptimalK
# tune the weights
res.tuneU <- kmeans.weight.tune(x = data_processed, K = K,
  noisy.lab = DMdata$noisy.label, nperms = 10, nstart = 5)
plot(res.tuneU)
# perform weighted K-means
res <- kmeans.weight(x = data_processed, K = K, weight = res.tuneU$bestweight)
# check the result
table(res$cluster, DMdata$true.label)

## End(Not run)
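
The rank, center, and scale steps in the example can also be written more compactly with base R's scale(). The sketch below uses a small synthetic matrix as a stand-in for t(DMdata$data) (an assumption, since the real data requires the package) and checks that both forms agree.

```r
set.seed(1)
# Hypothetical stand-in for t(DMdata$data)
data <- matrix(rnorm(40), nrow = 8, ncol = 5)

# Step-by-step version, as in the example
data_rank <- apply(data, 2, rank)
data_rank_center <- t(t(data_rank) - colMeans(data_rank))
data_rank_center_scale <- t(t(data_rank_center) / apply(data_rank_center, 2, sd))
manual <- t(data_rank_center_scale)

# Compact version: scale() centers each column and divides by its sd
compact <- t(scale(data_rank))

# The two results match up to attributes added by scale()
stopifnot(isTRUE(all.equal(manual, compact, check.attributes = FALSE)))
```

Either form gives each column of the ranked matrix mean 0 and standard deviation 1 before the final transpose.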

Van1yu3/SWKM documentation built on Sept. 3, 2019, 7:50 a.m.