wcls/bcls.matrix | R Documentation |
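These functions compute two basic matrix cluster scatter measures: the within-cluster scatter matrix (wcls.matrix) and the between-cluster scatter matrix (bcls.matrix).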
wcls.matrix(data,clust,cluster.center)
bcls.matrix(cluster.center,cluster.size,mean)
data |
|
clust |
integer |
cluster.center |
|
cluster.size |
integer |
mean |
mean of all data objects. |
There are two base matrix scatter measures.
1. within-cluster scatter measure defined as:
W = sum(forall k in 1:cluster.num) W(k)
where W(k) = sum(forall x) (x - m(k))*(x - m(k))'
x | - object belonging to cluster k, |
m(k) | - center of cluster k. |
2. between-cluster scatter measure defined as:
B = sum(forall k in 1:cluster.num) |C(k)|*( m(k) - m )*( m(k) - m )'
|C(k)| | - size of cluster k, |
m(k) | - center of cluster k, |
m | - center of all data objects. |
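The two definitions above can be sketched in base R without any clv functions. This is only an illustration, not the package's implementation: the cluster labels come from stats::kmeans (an assumption made here for convenience), and the names x, cl, W, B and T.total are hypothetical. The final line checks the identity that the total scatter around the overall mean equals W + B.

```r
# Base-R sketch of the W and B definitions (no clv functions used).
x <- as.matrix(iris[, 1:4])

# Assumption: labels come from stats::kmeans, chosen only for illustration.
set.seed(1)
cl <- kmeans(x, centers = 3)$cluster

m <- colMeans(x)                           # m: center of all data objects
p <- ncol(x)
W <- matrix(0, p, p, dimnames = list(colnames(x), colnames(x)))
B <- W

for (k in sort(unique(cl))) {
  xk <- x[cl == k, , drop = FALSE]         # objects belonging to cluster k
  mk <- colMeans(xk)                       # m(k): center of cluster k
  dk <- sweep(xk, 2, mk)                   # rows are x - m(k)
  W  <- W + crossprod(dk)                  # adds W(k) = sum (x - m(k)) (x - m(k))'
  B  <- B + nrow(xk) * tcrossprod(mk - m)  # adds |C(k)| (m(k) - m) (m(k) - m)'
}

# Total scatter around the overall mean; it should equal W + B.
T.total <- crossprod(sweep(x, 2, m))
all.equal(W + B, T.total, check.attributes = FALSE)
```

Accumulating one outer-product sum per cluster mirrors the formulas term by term; crossprod(dk) is the sum over the rows of dk of (x - m(k))(x - m(k))'.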
wcls.matrix | returns W matrix (within-cluster scatter measure), |
bcls.matrix | returns B matrix (between-cluster scatter measure). |
Lukasz Nieweglowski
T. Hastie, R. Tibshirani, G. Walther, Estimating the number of data clusters via the Gap statistic, http://citeseer.ist.psu.edu/tibshirani00estimating.html
# load and prepare data
library(clv)
data(iris)
iris.data <- iris[,1:4]
# cluster data
pam.mod <- pam(iris.data,5) # create five clusters
v.pred <- as.integer(pam.mod$clustering) # get cluster ids associated to given data objects
# compute cluster sizes, center of each cluster
# and mean from data objects
cls.attr <- cls.attrib(iris.data, v.pred)
center <- cls.attr$cluster.center
size <- cls.attr$cluster.size
iris.mean <- cls.attr$mean
# compute matrix scatter measures
W.matrix <- wcls.matrix(iris.data, v.pred, center)
B.matrix <- bcls.matrix(center, size, iris.mean)
T.matrix <- W.matrix + B.matrix
# examples of indices based on the W, B and T matrices
mx.scatt.crit1 <- sum(diag(W.matrix))
mx.scatt.crit2 <- sum(diag(B.matrix))/sum(diag(W.matrix))
mx.scatt.crit3 <- det(W.matrix)/det(T.matrix)