clukm: Cluster analysis via K-means algorithm

View source: R/lmomRFA.r

clukmR Documentation

Cluster analysis via K-means algorithm

Description

Performs cluster analysis using the K-means algorithm.

Usage

clukm(x, assign, maxit = 10, algorithm = "Hartigan-Wong")

Arguments

x

A numeric matrix (or a data frame with all numeric columns, which will be coerced to a matrix). Contains the data: each row should contain the attributes for a single point.

assign

A vector whose distinct values indicate the initial clustering of the points.

maxit

Maximum number of iterations.

algorithm

Clustering algorithm. Permitted values are the same as for kmeans.

Value

An object of class kmeans. For details see the help for kmeans.

Note

clukm is a wrapper for the R function kmeans. The only difference is that in clukm the user supplies an initial assignment of sites to clusters (from which cluster centers are computed), whereas in kmeans the user supplies the initial cluster centers explicitly.

Author(s)

J. R. M. Hosking jrmhosking@gmail.com

References

Hosking, J. R. M., and Wallis, J. R. (1997). Regional frequency analysis: an approach based on L-moments. Cambridge University Press.

See Also

kmeans

Examples

## Clustering of gaging stations in Appalachia, as in Hosking
## and Wallis (1997, sec. 9.2.3)
data(Appalach)
# Form attributes for clustering (Hosking and Wallis's Table 9.4)
att <- cbind(a1 = log(Appalach$area),
             a2 = sqrt(Appalach$elev),
             a3 = Appalach$lat,
             a4 = Appalach$long)
att <- apply(att, 2, function(x) x/sd(x))
att[,1] <- att[,1] * 3
# Clustering by Ward's method
(cl <- cluagg(att))
# Details of the clustering with 7 clusters
(inf <- cluinf(cl, 7))
# Refine the 7 clusters by K-means
clkm <- clukm(att, inf$assign)
# Compare the original and K-means clusters
table(Kmeans=clkm$cluster, Ward=inf$assign)
# Some details about the K-means clusters: range of area, number
# of sites, weighted average L-CV and L-skewness
bb <- by(Appalach, clkm$cluster, function(x)
  c( min.area = min(x$area),
     max.area = max(x$area),
     n = nrow(x),
     ave.t = round(weighted.mean(x$t, x$n), 3),
     ave.t_3 = round(weighted.mean(x$t_3, x$n), 3)))
# Order the clusters in increasing order of minimum area
ord <- order(sapply(bb, "[", "min.area"))
# Make the result into a data frame.  Compare with Hosking
# and Wallis (1997), Table 9.5.
do.call(rbind, bb[ord])

lmomRFA documentation built on Aug. 29, 2023, 9:07 a.m.