Hierarchical Clustering
Description
Hierarchical cluster analysis.
Usage
Arguments
x 
A numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns), or an object of class "exprSet". 
method 
the distance measure to be used. This must be one of

diag 
logical value indicating whether the diagonal of the
distance matrix should be printed by 
upper 
logical value indicating whether the upper triangle of the
distance matrix should be printed by 
link 
the agglomeration method to be used. This should
be (an unambiguous abbreviation of) one of

members 

nbproc 
integer, number of subprocesses used for parallelization (Linux and Mac only) 
doubleprecision 
logical; if TRUE, the distance matrix computation uses double precision; if FALSE, it uses single precision 
Details
This function is a mix of the functions hclust and dist.
hcluster(x, method = "euclidean", link = "complete") is equivalent to
hclust(dist(x, method = "euclidean"), method = "complete").
It uses half as much memory, as it does not store the distance matrix.
For more details, see the documentation of hclust and Dist.
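The equivalence stated above can be checked with a short sketch. It assumes the amap package (which provides hcluster) may or may not be installed, so the call is guarded; the variable names hc_ref and hc_amap are illustrative only.

```r
# Reference result from base R: build the full distance matrix, then cluster it.
x <- as.matrix(USArrests)
hc_ref <- hclust(dist(x, method = "euclidean"), method = "complete")

# Assumption: amap is available. If it is not, the comparison is skipped.
if (requireNamespace("amap", quietly = TRUE)) {
  # Same clustering in one step, without storing the distance matrix.
  hc_amap <- amap::hcluster(x, method = "euclidean", link = "complete")
  # Compare merge heights up to numerical tolerance.
  print(all.equal(hc_ref$height, hc_amap$height))
}
```

Comparing heights rather than whole objects avoids spurious differences in the stored call or in how ties are ordered.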
Value
An object of class hclust which describes the tree produced by the clustering process. The object is a list with components:
merge 
an n-1 by 2 matrix.
Row i of 
height 
a set of n-1 non-decreasing real values.
The clustering height: that is, the value of
the criterion associated with the clustering

order 
a vector giving the permutation of the original
observations suitable for plotting, in the sense that a cluster
plot using this ordering and matrix 
labels 
labels for each of the objects being clustered. 
call 
the call which produced the result. 
method 
the cluster method that has been used. 
dist.method 
the distance that has been used to create 
There is a print and a plot method for hclust objects.
The plclust() function is basically the same as the plot method,
plot.hclust, primarily for backward compatibility with S-PLUS. Its
extra arguments are not yet implemented.
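As a rough illustration of the components listed in this section, the sketch below inspects an hclust object. It uses base R hclust on a distance matrix so that it runs without amap; the assumption is that an object returned by hcluster has the same structure, as the description above says.

```r
# Build an hclust object with base R (average linkage on Euclidean distances).
hc <- hclust(dist(USArrests), method = "average")
n <- nrow(USArrests)

dim(hc$merge)            # n - 1 rows and 2 columns, one row per merge step
length(hc$height)        # n - 1 merge heights
is.unsorted(hc$height)   # FALSE when the heights are non-decreasing
head(hc$labels[hc$order])  # first few observations in plotting order
hc$method                # agglomeration method that was used
hc$dist.method           # distance that was used
```

The order component is a permutation of 1:n, so indexing labels by it gives the left-to-right leaf order of the plotted tree.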
Note
Multithreading (parallelisation) is disabled on Windows.
Author(s)
The hcluster function is based on C code adapted from the CRAN
Fortran routine by Antoine Lucas, http://mulcyber.toulouse.inra.fr/projects/amap/.
References
Antoine Lucas and Sylvain Jasson, Using amap and ctc Packages for Huge Clustering, R News, 2006, vol 6, issue 5, pages 58-60.
See Also
Dist, hclust, kmeans.
Examples
data(USArrests)
hc <- hcluster(USArrests, link = "ave")
plot(hc)
plot(hc, hang = -1)
## Do the same with centroid clustering and squared Euclidean distance,
## cut the tree into ten clusters and reconstruct the upper part of the
## tree from the cluster centers.
hc <- hclust(dist(USArrests)^2, "cen")
memb <- cutree(hc, k = 10)
cent <- NULL
for (k in 1:10) {
  cent <- rbind(cent, colMeans(USArrests[memb == k, , drop = FALSE]))
}
hc1 <- hclust(dist(cent)^2, method = "cen", members = table(memb))
opar <- par(mfrow = c(1, 2))
plot(hc, labels = FALSE, hang = -1, main = "Original Tree")
plot(hc1, labels = FALSE, hang = -1, main = "Restart from 10 clusters")
par(opar)
## other combinations are possible
hc <- hcluster(USArrests, method = "euc", link = "ward", nbproc = 1,
               doubleprecision = TRUE)
hc <- hcluster(USArrests, method = "max", link = "single", nbproc = 2,
               doubleprecision = TRUE)
hc <- hcluster(USArrests, method = "man", link = "complete", nbproc = 1,
               doubleprecision = TRUE)
hc <- hcluster(USArrests, method = "can", link = "average", nbproc = 2,
               doubleprecision = TRUE)
hc <- hcluster(USArrests, method = "bin", link = "mcquitty", nbproc = 1,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "pea", link = "median", nbproc = 2,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "abspea", link = "median", nbproc = 2,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "cor", link = "centroid", nbproc = 1,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "abscor", link = "centroid", nbproc = 1,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "spe", link = "complete", nbproc = 2,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "ken", link = "complete", nbproc = 2,
               doubleprecision = FALSE)
