# xcluster: Hierarchical clustering

In ctc: Cluster and Tree Conversion.

## Description

Performs a hierarchical cluster analysis on a set of dissimilarities. This function launches an external program, Xcluster.

## Usage

```
xcluster(data, distance="euclidean", clean=FALSE, tmp.in="tmp.txt", tmp.out="tmp.gtr")
```

## Arguments

| Argument | Description |
|---|---|
| `data` | a matrix (or data frame) which provides the data to analyze |
| `distance` | The distance measure used with Xcluster. This must be one of `"euclidean"`, `"pearson"` or `"notcenteredpearson"`. Any unambiguous substring can be given. |
| `clean` | a logical value indicating whether you want the true distances (`clean=FALSE`) or a clean dendrogram (`clean=TRUE`) |
| `tmp.in, tmp.out` | temporary files for Xcluster |
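The phrase "any unambiguous substring" can be illustrated with base R's `pmatch`. This is only a sketch: the helper `resolve_distance` and the vector `METHODS` are hypothetical names, and whether `xcluster` resolves its `distance` argument this way internally is an assumption.

```r
# The three distance names documented for xcluster
METHODS <- c("euclidean", "pearson", "notcenteredpearson")

# Hypothetical helper: resolve a (possibly abbreviated) name with pmatch,
# which returns the index of the unique partial match, or NA otherwise
resolve_distance <- function(distance) {
  idx <- pmatch(distance, METHODS)
  if (is.na(idx)) stop("invalid distance: ", distance)
  METHODS[idx]
}

print(resolve_distance("eu"))    # "euclidean"
print(resolve_distance("pear"))  # "pearson"
print(resolve_distance("not"))   # "notcenteredpearson"
```

A name that matches none of the three, such as `"manhattan"`, makes the helper stop with an error.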

## Details

Available distance measures are (written for two vectors x and y):

• Euclidean: Usual square distance between the two vectors (2 norm).

• Pearson: 1 - cor(x,y)

• Pearson not centered: 1 - [ sum x_i y_i ] / sqrt[ sum x_i^2 * sum y_i^2 ]
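The three formulas above can be written out in base R. This sketch follows the formulas as stated here, reading "Euclidean" as the 2-norm of the difference. It is not taken from the Xcluster source, and the vectors `x` and `y` are made-up illustrative values.

```r
# Two example vectors (illustrative values only)
x <- c(1, 2, 3, 4)
y <- c(2, 1, 4, 3)

# Euclidean: 2-norm of the difference
d_euclid <- sqrt(sum((x - y)^2))

# Pearson: 1 - cor(x, y)
d_pearson <- 1 - cor(x, y)

# Pearson not centered: 1 - sum(x_i y_i) / sqrt(sum(x_i^2) * sum(y_i^2))
d_uncentered <- 1 - sum(x * y) / sqrt(sum(x^2) * sum(y^2))

print(c(euclidean = d_euclid,
        pearson = d_pearson,
        notcenteredpearson = d_uncentered))
```

For these vectors the Euclidean distance is 2, the Pearson distance is 0.4, and the non-centered Pearson distance is 1 - 28/30.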

Xcluster does not use the usual agglomerative methods (single, average, complete); instead, it takes the distance between two groups to be the distance between their barycenters.

This causes a problem for data such as:

| Point | x | y |
|---|---|---|
| A | 0 | 0 |
| B | 0 | 1 |
| C | 0.9 | 0.5 |

That is, for a triangle in R^2, the distance between A and B (1) is larger than the distance between the group {A, B} and C (0.9, measured from the barycenter of A and B, with Euclidean distance).

In that case it can be useful to set `clean=TRUE`, which means that A and B must not be considered a group without C.
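The triangle example can be checked numerically in base R without Xcluster. This is a sketch of the barycenter rule as described above, not of Xcluster's actual code; the variable names are illustrative.

```r
# The three points from the example
pts <- rbind(A = c(0, 0), B = c(0, 1), C = c(0.9, 0.5))

# Pairwise Euclidean distances: A and B are the closest pair, so they merge first
d <- as.matrix(dist(pts))
d_AB <- unname(d["A", "B"])

# Barycenter of the group {A, B}, then its distance to C
bary_AB <- colMeans(pts[c("A", "B"), ])
d_AB_C <- sqrt(sum((bary_AB - pts["C", ])^2))

# TRUE means the second merge happens at a lower height than the first
inversion <- d_AB_C < d_AB

print(c(d_AB = d_AB, d_AB_C = d_AB_C))
print(inversion)
```

The first merge (A with B) happens at height 1, but the second merge ({A, B} with C) happens at height 0.9, so the merge heights are not non-decreasing.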

## Value

An object of class hclust which describes the tree produced by the clustering process. The object is a list with components:

| Component | Description |
|---|---|
| `merge` | an n-1 by 2 matrix. Row i of `merge` describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation -j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in `merge` indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons. |
| `height` | a set of n-1 non-decreasing real values. The clustering height: that is, the value of the criterion associated with the clustering `method` for the particular agglomeration. |
| `order` | a vector giving the permutation of the original observations suitable for plotting, in the sense that a cluster plot using this ordering and matrix `merge` will not have crossings of the branches. |
| `labels` | labels for each of the objects being clustered. |
| `call` | the call which produced the result. |
| `method` | the cluster method that has been used. |
| `dist.method` | the distance that has been used to create `d` (only returned if the distance object has a `"method"` attribute). |
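To see what these components look like, the sketch below builds an `hclust` object with `stats::hclust`. This is only a stand-in that needs no external program: it is assumed, not verified here, that the object returned by `xcluster` can be inspected the same way. The data are random illustrative values.

```r
# Stand-in: stats::hclust returns an object of the same class
set.seed(1)
m <- matrix(rnorm(5 * 2), ncol = 2)   # 5 observations, 2 variables
h <- hclust(dist(m), method = "average")

n <- nrow(m)
print(class(h))      # "hclust"
print(dim(h$merge))  # n-1 rows, 2 columns
print(h$merge)       # negative = single observation, positive = earlier merge step
print(h$height)      # one height per merge step
print(h$order)       # leaf order that plots without crossing branches
```

The first row of `merge` always contains two negative entries, because the first step can only join two single observations.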

## Note

Xcluster is a C program made by Gavin Sherlock that performs hierarchical clustering, K-means and SOM.

## Author(s)

Antoine Lucas, http://mulcyber.toulouse.inra.fr/projects/amap/

## References

Antoine Lucas and Sylvain Jasson, Using amap and ctc Packages for Huge Clustering, R News, 2006, vol. 6, issue 5, pages 58-60.

## See Also

`r2xcluster`, `xcluster2r`, `hclust`, `hcluster`

## Examples

```
# Create data
set.seed(1)
m <- matrix(rep(1, 3*24), ncol=3)
m[9:16, 3] <- 3 ; m[17:24, ] <- 3   # create 3 groups
m <- m + rnorm(24*3, 0, 0.5)        # add noise
m <- floor(10*m)/10                 # keep just one decimal digit

# And once you have the Xcluster program:
#
#h <- xcluster(m)
#
#plot(h)
```