xcluster: Hierarchical clustering



Description

Performs a hierarchical cluster analysis on a set of dissimilarities. This function launches an external program, Xcluster.

Usage

xcluster(data,distance="euclidean",clean=FALSE,tmp.in="tmp.txt",tmp.out="tmp.gtr")

Arguments

data

a matrix (or data frame) which provides the data to analyze

distance

The distance measure used with Xcluster. This must be one of "euclidean", "pearson" or "notcenteredpearson". Any unambiguous substring can be given.

clean

a logical value indicating whether you want the true distances (clean=FALSE) or a clean dendrogram (clean=TRUE)

tmp.in, tmp.out

temporary files for Xcluster

Details

The available distance measures, each written for two vectors x and y, are "euclidean", "pearson" and "notcenteredpearson".
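
As a sketch only, the three measures named under distance are commonly defined as shown below. Whether Xcluster computes them in exactly this form is an assumption, and the function names are hypothetical helpers that are not part of ctc or Xcluster.

```r
# Conventional definitions (assumed; not taken from the Xcluster source).
# All helper names below are illustrative only.

# Euclidean distance: the 2-norm of the difference
euclidean_dist <- function(x, y) sqrt(sum((x - y)^2))

# Pearson distance: one minus the (centered) correlation
pearson_dist <- function(x, y) 1 - cor(x, y)

# Not-centered Pearson distance: one minus the cosine of the angle
# between x and y (no mean subtraction)
notcentered_pearson_dist <- function(x, y) {
  1 - sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

x <- c(1, 2, 3)
y <- c(2, 4, 6.5)
print(c(euclidean   = euclidean_dist(x, y),
        pearson     = pearson_dist(x, y),
        notcentered = notcentered_pearson_dist(x, y)))
```

Both Pearson variants are close to 0 for nearly proportional vectors, whereas the Euclidean distance still reflects the difference in scale.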

Xcluster does not use the usual agglomerative methods (single, average, complete); instead, it takes the distance between two groups to be the distance between their barycenters.

This causes a problem for data such as:

A 0 0
B 0 1
C 0.9 0.5

That is, for a triangle in R^2, the distance between A and B (1) is larger than the distance between the group {A, B} and C (0.9, with Euclidean distance), so the second merge occurs at a lower height than the first.

In that case it can be useful to set clean=TRUE, which means that A and B must not be considered as a group without C.
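
The inversion can be checked numerically. The sketch below uses only base R and does not call Xcluster; the barycenter rule is applied by hand, and the variable names are illustrative.

```r
# The three points A, B and C from the Details section
pts <- rbind(A = c(0, 0), B = c(0, 1), C = c(0.9, 0.5))

# Pairwise Euclidean distances: A-B is the closest pair, so it merges first
d <- as.matrix(dist(pts))
d_ab <- d["A", "B"]

# Barycenter (mean point) of the group {A, B}
bary_ab <- colMeans(pts[c("A", "B"), ])

# Distance from that barycenter to C, i.e. the height of the second merge
d_ab_c <- sqrt(sum((bary_ab - pts["C", ])^2))

print(c(first_merge = d_ab, second_merge = d_ab_c))
```

The second value is smaller than the first, which is the situation that clean=TRUE is meant to handle.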

Value

An object of class hclust which describes the tree produced by the clustering process. The object is a list with components:

merge

an n-1 by 2 matrix. Row i of merge describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation -j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in merge indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons.

height

a set of n-1 non-decreasing real values. The clustering height: that is, the value of the criterion associated with the clustering method for the particular agglomeration.

order

a vector giving the permutation of the original observations suitable for plotting, in the sense that a cluster plot using this ordering and matrix merge will not have crossings of the branches.

labels

labels for each of the objects being clustered.

call

the call which produced the result.

method

the cluster method that has been used.

dist.method

the distance that has been used to create d (only returned if the distance object has a "method" attribute).
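
The components can be inspected on any object of class hclust. Because xcluster needs the external Xcluster program, the sketch below uses base R's hclust as a stand-in; the toy data and the choice of method = "average" are assumptions made for illustration only.

```r
# Four labelled points forming two obvious pairs (toy data, illustrative)
x <- rbind(a = c(0, 0), b = c(0, 1), c = c(5, 5), d = c(5, 6.5))

# Stand-in for xcluster(): base R hierarchical clustering
h <- hclust(dist(x), method = "average")

print(h$merge)   # (n-1) x 2 matrix; negative entries are single observations
print(h$height)  # one height per merge step
print(h$order)   # plotting order that avoids crossing branches
print(h$labels)  # row names of the input
print(h$method)  # clustering method recorded in the object
```

In the merge matrix, the first row joins observations 1 and 2 (shown as -1 and -2), and the last row joins the two clusters formed at steps 1 and 2.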

Note

Xcluster is a C program made by Gavin Sherlock that performs hierarchical clustering, K-means and SOM.

Xcluster is copyrighted. To obtain Xcluster or find more information about it, see http://genome-www.stanford.edu/~sherlock/cluster.html

Author(s)

Antoine Lucas, http://mulcyber.toulouse.inra.fr/projects/amap/

References

Antoine Lucas and Sylvain Jasson, Using amap and ctc Packages for Huge Clustering, R News, 2006, vol. 6, issue 5, pages 58-60.

See Also

r2xcluster, xcluster2r, hclust, hcluster

Examples

# Create data: 24 observations on 3 variables
set.seed(1)
m <- matrix(rep(1, 3 * 24), ncol = 3)
m[9:16, 3] <- 3; m[17:24, ] <- 3     # create 3 groups
m <- m + rnorm(24 * 3, 0, 0.5)       # add noise
m <- floor(10 * m) / 10              # keep one decimal digit

# Once the Xcluster program is installed:
#
# h <- xcluster(m)
#
# plot(h)
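
When Xcluster is not installed, a rough stand-in for trying out the example data is base R's hclust with method = "centroid", which also merges groups by the distance between their centers. This is a sketch under that assumption and is not claimed to reproduce Xcluster's output; h_base is an illustrative name.

```r
# Rebuild the example data so that this block is self-contained
set.seed(1)
m <- matrix(rep(1, 3 * 24), ncol = 3)
m[9:16, 3] <- 3; m[17:24, ] <- 3
m <- m + rnorm(24 * 3, 0, 0.5)
m <- floor(10 * m) / 10

# Centroid linkage in hclust() is meant to be used with squared
# Euclidean distances
h_base <- hclust(dist(m)^2, method = "centroid")

# TRUE here would mean that some merge is lower than an earlier one
print(is.unsorted(h_base$height))

# plot(h_base)   # uncomment to draw the dendrogram
```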

ctc documentation built on Nov. 1, 2018, 3:45 a.m.