treeClust: Build a tree-based dissimilarity for clustering, and...
In treeClust: Cluster Distances Through Trees


This function uses a set of classification or regression trees to build an inter-point dissimilarity in which two points are similar when they tend to fall in the same leaves of trees. The user can pass in a clustering algorithm and/or ask for the dissimilarities or the set of trees.

treeClust(dfx, d.num = 1, col.range = 1:ncol(dfx), verbose = F, final.algorithm, k,
          control = treeClust.control(), rcontrol = rpart.control(), ...)

`dfx`	Input data frame. Columns may be numeric or categorical. Missing values are permitted.
`d.num`	Integer: Dissimilarity specifier. With d.num = 1, the dissimilarity between two observations is the proportion of trees in which they fall in different leaves. With d.num = 2, those counts are weighted according to tree quality. With d.num = 3, the dissimilarity contributed by a tree depends on which pair of leaves the two observations fall in, reflecting the belief that some pairs of leaves are closer together than others. With d.num = 4, those leaf-pair dissimilarities are also weighted by tree quality.
`col.range`	Integer: the indices of the columns used. Defaults to all.
`verbose`	If TRUE (or non-zero), print debugging messages to the screen.
`final.algorithm`	Final algorithm, to be used to cluster the computed distances. This may be "pam", "agnes", "clara" or "kmeans".
`k`	Number of clusters. Required if final.algorithm is supplied.
`control`	List of the sort produced by `treeClust.control`, giving specifications for the fitting routine.
`rcontrol`	List of the sort produced by `rpart.control`, giving arguments for the rpart routine.
`...`	Other arguments, to be passed to the final clustering algorithm if specified.
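The d.num = 1 rule described above can be illustrated with a small sketch in base R. The leaf-membership matrix `leaves` and the helper `leaf_disagreement` are hypothetical names made up for this illustration; they are not part of the treeClust package, and the package's own computation may be organized differently.

```r
# Hypothetical leaf-membership matrix: one row per observation,
# one column per retained tree; entries are leaf labels within each tree.
leaves <- matrix(c(1, 1, 2,
                   1, 2, 2,
                   3, 2, 1,
                   1, 1, 2),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("obs", 1:4), paste0("tree", 1:3)))

# Illustrative helper (not a treeClust function): for every pair of
# observations, the fraction of trees in which they land in different leaves.
leaf_disagreement <- function(leaves) {
  n <- nrow(leaves)
  d <- matrix(0, n, n, dimnames = list(rownames(leaves), rownames(leaves)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- mean(leaves[i, ] != leaves[j, ])
    }
  }
  as.dist(d)
}

d1 <- leaf_disagreement(leaves)
print(d1)
```

In this toy matrix, obs1 and obs4 share a leaf in every tree, so their dissimilarity is 0, while obs1 and obs3 never share a leaf, so theirs is 1.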

The treeClust approach builds a set of classification or regression trees, one for each variable, with that variable as the response and the remaining variables as predictors. Trees are pruned, and those that are pruned to the root are discarded. For each remaining tree, an observation's leaf membership serves as the starting point for a dissimilarity measurement.
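The following is a minimal sketch of that idea using rpart directly, not the package's internal code. It assumes pruning at the complexity value with the smallest cross-validated error; treeClust's own pruning rule is governed by its control settings and may differ. The names `fit_one`, `trees`, `kept`, and `leaf_matrix` are made up for this illustration.

```r
library(rpart)

set.seed(1)          # rpart's internal cross-validation is random
dat <- iris[, -5]    # four numeric columns, no missing values

# Illustrative helper: fit a tree with one column as the response and all
# other columns as predictors, then prune it.
fit_one <- function(response, data) {
  tree <- rpart(reformulate(".", response = response), data = data)
  if (nrow(tree$frame) == 1) return(tree)   # no split was made at all
  # Assumed pruning rule: smallest cross-validated error in the CP table.
  best <- which.min(tree$cptable[, "xerror"])
  prune(tree, cp = tree$cptable[best, "CP"])
}

trees <- lapply(setNames(names(dat), names(dat)), fit_one, data = dat)

# Discard trees that were pruned back to the root (a single node).
kept <- Filter(function(tr) nrow(tr$frame) > 1, trees)

# Leaf membership: tr$where gives, for each observation, the row of
# tr$frame that holds its leaf. One column per retained tree.
leaf_matrix <- sapply(kept, function(tr) tr$where)
dim(leaf_matrix)
```

Each column of `leaf_matrix` records which leaf every observation reaches in one retained tree, which is the raw material for the dissimilarities selected by d.num.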

The return value depends on control$cluster.only. If it is TRUE, the function returns a vector of cluster assignments, as produced by the final algorithm. Otherwise, it returns a list with these items:

`call`	The call that produced the object
`d.num`	d.num, as supplied
`tbl`	Two-column matrix with one row for each tree retained, giving size and deviance ratio
`extended.tbl`	Two-column matrix like tbl, but with one row for every variable, giving size and deviance ratio (these will be 1 and 0 for variables whose trees were discarded)
`final.algorithm`	final.algorithm, as supplied
`final.clust`	If final.algorithm is supplied, the output from the final clustering algorithm; otherwise, NULL
`additional.args`	Any additional arguments specified
`tree`	If control$return.trees is TRUE, a list holding all the retained trees. This can make the resulting object very large.
`dists`	If control$return.dists is TRUE, an object of class dist with the set of pairwise inter-point dissimilarities
`mat`	If control$return.mat is TRUE, a data frame. If final.algorithm is "pam" or "agnes" this contains leaf assignment indices. Otherwise this holds a dataset useful as input to k-means or clara. Experimental.
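As a rough sketch of how the returned list might be inspected, the code below requests the dissimilarities without running a final clustering algorithm. It relies only on what this page states: that treeClust.control() returns a list and that the list carries a return.dists element. Setting that element by assignment, rather than through an argument of treeClust.control(), is an assumption made here to avoid guessing that function's signature. The code is skipped if the treeClust package is not installed.

```r
if (requireNamespace("treeClust", quietly = TRUE)) {
  library(treeClust)

  set.seed(1)
  ctl <- treeClust.control()   # default control list
  ctl$return.dists <- TRUE     # assumption: ask for the dist object this way

  # No final.algorithm is supplied, so no clustering is run.
  fit <- treeClust(iris[, -5], d.num = 1, control = ctl)

  print(names(fit))            # components of the returned list
  print(fit$tbl)               # size and deviance ratio per retained tree
  print(class(fit$dists))      # pairwise dissimilarities
  print(is.null(fit$final.clust))
}
```

The `dists` component can then be passed to any routine that accepts a dist object, such as `hclust` or `cluster::pam`.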

Sam Buttrey, buttrey@nps.edu

Buttrey and Whitaker, "treeClust: An R Package for Tree-Based Clustering Dissimilarities," The R Journal, 7/2, 2015.

See also: treeClust.control

iris.km6 <- treeClust(iris[, -5], d.num = 2, final.algorithm = "kmeans", k = 6)
table(iris.km6$final.clust$cluster, iris$Species)

Loading required package: rpart
Loading required package: cluster
   
    setosa versicolor virginica
  1      0          1        12
  2     50          0         0
  3      0          5         9
  4      0          0        10
  5      0         44         0
  6      0          0        19

treeClust documentation built on May 1, 2019, 7:59 p.m.
