treeClust.control: Parameters describing the output from a treeClust fit



This function produces a list that is used as input to treeClust to determine which items are preserved in the output.


treeClust.control(return.trees = FALSE, return.mat = TRUE, 
 return.dists = FALSE, return.newdata = FALSE, cluster.only = FALSE, 
 serule = 0, DevRatThreshold = 1, parallelnodes = 1, ...)



return.trees: If TRUE, all the trees that go into the object are returned. This can make the treeClust object very large. Default FALSE.


return.mat: If TRUE, return a matrix describing leaf membership. Default TRUE.


return.dists: If TRUE, return an object of class 'dissimilarity' giving all pairwise distances between observations. This can be very large for large datasets. Default FALSE.


cluster.only: If TRUE, return only the clustering vector, which names the cluster into which each observation is placed. Default FALSE.


return.newdata: If TRUE, return a numeric matrix describing leaf membership and/or inter-point distance (see "Details"). Default FALSE.


serule: Describes how to prune the rpart trees. By default (serule = 0), each tree is pruned to the minimum error size. With serule > 0, each tree is pruned to the smallest size for which the cross-validated error is less than (min error) + (serule * sds).


DevRatThreshold: Trees whose deviance ratio is greater than this number are presumed to have arisen from redundant variables. The predictor at the tree's root is dropped, a new tree is built, and the new deviance ratio is computed. This process is repeated until the resulting tree has a deviance ratio less than or equal to the threshold. Default: 1 (do not drop any such trees).


parallelnodes: Describes whether to use parallel processing by creating a "computing cluster" containing "parallelnodes" nodes. If parallelnodes = 1, no cluster is created. Here "cluster" refers to a set of nodes operating in parallel, not to the clustering of the data.


...: Other arguments, passed on to the output.
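
The pruning rule described for serule can be illustrated with a small base-R sketch. This is not the package's code: the cross-validation table and the helper pick_size() are hypothetical, and the sketch assumes that "sds" refers to the standard deviation of the cross-validated error at the minimum-error tree, as in the usual rpart 1-SE convention.

```r
# Hypothetical cross-validation results for one tree, ordered from the
# smallest tree to the largest (similar in shape to an rpart cptable).
cv <- data.frame(
  nsplit = c(0, 1, 2, 4, 7),
  xerror = c(1.00, 0.62, 0.50, 0.47, 0.49),
  xstd   = c(0.05, 0.04, 0.04, 0.04, 0.04)
)

# Hypothetical helper: return the row of `cv` chosen by the rule
#   xerror < min(xerror) + serule * (sd at the minimum-error tree).
pick_size <- function(cv, serule = 0) {
  best <- which.min(cv$xerror)
  if (serule == 0) return(best)          # minimum-error tree
  cutoff <- cv$xerror[best] + serule * cv$xstd[best]
  which(cv$xerror < cutoff)[1]           # smallest tree under the cutoff
}

pick_size(cv, serule = 0)  # 4: the row with the smallest xerror
pick_size(cv, serule = 1)  # 3: a smaller tree within one sd of the minimum
```

Larger values of serule widen the cutoff, so the selected tree can only stay the same size or get smaller.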


The "newdata" item is a numeric matrix that encodes inter-point distances, in a form that depends on the "d.num" argument to treeClust(). When d.num = 1, each tree contributes a set of 0-1 dummy variables that serve as leaf membership indicators; with d.num = 2, each tree's indicators are multiplied by that tree's "strength." With d.num = 3, a tree with k leaves contributes k-choose-2 columns, with the distances between distinct rows matching the d3 distances; likewise, with d.num = 4, a tree with k leaves contributes k-choose-2 columns that have been weighted by tree strength.
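
As a rough illustration of the d.num = 1 and d.num = 2 cases, the base-R sketch below builds leaf-membership indicators by hand. It does not call treeClust: the leaf assignments, the strength values, and the helper indicators() are all made up for the example, and the exact column layout that treeClust returns may differ.

```r
# Hypothetical leaf assignments: 5 observations, 2 trees.
leaves <- data.frame(
  tree1 = factor(c("a", "a", "b", "c", "b")),
  tree2 = factor(c("x", "y", "y", "x", "x"))
)

# Hypothetical helper: one 0-1 indicator column per leaf of a tree.
indicators <- function(f) {
  m <- outer(as.character(f), levels(f), "==") * 1
  colnames(m) <- levels(f)
  m
}

# d.num = 1 style: bind the indicator blocks of all trees side by side.
newdata <- do.call(cbind, lapply(leaves, indicators))

# d.num = 2 style: scale each tree's block by a (made-up) strength.
strength <- c(tree1 = 0.8, tree2 = 0.3)
weighted <- do.call(cbind, Map(function(f, s) indicators(f) * s,
                               leaves, strength))

# Two observations in different leaves of a tree differ in exactly two
# of that tree's columns, so half the Manhattan distance between rows of
# `newdata` counts the trees in which the pair is separated.
d <- as.matrix(dist(newdata, method = "manhattan")) / 2
d[1, 2]  # 1: observations 1 and 2 differ only in tree2
d[1, 3]  # 2: observations 1 and 3 differ in both trees
```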


A list containing all the input arguments with their supplied or default values.
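
A minimal usage sketch follows. It assumes the treeClust package is installed and that the returned list carries elements named after the arguments, as the description above suggests; the element names checked here are that assumption rather than something verified against the package source.

```r
# Runs only if the treeClust package is available.
if (requireNamespace("treeClust", quietly = TRUE)) {
  # Ask for pairwise dissimilarities and prune with serule = 1;
  # every other setting keeps its default.
  ctl <- treeClust::treeClust.control(return.dists = TRUE, serule = 1)

  # Inspect the resulting list of settings.
  str(ctl)
}
```

The list would then be handed to treeClust() so that the fit keeps only the requested items; the name of the treeClust() argument that receives it is not stated on this page.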


Sam Buttrey

See Also


treeClust documentation built on May 1, 2019, 7:59 p.m.