treeClust.control: Parameters describing the output from a treeClust fit

Description Usage Arguments Details Value Author(s) See Also

Description

This function produces a list that is used as input to treeClust to determine which items are preserved in the output.

Usage

1
2
3
treeClust.control(return.trees = FALSE, return.mat = TRUE, 
 return.dists = FALSE, return.newdata = FALSE, cluster.only = FALSE, 
 serule = 0, DevRatThreshold = 1, parallelnodes = 1, ...)

Arguments

return.trees

If TRUE, all the trees that go into the object are returned. This can make the treeClust object very large. Default FALSE.

return.mat

If TRUE, return a matrix describing leaf membership. Default TRUE.

return.dists

If TRUE, return an object of class 'dissimilarity' giving all pairwise distances between observations. This can be very large for large datasets. Default FALSE.

cluster.only

If TRUE, return only the clustering vector, which names the cluster into which each observation is places. Default FALSE.

return.newdata

If TRUE, return a numeric matrix describing leaf membership and/or inter-point distance (see "Details"). Default FALSE.

serule

Describes how to prune the rpart trees. By default, each tree is pruned to the minimum error size. With serule > 0, each tree is pruned to the smallest size for which the cross-validated error is less than (min error) + (serule * sds).

DevRatThreshold

Trees whose deviance ratio is greater than this number are presumed to have arisen from redundant variables. The predictor at the tree's root is dropped, a new tree built, and the new deviance ratio computed. this process is repeated until the resulting tree has deviance ratio less than or equal to the threshold. Default: 1 (do not drop any such trees).

parallelnodes

Describes whether to use parallel processing by creating a "computing cluster" containing "parallelnodes" nodes. If that number is = 1 no cluster is created. Here "cluster" is referring to a set of nodes operating in parallel, not to the clustering of the data.

...

Other arguments, passed onto the output.

Details

The "newdata" item is a numeric matrix that gives inter-point distances whose form depends on the "d.num" argument to treeClust(). When d.num = 1, each tree contributes a set of 0-1 dummy variables that serve as leaf membership indicators, and with d.num = 2, each tree's indicators are multiplied by that tree's "strength." With d.num = 3, a tree with k leaves contributes k-choose-2 columns, with the distances between distinct rows matching the d3 distances, and likewise with d.num = 4, a tree with k leaves produced k-choose-2 columns that have been weighted by tree strength.

Value

list, with all the input arguments and their supplied or default values.

Author(s)

Sam Buttrey, buttrey@nps.edu

See Also

treeClust


treeClust documentation built on May 1, 2019, 7:59 p.m.