A class for representing binary trees.

Objects can be created by calls of the form `new("BinaryTree", ...)`.
The most important slot is `tree`, a (recursive) list with elements

- nodeID
an integer giving the number of the node, starting with `1` in the root node.

- weights
the case weights (of the learning sample) corresponding to this node.

- criterion
a list with test statistics and p-values for each partial hypothesis.

- terminal
a logical specifying if this is a terminal node.

- psplit
primary split: a list with elements `variableID` (the number of the input variable that is split), `ordered` (a logical indicating whether the input variable is ordered), `splitpoint` (the cutpoint or set of levels to the left), and `splitstatistics`, which stores the process of standardized two-sample statistics the split point estimation is based on. The logical `toleft` determines if observations go left or right down the tree. For nominal splits, the slot `table` is a vector whose entries are greater than zero if the corresponding level is available in the corresponding node.

- ssplits
a list of surrogate splits, each with the same elements as `psplit`.

- prediction
the prediction of the node: the mean for numeric responses and the conditional class probabilities for nominal or ordered responses. For censored responses, this is the mean of the logrank scores and useless as such.

- left
a list representing the left daughter node.

- right
a list representing the right daughter node.

Please note that this data structure may be subject to change in future releases of the package.
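
Because each node stores its daughters in `left` and `right`, the list can be walked recursively. The following is a minimal sketch in base R, not code from the package: `toy_tree`, `leaf()` and `terminal_ids()` are hypothetical names, and the toy list only mimics the `nodeID`, `terminal`, `left` and `right` elements described above.

```r
# Hypothetical toy tree that only mimics the element names described above
# (nodeID, terminal, left, right); it is not produced by the package.
leaf <- function(id) {
  list(nodeID = id, terminal = TRUE, left = NULL, right = NULL)
}

toy_tree <- list(
  nodeID = 1L, terminal = FALSE,
  left = leaf(2L),
  right = list(nodeID = 3L, terminal = FALSE,
               left = leaf(4L), right = leaf(5L))
)

# Recursively collect the numbers of all terminal nodes, left to right.
terminal_ids <- function(node) {
  if (isTRUE(node$terminal)) {
    return(node$nodeID)
  }
  c(terminal_ids(node$left), terminal_ids(node$right))
}

ids <- terminal_ids(toy_tree)
print(ids)
```

A real `tree` slot carries further elements per node (`weights`, `criterion`, `psplit`, and so on), which a traversal like this simply ignores.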

`data`: an object of class `"ModelEnv"`.

`responses`: an object of class `"VariableFrame"` storing the values of the response variable(s).

`cond_distr_response`: a function computing the conditional distribution of the response.

`predict_response`: a function for computing predictions.

`tree`: a recursive list representing the tree. See above.

`where`: an integer vector of length n (number of observations in the learning sample) giving the number of the terminal node the corresponding observation belongs to.

`prediction_weights`: a function for extracting weights from terminal nodes.

`get_where`: a function for determining the number of the terminal node each observation falls into.

`update`: a function for updating weights.

Extends class `"BinaryTreePartition"`, directly.

`response(object, ...)`: extract the response variables the tree was fitted to.

`treeresponse(object, newdata = NULL, ...)`: compute statistics for the conditional distribution of the response as modelled by the tree. For regression problems, this is just the mean. For nominal or ordered responses, estimated conditional class probabilities are returned. Kaplan-Meier curves are computed for censored responses. Note that a list with one element for each observation is returned.

`Predict(object, newdata = NULL, ...)`: compute predictions.

`weights(object, newdata = NULL, ...)`: extract the weight vector of the terminal node that each element of the learning sample belongs to (`newdata = NULL`) or that each new observation falls into.

`where(object, newdata = NULL, ...)`: extract the number of the terminal node that each element of the learning sample belongs to (`newdata = NULL`) or that each new observation falls into.

`nodes(object, where, ...)`: extract the nodes with the given numbers (`where`).

`plot(x, ...)`: a plot method for `BinaryTree` objects, see `plot.BinaryTree`.

`print(x, ...)`: a print method for `BinaryTree` objects.

```
### ctree() and the accessors used below are provided by the party package
library("party")

set.seed(290875)
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))

### distribution of responses in the terminal nodes
plot(airq$Ozone ~ as.factor(where(airct)))

### get all terminal nodes from the tree
nodes(airct, unique(where(airct)))

### extract weights and compute predictions
pmean <- sapply(weights(airct), function(w) weighted.mean(airq$Ozone, w))

### the same as
drop(Predict(airct))

### or
unlist(treeresponse(airct))

### don't use the mean but the median as prediction in each terminal node
pmedian <- sapply(weights(airct), function(w)
    median(airq$Ozone[rep(1:nrow(airq), w)]))

plot(airq$Ozone, pmean, col = "red")
points(airq$Ozone, pmedian, col = "blue")
```
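
The `rep(1:nrow(airq), w)` idiom in the example turns a vector of case weights into repeated row indices, so that an unweighted statistic such as `median` respects the weights. The following is a minimal base R sketch of that idea on made-up numbers: `y` and `w` are hypothetical and do not come from the `airquality` data, and the trick assumes non-negative integer weights.

```r
# Hypothetical response values and integer case weights for one terminal node.
y <- c(10, 20, 30, 40)
w <- c(1, 0, 2, 1)

# Weighted mean: sum(w * y) / sum(w).
wmean <- weighted.mean(y, w)

# Weighted median: repeat each observation w[i] times, then take the
# ordinary median of the expanded sample (10, 30, 30, 40).
expanded <- y[rep(seq_along(y), w)]
wmedian <- median(expanded)

print(c(mean = wmean, median = wmedian))
```

Observations with weight zero drop out of the expanded sample entirely, which is how a weight vector can restrict a statistic to one terminal node.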
