rpart_utils | R Documentation |
rpart utilities
Utilities for the rpart package:
rpart_parent returns all parent nodes of node, i.e., the path from node 1 to node.
rpart_subset and rpart_subset2 (in examples) return a subset of the data used in rpart for any intermediate or terminal node.
rpart_nodes returns the terminal node label for each observation in the original data frame used for tree.
rpart_parent(node = 1L)
rpart_subset(tree, node = 1L)
rpart_nodes(tree, node_labels = FALSE, droplevels = TRUE)
node | an integer representing the node number
tree | an object returned from
node_labels | a vector of labels having the same length as the number of terminal nodes or total nodes
droplevels | logical; if
rpart_parent returns a vector representing the path from the root to node.
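The path described above follows from rpart's node numbering, in which the children of node n are numbered 2n and 2n + 1, so the parent of any node is its number integer-divided by 2. The following is a minimal sketch of that arithmetic; node_path is a hypothetical helper written for illustration and is not necessarily how rpart_parent is implemented.

```r
## Hypothetical helper (an assumption, not the package's rpart_parent):
## walk from `node` back to the root by repeated integer division by 2
node_path <- function(node) {
  node <- as.integer(node)
  stopifnot(length(node) == 1L, node >= 1L)
  path <- node
  while (node > 1L) {
    node <- node %/% 2L      # parent of node n is floor(n / 2)
    path <- c(node, path)    # prepend so the path reads root -> node
  }
  path
}

node_path(29L)   # 1 3 7 14 29
node_path(116L)  # 1 3 7 14 29 58 116
```

Because siblings 2n and 2n + 1 share the same parent, their paths differ only in the last element.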
rpart_subset returns the data frame of observations in node.
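One way to recover the observations in a node, sketched here under stated assumptions, is to look up each observation's leaf through the where component of the fitted rpart object and keep the rows whose leaf descends from the requested node. The helpers is_on_path and node_rows are hypothetical names used only for this illustration; the package's rpart_subset may work differently.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, minsplit = 5)

## Hypothetical helper: TRUE if `node` lies on the path from the root to `leaf`
## (children of node n are numbered 2n and 2n + 1)
is_on_path <- function(leaf, node) {
  while (leaf > node) leaf <- leaf %/% 2L
  leaf == node
}

## fit$where holds, for each observation, the row of fit$frame for its leaf;
## the row names of fit$frame are the node numbers
leaf_id <- as.integer(rownames(fit$frame))[fit$where]

## Hypothetical helper: rows of the training data that pass through `node`
node_rows <- function(node) {
  keep <- vapply(leaf_id, is_on_path, logical(1), node = node)
  kyphosis[keep, , drop = FALSE]
}

## node 1 is the root, so it should contain every observation
nrow(node_rows(1L)) == nrow(kyphosis)
```

This sketch indexes kyphosis directly, so it assumes the data frame has not been modified since the fit and that no rows were dropped for missing values.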
For any tree, the possibilities
rpart_nodes returns a factor variable.
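As a rough illustration of what such a factor can look like, the sketch below builds one from the where component of an rpart fit, which stores the row of fit$frame for the leaf each observation falls into. The variable names and the "leaf_" labels are assumptions made for this example; rpart_nodes itself may construct and label its result differently.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, minsplit = 5)

## terminal node number for each observation, as a factor
leaf_node <- factor(rownames(fit$frame)[fit$where])

## optional relabelling: one (hypothetical) label per terminal node
leaf_label <- leaf_node
levels(leaf_label) <- paste0("leaf_", seq_along(levels(leaf_node)))

## observations per terminal node
table(leaf_node)
```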
https://stackoverflow.com/q/36086990/2994949
https://stackoverflow.com/q/36748531/2994949
rpart_parent(116)
rpart_parent(29)
## Not run:
library('rpart')
fit <- rpart(Kyphosis ~ Age + Number + Start, kyphosis, minsplit = 5)
## children nodes should have identical paths
identical(
  head(rpart_parent(28), -1L),
  head(rpart_parent(29), -1L)
)
## terminal nodes should combine to original data
nodes <- as.integer(rownames(fit$frame[fit$frame$var %in% '<leaf>', ]))
sum(sapply(nodes, function(x) nrow(rpart_subset(fit, x))))
nrow(kyphosis)
## all nodes
nodes <- as.integer(rownames(fit$frame))
sapply(nodes, function(x) nrow(rpart_subset(fit, x)))
rpart_subset2 <- function(tree, node = 1L) {
  require('partykit')
  ptree <- as.party(tree)
  ptree$data <- model.frame(eval(tree$call$data, parent.frame(1L)))
  ## retain transformed variables but drop those not in formula
  ## http://stackoverflow.com/a/36816883/2994949
  # ptree$data <- model.frame(tree)
  data_party(ptree, node)[, seq_along(ptree$data)]
}
## note differences in node labels in party vs rpart
dim(rpart_subset(fit, 4))
dim(rpart_subset2(fit, 3))
rpart_nodes(fit)
rpart_nodes(fit, TRUE)
table(rpart_nodes(fit, letters[1:10]),
      rpart_nodes(fit, letters[1:19]))
## subset an rpart object by node id which should only include
## observations found in children of the node id(s) selected
identical(kyphosis, rpart_subset(fit, unique(rpart_nodes(fit))))
kyphosis$node <- rpart_nodes(fit)
rpart_subset(fit, 14:15)
## End(Not run)