build_tree: Exploratory building of partition models

Description

A tool for choosing an appropriate complexity parameter (cp) for a tree, based on its estimated generalization error.

Usage

build_tree(form, data, minbucket = 5, seed = NA, holdout, mincp = 0)

Arguments

form

A formula describing the tree to be built

data

Data frame containing the variables to build the tree

minbucket

The minimum number of cases allowed in any leaf in the tree

seed

If given, sets the random number seed so that the cross-validation error can be reproduced.

holdout

If given, the error on the holdout sample is calculated and given in the cp table.

mincp

The cp value down to which the tree is grown. The default is 0 (recommended), but a larger value can be used for large datasets, where 0.0001 is likely reasonable.

Details

This command combines building a tree to its maximum possible extent with rpart and examining the results with getcp. It plots the estimated relative generalization error (as determined by 10-fold cross-validation) against the number of splits. It also reports the complexity parameter table, giving the cp of the tree with the lowest error and of the simplest tree whose error is within one standard deviation of the lowest error.
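
The selection rule described above can be sketched directly with rpart. This is only an illustration, not the regclass implementation: the built-in iris data, the seed value, and the variable names (cp_table, best, one_sd) are assumptions chosen for the example.

```r
library(rpart)

set.seed(1337)  # fixes the cross-validation folds so the errors are reproducible

# Grow a large tree: cp = 0 disables the complexity stopping rule,
# and xval = 10 requests 10-fold cross-validation.
fit <- rpart(Species ~ ., data = iris,
             control = rpart.control(cp = 0, minbucket = 5, xval = 10))

# Columns: CP, nsplit, rel error, xerror (cross-validated error), xstd
cp_table <- as.data.frame(fit$cptable)

# Row of the tree with the lowest cross-validated relative error
best <- which.min(cp_table$xerror)

# Row of the simplest tree whose error is within one standard
# deviation of the lowest error (rows are ordered by increasing nsplit)
threshold <- cp_table$xerror[best] + cp_table$xstd[best]
one_sd <- min(which(cp_table$xerror <= threshold))

cp_table[unique(c(best, one_sd)), ]
```

The simpler of the two trees is often preferred, since its cross-validated error is statistically hard to distinguish from the lowest one.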

If holdout is given, the cp table also reports the RMSE or misclassification rate on both the training and holdout samples.
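
As a rough sketch of what such training and holdout columns contain, the misclassification rate of each pruned subtree can be computed with rpart alone. The iris data, the random 100/50 split, and the helper name misclass are assumptions made for illustration; build_tree itself may compute these differently.

```r
library(rpart)

set.seed(2029)
rows <- sample(nrow(iris), 100)
train <- iris[rows, ]     # sample used to grow the tree
holdout <- iris[-rows, ]  # sample kept aside for evaluation

fit <- rpart(Species ~ ., data = train,
             control = rpart.control(cp = 0, minbucket = 5))

# Misclassification rate of the subtree obtained by pruning fit at a given cp
# (misclass is a hypothetical helper, not part of rpart or regclass)
misclass <- function(cp, newdata) {
  pruned <- prune(fit, cp = cp)
  mean(predict(pruned, newdata, type = "class") != newdata$Species)
}

cps <- fit$cptable[, "CP"]
errors <- data.frame(CP = cps,
                     nsplit = fit$cptable[, "nsplit"],
                     train = sapply(cps, misclass, newdata = train),
                     holdout = sapply(cps, misclass, newdata = holdout))
errors
```

For a numeric response, the analogous quantity would be the RMSE, sqrt(mean((predicted - actual)^2)), computed on each sample.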

Author(s)

Adam Petrie

References

Introduction to Regression and Modeling

See Also

rpart, getcp

Examples

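  library(regclass)

  data(JUNK)
  build_tree(Junk ~ ., data = JUNK, seed = 1337)

  # Grow the tree to cp = 0.001 instead of the default 0
  data(CENSUS)
  build_tree(ResponseRate ~ ., data = CENSUS, seed = 2017, mincp = 0.001)

  # Build on rows 1 to 200 and report the error on rows 201 to 352 as a holdout sample
  data(OFFENSE)
  build_tree(Win ~ ., data = OFFENSE[1:200, ], seed = 2029, holdout = OFFENSE[201:352, ])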

regclass documentation built on March 26, 2020, 8:02 p.m.