Learning of globally optimal classification and regression trees by using evolutionary algorithms.
a symbolic description of the model to be fit; interaction terms should not be used.
arguments controlling formula processing
optional integer vector of case weights.
a list of control arguments specified via
arguments passed to
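The arguments above can be combined as in the following minimal sketch. It assumes the control list is built with `evtree.control()` and that this function accepts `maxdepth`, `niterations`, and `seed`, as in the evtree package; the specific values are arbitrary and chosen only for illustration.

```r
library("evtree")  # assumes the evtree package is installed

# Keep only complete cases so that formula processing drops nothing silently
airq <- subset(airquality, complete.cases(airquality))

# Illustrative (not recommended) settings: shallow trees, fewer iterations
ctrl <- evtree.control(maxdepth = 3L, niterations = 2000L, seed = 1090L)

# Integer case weights; here every observation simply counts once
w <- rep(1L, nrow(airq))

ev_small <- evtree(Ozone ~ ., data = airq, weights = w, control = ctrl)
ev_small
```

Setting the seed through the control list, rather than relying only on `set.seed()`, is one way to make a fit reproducible across runs.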
Globally optimal classification and regression trees are learned by using an
evolutionary algorithm. Roughly, the algorithm works as follows. First, a set of
trees is initialized with random split rules in the root nodes. Second, mutation
and crossover operators are applied to modify the trees' structure and the tests
that are applied in the internal nodes. After each modification step, a survivor
selection mechanism selects the best candidate models for the next iteration. In
this evolutionary process, the mean quality of the population increases over
time. The algorithm terminates when the quality of the best trees does not
improve further, but no later than a maximum number of iterations specified via
the control arguments. More details on the algorithm are provided in Grubinger
et al. (2014), which is also available as
vignette("evtree", package = "evtree").
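The loop described above can be illustrated with a deliberately simplified toy in base R. This is a sketch under stated assumptions, not the algorithm implemented in evtree: it evolves only one-split regression "stumps" on a single numeric predictor, uses mutation without crossover, and measures quality by mean squared error. The helper names `stump_mse` and `evolve_stump`, and the parameters `npop`, `niter`, and `patience`, are hypothetical.

```r
## Toy evolutionary search over one-split regression stumps (base R only).
## stump_mse() and evolve_stump() are hypothetical helpers, not evtree functions.

# Mean squared error of a stump that splits x at `cut` and predicts
# the mean response on each side of the split.
stump_mse <- function(cut, x, y) {
  left <- x <= cut
  if (all(left) || !any(left)) return(Inf)  # degenerate split: one side empty
  pred <- ifelse(left, mean(y[left]), mean(y[!left]))
  mean((y - pred)^2)
}

evolve_stump <- function(x, y, npop = 20L, niter = 200L, patience = 25L) {
  # 1. Initialize the population with random split points.
  pop <- runif(npop, min(x), max(x))
  fit <- vapply(pop, stump_mse, numeric(1), x = x, y = y)
  best <- min(fit)
  stall <- 0L
  for (i in seq_len(niter)) {
    # 2. Variation: mutate every split point by a small random shift.
    kids <- pop + rnorm(npop, sd = sd(x) / 10)
    kid_fit <- vapply(kids, stump_mse, numeric(1), x = x, y = y)
    # 3. Survivor selection: keep the npop best of parents and offspring.
    all_pop <- c(pop, kids)
    all_fit <- c(fit, kid_fit)
    keep <- order(all_fit)[seq_len(npop)]
    pop <- all_pop[keep]
    fit <- all_fit[keep]
    # 4. Terminate early once the best quality has stopped improving.
    if (fit[1] < best - 1e-12) {
      best <- fit[1]
      stall <- 0L
    } else {
      stall <- stall + 1L
      if (stall >= patience) break
    }
  }
  list(cut = pop[1], mse = fit[1], iterations = i)
}

set.seed(1)
x <- airquality$Temp
y <- airquality$Wind
res <- evolve_stump(x, y)
res
```

Because parents compete with their offspring for survival, the best quality in the population can never get worse from one iteration to the next, which is what makes the "no further improvement" stopping rule meaningful.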
The resulting trees can be summarized and visualized by the
plot.constparty methods provided by the partykit package.
The predict.party method can be used to compute fitted responses,
class probabilities (for classification trees), and terminal node IDs.
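The three kinds of prediction mentioned above might be requested as follows. This sketch assumes that evtree and partykit are installed and that the prediction method accepts `type = "response"`, `"prob"`, and `"node"`, as documented in partykit; the selected rows of `iris` are arbitrary.

```r
library("evtree")  # assumes evtree (which loads partykit) is installed

set.seed(1090)
ev_iris <- evtree(Species ~ ., data = iris)

# Fitted class labels (the default, type = "response")
head(predict(ev_iris))

# Class probabilities per observation (classification trees only)
head(predict(ev_iris, type = "prob"))

# ID of the terminal node each observation falls into
head(predict(ev_iris, type = "node"))

# New data go through the same interface
predict(ev_iris, newdata = iris[c(1, 51, 101), ], type = "response")
```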
An object of class
Grubinger T, Zeileis A, Pfeiffer KP (2014). evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R. Journal of Statistical Software, 61(1), 1-29. http://www.jstatsoft.org/v61/i01/
library("evtree")

## regression
set.seed(1090)
airq <- subset(airquality, !is.na(Ozone) & complete.cases(airquality))
ev_air <- evtree(Ozone ~ ., data = airq)
ev_air
plot(ev_air)
## in-sample mean squared error
mean((airq$Ozone - predict(ev_air))^2)

## classification
## (note that different equivalent "perfect" splits for the setosa species
## in the iris data may be found on different architectures/systems)
ev_iris <- evtree(Species ~ ., data = iris)
## IGNORE_RDIFF_BEGIN
ev_iris
## IGNORE_RDIFF_END
plot(ev_iris)
## confusion matrix and in-sample misclassification rate
table(predict(ev_iris), iris$Species)
1 - mean(predict(ev_iris) == iris$Species)