In rpart.plot: Plot 'rpart' Models: An Enhanced Version of 'plot.rpart'

rpart.plot

R Documentation

Plot an rpart model. A simplified interface to the prp function.

Description

Plot an rpart model, automatically tailoring the plot for the model's response type.

For an overview, please see the package vignette "Plotting rpart trees with the rpart.plot package".

This function is a simplified front-end to prp, with only the most useful arguments of that function, and with different defaults for some of the arguments. The different defaults mean that this function automatically creates a colored plot suitable for the type of model (whereas prp by default creates a minimal plot). See the prp help page for a table showing the different defaults.
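The difference in defaults described above can be seen by drawing the same tree with both functions. This is a minimal sketch, assuming the `ptitanic` data that ships with the package (as used in the Examples section); the object name `fit` is chosen here for illustration only.

```r
library(rpart.plot)   # rpart.plot depends on rpart, so rpart() is attached too

data(ptitanic)
fit <- rpart(survived ~ ., data = ptitanic, cp = .02)  # small demo tree

old.par <- par(mfrow = c(1, 2))                 # two figures side by side
prp(fit, main = "prp defaults")                 # minimal, uncolored plot
rpart.plot(fit, main = "rpart.plot defaults")   # colored plot tailored to the response type
par(old.par)
```

Both calls plot the same fitted tree; only the default argument values differ.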

Usage

rpart.plot(x = stop("no 'x' arg"),
    type = 2, extra = "auto",
    under = FALSE, fallen.leaves = TRUE,
    digits = 2, varlen = 0, faclen = 0, roundint = TRUE,
    cex = NULL, tweak = 1,
    clip.facs = FALSE, clip.right.labs = TRUE,
    snip = FALSE,
    box.palette = "auto", shadow.col = 0,
    ...)

Arguments

To start off, look at the arguments x, type and extra. Just those arguments will suffice for many users. If you don't want a colored plot, use box.palette=0.
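As a starting point, the arguments mentioned above can be tried on a small tree. The sketch below assumes the `ptitanic` data shipped with the package; the name `fit` and the particular `type` and `extra` values are illustrative choices, not recommendations.

```r
library(rpart.plot)   # also attaches rpart

data(ptitanic)
fit <- rpart(survived ~ ., data = ptitanic, cp = .02)  # small demo tree

old.par <- par(mfrow = c(1, 3))
rpart.plot(fit, main = "defaults")                  # type = 2, extra = "auto"
rpart.plot(fit, type = 4, extra = 101,              # label all nodes; show counts
           main = "type = 4, extra = 101")          # and percentage of observations
rpart.plot(fit, box.palette = 0,                    # same tree without colored boxes
           main = "box.palette = 0")
par(old.par)
```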

`x`	An `rpart` object. The only required argument.
`type`	Type of plot. Possible values:
	0 Draw a split label at each split and a node label at each leaf.
	1 Label all nodes, not just leaves. Similar to `text.rpart`'s `all=TRUE`.
	2 Default. Like `1` but draw the split labels below the node labels. Similar to the plots in the CART book.
	3 Draw separate split labels for the left and right directions.
	4 Like `3` but label all nodes, not just leaves. Similar to `text.rpart`'s `fancy=TRUE`. See also `clip.right.labs`.
	5 Show the split variable name in the interior nodes.
`extra`	Display extra information at the nodes. Possible values:
	"auto" (case insensitive) Default. Automatically select a value based on the model type, as follows:
		`extra=106` class model with a binary response
		`extra=104` class model with a response having more than two levels
		`extra=100` other models
	0 No extra information.
	1 Display the number of observations that fall in the node (per class for `class` objects; prefixed by the number of events for `poisson` and `exp` models). Similar to `text.rpart`'s `use.n=TRUE`.
	2 Class models: display the classification rate at the node, expressed as the number of correct classifications and the number of observations in the node. Poisson and exp models: display the number of events.
	3 Class models: misclassification rate at the node, expressed as the number of incorrect classifications and the number of observations in the node.
	4 Class models: probability per class of observations in the node (conditioned on the node, sum across a node is 1).
	5 Class models: like `4` but don't display the fitted class.
	6 Class models: the probability of the second class only. Useful for binary responses.
	7 Class models: like `6` but don't display the fitted class.
	8 Class models: the probability of the fitted class.
	9 Class models: The probability relative to all observations – the sum of these probabilities across all leaves is 1. This is in contrast to the options above, which give the probability relative to observations falling in the node – the sum of the probabilities across the node is 1.
	10 Class models: Like `9` but display the probability of the second class only. Useful for binary responses.
	11 Class models: Like `10` but don't display the fitted class.
	+100 Add `100` to any of the above to also display the percentage of observations in the node. For example `extra=101` displays the number and percentage of observations in the node. Actually, it's a weighted percentage using the `weights` passed to `rpart`.
Note: Unlike `text.rpart`, by default `prp` uses its own routine for generating node labels (not the function attached to the object). See the `node.fun` argument of `prp`.
`under`	Applies only if `extra > 0`. Default `FALSE`, meaning put the extra text in the box. Use `TRUE` to put the text under the box.
`fallen.leaves`	Default `TRUE` to position the leaf nodes at the bottom of the graph. It can be helpful to use `FALSE` if the graph is too crowded and the text size is too small.
`digits`	The number of significant digits in displayed numbers. Default `2`. If `0`, use `getOption("digits")`. If negative, use the standard `format` function (with the absolute value of `digits`). When `digits` is positive, the following details apply: Numbers from `0.001` to `9999` are printed without an exponent (and the number of digits is actually only a suggestion, see `format` for details). Numbers outside that range are printed with an “engineering” exponent (a multiple of 3).
`varlen`	Length of variable names in text at the splits (and, for class responses, the class in the node label). Default `0`, meaning display the full variable names. Possible values:
	0 Use full names (default).
	greater than 0 Call `abbreviate` with the given `varlen`.
	less than 0 Truncate variable names to the shortest length where they are still unique, but never truncate to shorter than `abs(varlen)`.
`faclen`	Length of factor level names in splits. Default `0`, meaning display the full factor names. Possible values are as `varlen` above, except that for back-compatibility with `text.rpart` the special value `1` means represent the factor levels with alphabetic characters (`a` for the first level, `b` for the second, etc.).
`roundint`	If `roundint=TRUE` (default) and all values of a predictor in the training data are integers, then splits for that predictor are rounded to integers. For example, display `nsiblings < 3` instead of `nsiblings < 2.5`. If `roundint=TRUE` and the data used to build the model is no longer available, a warning will be issued. Using `roundint=FALSE` is advised if non-integer values are in fact possible for a predictor, even though all values in the training data for that predictor are integral.
`cex`	Default `NULL`, meaning calculate the text size automatically. Since font sizes are discrete, the `cex` you ask for may not be exactly the `cex` you get.
`tweak`	Adjust the (possibly automatically calculated) `cex`. Using `tweak` is often easier than specifying `cex`. The default `tweak` is `1`, meaning no adjustment. Use, say, `tweak=1.2` to make the text 20% larger. Since font sizes are discrete, a small change to `tweak` may not actually change the type size, or may change it more than you want.
`clip.facs`	Default `FALSE`. If `TRUE`, print splits on factors as `female` instead of `sex = female`; the variable name and equals sign are dropped. Another example: print `survived` or `died` rather than `survived = survived` or `survived = died`.
`clip.right.labs`	Applies only if `type=3` or `4`. Default is `TRUE` meaning “clip” the right-hand split labels, i.e., don't print `variable=`.
`snip`	Default `FALSE`. Set `TRUE` to interactively trim the tree with the mouse. See the package vignette (or just try it).
`box.palette`	Palette for coloring the node boxes based on the fitted value. This is a vector of `colors`, for example `box.palette=c("green", "green2", "green4")`. Small fitted values are displayed with colors at the start of the vector; large values with colors at the end. Quantiles are used to partition the fitted values.
	The special value `box.palette=0` (default for `prp`) uses the background color (typically white).
	The special value `box.palette="auto"` (default for `rpart.plot`, case insensitive) automatically selects a predefined palette based on the type of model.
	Otherwise specify a predefined palette, e.g. `box.palette="Grays"` for the predefined gray palette (a range of grays). The predefined palettes are (see the `show.prp.palettes` function):
		`Grays` `Greys` `Greens` `Blues` `Browns` `Oranges` `Reds` `Purples`
		`Gy` `Gn` `Bu` `Bn` `Or` `Rd` `Pu` (alternative names for the above palettes)
		`BuGn` `GnRd` `BuOr` etc. (two-color diverging palettes: any combination of two of the above palettes)
		`RdYlGn` `GnYlRd` `BlGnYl` `YlGnBl` (three color palettes)
	Prefix the palette name with `"-"` to reverse the order of the colors, e.g. `box.palette="-auto"` or `box.palette="-Grays"`.
`shadow.col`	Color of the shadow under the boxes. Default `0`, no shadow. Try `"gray"` or `"darkgray"`.
`...`	Extra arguments passed to `prp` and the plotting routines. Any of `prp`'s arguments can be used.
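Several of the arguments in the table above can be combined in one call. The following is a rough sketch, assuming the `ptitanic` data shipped with the package; the name `fit` and the specific argument values are illustrative. The `branch` argument is not an argument of `rpart.plot` itself; it is a `prp` argument passed through `...`, as in the Examples section.

```r
library(rpart.plot)   # also attaches rpart

data(ptitanic)
fit <- rpart(survived ~ ., data = ptitanic, cp = .02)  # small demo tree

old.par <- par(mfrow = c(1, 2))

# Label formatting: abbreviate variable names and factor levels, drop the
# "variable =" part of factor splits, show unrounded split thresholds,
# and make the text slightly larger.
rpart.plot(fit,
           varlen = 4, faclen = 3, clip.facs = TRUE,
           roundint = FALSE, tweak = 1.1,
           main = "label formatting")

# Appearance: reversed gray palette, box shadows, and a prp argument
# (branch) passed through `...`.
rpart.plot(fit,
           box.palette = "-Grays", shadow.col = "gray",
           branch = .4,
           main = "palette, shadow, branch")

par(old.par)
```

Because font sizes are discrete, the effect of `tweak = 1.1` may not be visible on every graphics device.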

Value

The returned value is identical to that of prp.
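The returned value can be captured and inspected like any other R object. This is a minimal sketch, assuming the `ptitanic` data shipped with the package; `fit` and `res` are illustrative names. It also assumes, following the `prp` help page and the package vignette, that the result is a list whose `obj` component holds the `rpart` object (possibly trimmed when `snip=TRUE` is used); consult the `prp` help page for the full list of components.

```r
library(rpart.plot)   # also attaches rpart

data(ptitanic)
fit <- rpart(survived ~ ., data = ptitanic, cp = .02)  # small demo tree

res <- rpart.plot(fit)      # draws the plot and captures the returned list
str(res, max.level = 1)     # show the components without assuming their names

# Assumption (see lead-in): `obj` is the rpart object used for the plot.
print(class(res$obj))
```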

Author(s)

Stephen Milborrow, borrowing heavily from the rpart package by Terry M. Therneau and Beth Atkinson, and the R port of that package by Brian Ripley.

Examples

library(rpart.plot)                     # also attaches rpart (rpart(), cu.summary)

old.par <- par(mfrow=c(2,2))            # put 4 figures on one page

data(ptitanic)

#---------------------------------------------------------------------------

binary.model <- rpart(survived ~ ., data = ptitanic, cp = .02)
                                        # cp = .02 for small demo tree

rpart.plot(binary.model,
           main = "titanic survived\n(binary response)")

rpart.plot(binary.model, type = 3, clip.right.labs = FALSE,
           branch = .4,
           box.palette = "Grays",       # override default GnBu palette
           main = "type = 3, clip.right.labs = FALSE, ...\n")

#---------------------------------------------------------------------------

anova.model <- rpart(Mileage ~ ., data = cu.summary)

rpart.plot(anova.model,
           shadow.col = "gray",         # add shadows just for kicks
           main = "miles per gallon\n(continuous response)\n")

#---------------------------------------------------------------------------

multi.class.model <- rpart(Reliability ~ ., data = cu.summary)

rpart.plot(multi.class.model,
           main = "vehicle reliability\n(multi class response)")

par(old.par)

rpart.plot documentation built on May 29, 2024, 12:07 p.m.