itree: Recursive Partitioning and Regression Trees a la rpart, but...
In itree: Tools for classification and regression trees, with an emphasis on interpretability.


Fit an itree model.

itree(formula, data, weights, subset, na.action = na.itree,
	 method, penalty= "none", model = FALSE, 
	 x = FALSE, y = TRUE, parms, control, cost, ...)

`formula`	a formula, with a response but no interaction terms.
`data`	an optional data frame in which to interpret the variables named in the formula.
`weights`	optional case weights.
`subset`	optional expression saying that only a subset of the rows of the data should be used in the fit.
`na.action`	the default action deletes all observations for which `y` is missing, but keeps those in which one or more predictors are missing.
`method`	one of `"anova"`, `"class"`, `"extremes"`, `"purity"`, `"class_extremes"`, `"class_purity"`, `"regression_extremes"`, or `"regression_purity"`. The purity and extremes methods are new to itree. Unlike rpart, itree does not currently support `method="poisson"` or `method="exp"`. If `method` is missing, the routine guesses using the CART methodology, as in rpart: if `y` is a factor then `method="class"` is assumed, otherwise `method="anova"` is assumed. Passing a factor with `method="purity"` is equivalent to passing `method="class_purity"`, and similarly for extremes and for regression. It is wisest to specify the method directly, especially as more criteria may be added to the function in future. As in rpart, `method` can be a list of functions named `init`, `split` and `eval`. See the rpart documentation for how this works.
`penalty`	one of `"none"`, `"newvar"` or `"ema"`. Specifies the penalty for splitting a particular node on a specified predictor, given the predictors already used in the branch leading to this node. The default is `"none"`, which corresponds to CART. `"newvar"` penalizes predictors not used in the branch leading to the current node. `"ema"` implements an exponential moving average style penalty whereby recently used variables are favored.
`model`	if logical: keep a copy of the model frame in the result? If the input value for `model` is a model frame (likely from an earlier call to the `itree` function), then this frame is used rather than constructing new data.
`x`	keep a copy of the `x` matrix in the result.
`y`	keep a copy of the dependent variable in the result. If missing and `model` is supplied this defaults to `FALSE`.
`parms`	optional parameters for the splitting function. Anova splitting has no parameters. For classification splitting, the list can contain any of: the vector of prior probabilities (component `prior`), the loss matrix (component `loss`) or the splitting index (component `split`). The priors must be positive and sum to 1. The loss matrix must have zeros on the diagonal and positive off-diagonal elements. The splitting index can be `gini` or `information`. The default priors are proportional to the data counts, the losses default to 1, and the split defaults to `gini`. For the regression extremes method, `parms=1` or `parms=-1` specifies whether we are looking for high or low means respectively (see Buja & Lee). The default is `1` for high means. For classification extremes, `parms` is a list specifying the class of interest; see the examples for syntax.
`control`	a list of options that control details of the `itree` algorithm, similar to `rpart.control`. See `itree.control`.
`cost`	a vector of non-negative costs, one for each variable in the model. Defaults to one for all variables. These are scalings to be applied when considering splits, so the improvement on splitting on a variable is divided by its cost in deciding which split to choose. Note that costs are not currently supported by the extremes or purity methods.
`...`	arguments to `itree.control` may also be specified in the call to `itree`. They are checked against the list of valid arguments.
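
The defaulting rule for `method`, the constraints on the classification `parms` list, and the cost scaling described above can be sketched in base R. This is only an illustration under stated assumptions: `guess_method`, `check_class_parms` and `pick_split_var` are hypothetical helper names, not functions exported by itree, the improvement values are made up, and itree's own internal checks may differ.

```r
# Hypothetical helpers (not part of itree) illustrating the documented rules.

# If `method` is missing: factor response -> "class", otherwise "anova".
guess_method <- function(y) {
  if (is.factor(y)) "class" else "anova"
}

# Check the documented constraints on the classification `parms` list.
check_class_parms <- function(parms) {
  if (!is.null(parms$prior)) {
    # priors must be positive and sum to 1
    stopifnot(all(parms$prior > 0),
              isTRUE(all.equal(sum(parms$prior), 1)))
  }
  if (!is.null(parms$loss)) {
    # loss matrix: zeros on the diagonal, positive off-diagonal elements
    loss <- parms$loss
    off_diag <- loss[row(loss) != col(loss)]
    stopifnot(all(diag(loss) == 0), all(off_diag > 0))
  }
  if (!is.null(parms$split)) {
    stopifnot(parms$split %in% c("gini", "information"))
  }
  invisible(TRUE)
}

# Cost scaling: each candidate's improvement is divided by its variable's cost.
# The improvement values passed below are made up purely for illustration.
pick_split_var <- function(improvement, cost = rep(1, length(improvement))) {
  stopifnot(all(cost >= 0), length(cost) == length(improvement))
  names(improvement)[which.max(improvement / cost)]
}

guess_method(factor(c("absent", "present")))             # "class"
guess_method(c(1.5, 2.0, 3.7))                           # "anova"
check_class_parms(list(prior = c(0.65, 0.35), split = "information"))
pick_split_var(c(Age = 4, Start = 6))                    # "Start"
pick_split_var(c(Age = 4, Start = 6), cost = c(1, 2))    # "Age"
```

With unit costs the variable with the largest raw improvement wins; doubling the cost of `Start` halves its scaled improvement, so `Age` is chosen instead.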

itree is based on the code of rpart, but with some extensions targeted at growing interpretable/parsimonious trees. Bug reports and the like should be directed to this package's maintainer – not rpart's.

An object of class itree. See itree.object.

Breiman, Friedman, Olshen, and Stone (1984). Classification and Regression Trees. Wadsworth.

Buja, Andreas and Lee, Yung-Seop (2001). Data Mining Criteria for Tree-Based Regression and Classification. Proceedings of KDD 2001, 27-36.

itree.control, itree.object, summary.itree, print.itree

library(itree)

# CART (same as rpart):
fit <- itree(Kyphosis ~ Age + Number + Start, data=kyphosis)
fit2 <- itree(Kyphosis ~ Age + Number + Start, data=kyphosis,
              parms=list(prior=c(.65,.35), split='information'))
fit3 <- itree(Kyphosis ~ Age + Number + Start, data=kyphosis,
              control=itree.control(cp=.05))
par(mfrow=c(1,2), xpd=NA) # otherwise on some devices the text is clipped
plot(fit)
text(fit, use.n=TRUE)
plot(fit2)
text(fit2, use.n=TRUE)

#### new to itree:
#same example, but using one-sided extremes:
fit.ext <- itree(Kyphosis ~ Age + Number + Start, data=kyphosis, method="extremes",
                 parms=list(classOfInterest="absent"))
# we see buckets with every y="absent":
plot(fit.ext); text(fit.ext, use.n=TRUE)


library(mlbench); data(BostonHousing)

#one sided purity:
fit4 <- itree(medv~.,BostonHousing,method="purity",minbucket=25)

#low means tree:
fit5 <- itree(medv~.,BostonHousing,method="extremes",parms=-1,minbucket=25)

#new variable penalty:
fit6 <- itree(medv~.,BostonHousing,penalty="newvar",interp_param1=.2)

#ema penalty
fit7 <- itree(medv~.,BostonHousing,penalty="ema",interp_param1=.1)

#one-sided-purity + new variable penalty:
fit8 <- itree(medv~.,BostonHousing,method="purity",penalty="newvar",interp_param1=.2)

#one-sided extremes for classification must specify a "class of interest"
data(PimaIndiansDiabetes)
levels(PimaIndiansDiabetes$diabetes)  
fit9.a <- itree(diabetes~.,PimaIndiansDiabetes,minbucket=50,
                 method="extremes",parms=list(classOfInterest="neg"))
                 
plot(fit9.a); text(fit9.a)

#can also pass the index of the class of interest in levels().
fit9.b <- itree(diabetes~.,PimaIndiansDiabetes,minbucket=50,
                 method="extremes",parms=list(classOfInterest=1))
# so fit9.a = fit9.b


[1] "neg" "pos"
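
The equivalence between naming the class of interest and passing its index follows from the level order shown in the output. A minimal base R sketch, assuming a stand-in factor with the same two levels (the `diabetes` vector here is illustrative, not the mlbench data, and this does not call itree):

```r
# Stand-in factor with the same level order as the output above.
diabetes <- factor(c("neg", "pos", "neg"), levels = c("neg", "pos"))

class_name  <- "neg"
# Position of the named class within levels(); this is the index form.
class_index <- match(class_name, levels(diabetes))

class_index                                           # 1
identical(levels(diabetes)[class_index], class_name)  # TRUE
```

Because `"neg"` is the first level, `classOfInterest="neg"` and `classOfInterest=1` refer to the same class.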

itree documentation built on May 2, 2019, 7:25 a.m.
