Model-based Recursive Partitioning

Share:

Description

MOB is an algorithm for model-based recursive partitioning yielding a tree with fitted models associated with each terminal node.

Usage

1
2
mob(formula, data, subset, na.action, weights, offset, cluster,
  fit, control = mob_control(), ...)

Arguments

formula

symbolic description of the model (of type y ~ z1 + ... + zl or y ~ x1 + ... + xk | z1 + ... + zl; for details see below).

data, subset, na.action

arguments controlling formula processing via model.frame.

weights

optional numeric vector of weights. By default these are treated as case weights but the default can be changed in mob_control.

offset

optional numeric vector with an a priori known component to be included in the model y ~ x1 + ... + xk (i.e., only when x variables are specified).

cluster

optional vector (typically numeric or factor) with a cluster ID to be passed on to the fit function and employed for clustered covariances in the parameter stability tests.

fit

function. A function for fitting the model within each node. For details see below.

control

A list with control parameters as returned by mob_control.

...

Additional arguments passed to the fit function.

Details

Model-based partitioning fits a model tree using two groups of variables: (1) The model variables which can be just a (set of) response(s) y or additionally include regressors x1, ..., xk. These are used for estimating the model parameters. (2) Partitioning variables z1, ..., zl, which are used for recursively partitioning the data. The two groups of variables are either specified as y ~ z1 + ... + zl (when there are no regressors) or y ~ x1 + ... + xk | z1 + ... + zl (when the model part contains regressors). Both sets of variables may in principle be overlapping.

To fit a tree model the following algorithm is used.

  1. fit a model to the y or y and x variables using the observations in the current node

  2. Assess the stability of the model parameters with respect to each of the partitioning variables z1, ..., zl. If there is some overall instability, choose the variable z associated with the smallest p value for partitioning, otherwise stop.

  3. Search for the locally optimal split in z by minimizing the objective function of the model. Typically, this will be something like deviance or the negative logLik.

  4. Refit the model in both kid subsamples and repeat from step 2.

More details on the conceptual design of the algorithm can be found in Zeileis, Hothorn, Hornik (2008) and some illustrations are provided in vignette("MOB"). For specifying the fit function two approaches are possible:

(1) It can be a function fit(y, x = NULL, start = NULL, weights = NULL, offset = NULL, ...). The arguments y, x, weights, offset will be set to the corresponding elements in the current node of the tree. Additionally, starting values will sometimes be supplied via start. Of course, the fit function can choose to ignore any arguments that are not applicable, e.g., if the are no regressors x in the model or if starting values or not supported. The returned object needs to have a class that has associated coef, logLik, and estfun methods for extracting the estimated parameters, the maximized log-likelihood, and the empirical estimating function (i.e., score or gradient contributions), respectively.

(2) It can be a function fit(y, x = NULL, start = NULL, weights = NULL, offset = NULL, ..., estfun = FALSE, object = FALSE). The arguments have the same meaning as above but the returned object needs to have a different structure. It needs to be a list with elements coefficients (containing the estimated parameters), objfun (containing the minimized objective function), estfun (the empirical estimating functions), and object (the fitted model object). The elements estfun, or object should be NULL if the corresponding argument is set to FALSE.

Internally, a function of type (2) is set up by mob() in case a function of type (1) is supplied. However, to save computation time, a function of type (2) may also be specified directly.

For the fitted MOB tree, several standard methods are provided such as print, predict, residuals, logLik, deviance, weights, coef and summary. Some of these rely on reusing the corresponding methods for the individual model objects in the terminal nodes. Functions such as coef, print, summary also take a node argument that can specify the node IDs to be queried. Some examples are given below.

More details can be found in vignette("mob", package = "partykit"). An overview of the connections to other functions in the package is provided by Hothorn and Zeileis (2015).

Value

An object of class modelparty inheriting from party. The info element of the overall party and the individual nodes contain various informations about the models.

References

Hothorn T, Zeileis A (2015). partykit: A Modular Toolkit for Recursive Partytioning in R. Journal of Machine Learning Research, 16, 3905–3909.

Zeileis A, Hothorn T, Hornik K (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514.

See Also

mob_control, lmtree, glmtree

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
if(require("mlbench")) {

## Pima Indians diabetes data
data("PimaIndiansDiabetes", package = "mlbench")

## a simple basic fitting function (of type 1) for a logistic regression
logit <- function(y, x, start = NULL, weights = NULL, offset = NULL, ...) {
  glm(y ~ 0 + x, family = binomial, start = start, ...)
}

## set up a logistic regression tree
pid_tree <- mob(diabetes ~ glucose | pregnant + pressure + triceps + insulin +
  mass + pedigree + age, data = PimaIndiansDiabetes, fit = logit)
## see lmtree() and glmtree() for interfaces with more efficient fitting functions

## print tree
print(pid_tree)

## print information about (some) nodes
print(pid_tree, node = 3:4)

## visualization
plot(pid_tree)

## coefficients and summary
coef(pid_tree)
coef(pid_tree, node = 1)
summary(pid_tree, node = 1)

## average deviance computed in different ways
mean(residuals(pid_tree)^2)
deviance(pid_tree)/sum(weights(pid_tree))
deviance(pid_tree)/nobs(pid_tree)

## log-likelihood and information criteria
logLik(pid_tree)
AIC(pid_tree)
BIC(pid_tree)

## predicted nodes
predict(pid_tree, newdata = head(PimaIndiansDiabetes, 6), type = "node")
## other types of predictions are possible using lmtree()/glmtree()
}