Linear Model Trees
Description
Model-based recursive partitioning based on least squares regression.
Usage
lmtree(formula, data, subset, na.action, weights, offset, cluster, ...)
Arguments
formula 
symbolic description of the model (of type y ~ x1 + ... + xk | z1 + ... + zl, i.e., response ~ regressors | partitioning variables).
data, subset, na.action 
arguments controlling formula processing via model.frame.
weights 
optional numeric vector of weights. By default these are treated as case weights, but the default can be changed in mob_control.
offset 
optional numeric vector with an a priori known component to be
included in the model 
cluster 
optional vector (typically numeric or factor) with a cluster ID to be employed for clustered covariances in the parameter stability tests. 
... 
optional control parameters passed to mob_control.
Details
Convenience interface for fitting MOBs (model-based recursive partitions) via the mob function. lmtree internally sets up a model fit function for mob, using either lm.fit or lm.wfit (depending on whether weights are used or not). Then mob is called using the residual sum of squares as the objective function.
Compared to calling mob by hand, the implementation tries to avoid unnecessary computations while growing the tree. Also, it provides a more elaborate plotting function.
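To make the description above concrete, the following is a minimal base-R sketch of a fit function in the style that mob accepts. It is not the code used inside lmtree. The name lm_fit_sketch is hypothetical, and the returned list elements (coefficients, objfun, estfun) follow the list-style return value assumed here for mob fit functions. It switches between lm.fit and lm.wfit depending on whether weights are supplied and reports the (weighted) residual sum of squares as the objective function.

```r
## Hypothetical helper, for illustration only (not partykit internals).
lm_fit_sketch <- function(y, x = NULL, start = NULL, weights = NULL,
                          offset = NULL, ...) {
  ## intercept-only model if no regressor matrix is supplied
  if (is.null(x)) {
    x <- matrix(1, nrow = NROW(y), ncol = 1,
                dimnames = list(NULL, "(Intercept)"))
  }
  ## unweighted or weighted least squares, depending on 'weights'
  fm <- if (is.null(weights)) {
    lm.fit(x, y, offset = offset)
  } else {
    lm.wfit(x, y, w = weights, offset = offset)
  }
  w <- if (is.null(weights)) rep(1, NROW(y)) else weights
  res <- fm$residuals
  list(
    coefficients = fm$coefficients,
    objfun = sum(w * res^2),   # (weighted) residual sum of squares
    estfun = w * res * x       # n x k matrix of score contributions
  )
}

## usage on the built-in 'cars' data
x <- cbind("(Intercept)" = 1, speed = cars$speed)
fit <- lm_fit_sketch(cars$dist, x)
fit$coefficients
fit$objfun
```

The coefficients and objective value returned here agree with coef() and deviance() of lm(dist ~ speed, data = cars), which is a quick way to check such a function before handing it to mob.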
Value
An object of class lmtree inheriting from modelparty. The info element of the overall party and the individual nodes contain various information about the models.
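As a hedged illustration of the return value, the sketch below fits a small tree to simulated data and inspects the info elements. The data and the object names (d, tr, root) are made up for this example, and the exact set of names stored in each info list may differ between partykit versions.

```r
## Sketch only: requires the partykit package; the data are simulated.
if (requireNamespace("partykit", quietly = TRUE)) {
  library("partykit")
  set.seed(42)
  d <- data.frame(x = runif(400, -1, 1), z = runif(400, -1, 1))
  ## the slope of x changes sign at z = 0
  d$y <- ifelse(d$z > 0, 1 + 2 * d$x, 1 - 2 * d$x) + rnorm(400, sd = 0.1)
  tr <- lmtree(y ~ x | z, data = d)

  ## class hierarchy of the fitted object
  print(class(tr))

  ## info stored for the overall party
  print(names(tr$info))

  ## info stored in the root node
  root <- node_party(tr)
  print(names(info_node(root)))

  ## coefficients stored in the terminal nodes
  print(nodeapply(tr, ids = nodeids(tr, terminal = TRUE),
                  FUN = function(n) info_node(n)$coefficients))
}
```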
References
Zeileis A, Hothorn T, Hornik K (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514.
See Also
mob, mob_control, glmtree
Examples
if(require("mlbench")) {
## Boston housing data
data("BostonHousing", package = "mlbench")
BostonHousing <- transform(BostonHousing,
chas = factor(chas, levels = 0:1, labels = c("no", "yes")),
rad = factor(rad, ordered = TRUE))
## linear model tree
bh_tree <- lmtree(medv ~ log(lstat) + I(rm^2) | zn +
indus + chas + nox + age + dis + rad + tax + crim + b + ptratio,
data = BostonHousing, minsize = 40)
## printing whole tree or individual nodes
print(bh_tree)
print(bh_tree, node = 7)
## plotting
plot(bh_tree)
plot(bh_tree, tp_args = list(which = "log(lstat)"))
plot(bh_tree, terminal_panel = NULL)
## estimated parameters
coef(bh_tree)
coef(bh_tree, node = 9)
summary(bh_tree, node = 9)
## various ways for computing the mean squared error (on the training data)
mean((BostonHousing$medv - fitted(bh_tree))^2)
mean(residuals(bh_tree)^2)
deviance(bh_tree)/sum(weights(bh_tree))
deviance(bh_tree)/nobs(bh_tree)
## log-likelihood and information criteria
logLik(bh_tree)
AIC(bh_tree)
BIC(bh_tree)
## (Note that this penalizes estimation of error variances, which
## were treated as nuisance parameters in the fitting process.)
## different types of predictions
bh <- BostonHousing[c(1, 10, 50), ]
predict(bh_tree, newdata = bh, type = "node")
predict(bh_tree, newdata = bh, type = "response")
predict(bh_tree, newdata = bh, type = function(object) summary(object)$r.squared)
}
if(require("AER")) {
## Demand for economics journals data
data("Journals", package = "AER")
Journals <- transform(Journals,
age = 2000 - foundingyear,
chars = charpp * pages)
## linear regression tree (OLS)
j_tree <- lmtree(log(subs) ~ log(price/citations) | price + citations +
age + chars + society, data = Journals, minsize = 10, verbose = TRUE)
## printing and plotting
j_tree
plot(j_tree)
## coefficients and summary
coef(j_tree, node = 1:3)
summary(j_tree, node = 1:3)
}
if(require("AER")) {
## Beauty and teaching ratings data
data("TeachingRatings", package = "AER")
## linear regression (WLS)
## null model
tr_null <- lm(eval ~ 1, data = TeachingRatings, weights = students,
subset = credits == "more")
## main effects
tr_lm <- lm(eval ~ beauty + gender + minority + native + tenure + division,
data = TeachingRatings, weights = students, subset = credits == "more")
## tree
tr_tree <- lmtree(eval ~ beauty | minority + age + gender + division + native + tenure,
data = TeachingRatings, weights = students, subset = credits == "more",
caseweights = FALSE)
## visualization
plot(tr_tree)
## beauty slope coefficient
coef(tr_lm)[2]
coef(tr_tree)[, 2]
## R-squared
1 - deviance(tr_lm)/deviance(tr_null)
1 - deviance(tr_tree)/deviance(tr_null)
}
