trim.oblique.tree: Trims Oblique Splits of Fitted Oblique Tree Objects


Description

Determines a sequence of concise subtrees of the supplied tree by recursively “trimming” off the least important attributes used in oblique splits.

Usage

trim.oblique.tree(
	tree, 
	best = NULL, 
	newdata, 
	trim.impurity = c("deviance", "misclass"), 
	trim.depth = c("partial", "complete"), 
	eps = 1e-3)

Arguments

tree

Fitted model object of class oblique.tree. This is assumed to be the result of some function that produces an object with the same named components as that returned by oblique.tree.

best

The requested complexity (i.e. 1 + number of attributes used throughout the tree) of the concise subtree of tree to return. If best is a scalar, a single subtree is returned; if best is a vector, an optional sequence of concise subtrees is returned. If missing, best is determined algorithmically. If there is no tree in the sequence of the requested size, the next largest is returned.

newdata

Data frame upon which the sequence of cost-complexity subtrees is evaluated. If missing, the data used to grow the tree are used.

trim.impurity

Character string denoting the measure of node heterogeneity used to guide tree trimming. The default is "deviance"; the alternative is "misclass" (number of misclassifications or total loss).

trim.depth

Character string denoting how far oblique splits should be trimmed: "partial" trims them towards axis-parallel splits, while "complete" trims them all the way to the constant predictor.

eps

A lower bound for the probabilities, used to compute deviances if events of predicted probability zero occur in newdata.
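The role of eps can be illustrated with a minimal base R sketch. It assumes the deviance is the usual multinomial deviance (minus twice the log-probability assigned to each observed class) and that eps is applied as a floor on those probabilities; the function name floored_deviance and the toy inputs are hypothetical and are not part of the package.

```r
# Multinomial deviance with a floor on predicted probabilities.
# probs: n x K matrix of predicted class probabilities (columns named by class)
# y:     factor of observed classes
# eps:   lower bound applied to probabilities before taking logs (assumed usage)
floored_deviance <- function(probs, y, eps = 1e-3) {
  # probability that each row assigns to its observed class
  col_idx <- match(as.character(y), colnames(probs))
  p_obs <- probs[cbind(seq_len(nrow(probs)), col_idx)]
  -2 * sum(log(pmax(p_obs, eps)))
}

# Toy predictions (illustrative values only): the second case is observed
# as "No" although its predicted probability of "No" is zero
probs <- matrix(c(0.9, 0.1,
                  0.0, 1.0,
                  0.4, 0.6),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("No", "Yes")))
y <- factor(c("No", "No", "Yes"), levels = c("No", "Yes"))

floored_deviance(probs, y)           # finite, because the zero is raised to eps
floored_deviance(probs, y, eps = 0)  # infinite, because log(0) enters the sum
```

Without such a floor, a single event of predicted probability zero in newdata would make the deviance of every subtree containing it infinite, and the subtrees could no longer be compared.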

Details

Determines a sequence of concise subtrees of the supplied tree by recursively "trimming" its splits, based upon the cost-complexity measure.

If best is supplied, the optimal subtree for that value is returned.

The response, as well as the predictors referred to in the right side of the formula in tree, must be present by name in newdata. These data are dropped down each tree in the trim sequence, and deviances or losses are calculated by comparing the supplied response to the prediction.

A plot method exists for trim sequence objects. It displays the value of the deviance, the number of misclassifications or the total loss for each subtree in the trim sequence. An additional axis displays the values of the cost-complexity parameter at each subtree.
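As a rough illustration of how a cost-complexity parameter trades fit against size, the following base R sketch scores a toy trim sequence. It assumes the standard penalised form, deviance plus h times complexity; whether trim.oblique.tree uses exactly this form is not stated here, and the numbers and the helper cost_complexity are made up for illustration.

```r
# Toy trim sequence (illustrative values only, not package output)
comp <- c(1, 3, 5, 7, 9)            # complexity: 1 + number of attributes used
dev  <- c(260, 210, 190, 184, 182)  # deviance of each subtree

# Penalised cost for a given cost-complexity parameter h (assumed standard form)
cost_complexity <- function(dev, comp, h) dev + h * comp

# For each candidate h, report the complexity of the subtree with lowest cost
for (h in c(0, 2, 5, 20)) {
  chosen <- comp[which.min(cost_complexity(dev, comp, h))]
  cat("h =", h, "-> complexity", chosen, "\n")
}
```

Larger values of the parameter penalise each extra attribute more heavily, so the preferred subtree becomes more concise as the parameter grows.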

Value

If best is a scalar, an object of class c("oblique.tree","tree") and of size best is returned. Otherwise an object of class c("trim", "trim.sequence") is returned. The object contains the following components:

comp

The complexity of each tree in the cost-complexity pruning sequence.

dev

Total deviance of each tree in the cost-complexity pruning sequence.

h

The value of the cost-complexity pruning parameter of each tree in the sequence.
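The components above can be used to choose a value for best. The sketch below works on a plain list that mimics the documented components, with toy values rather than real package output. The helper names are hypothetical, and resolve_best encodes only one reading of "the next largest is returned" (the smallest available complexity not below the request), which is an assumption.

```r
# Stand-in for a trim sequence: a list with the documented components
# (toy values for illustration, not package output)
trim_seq <- list(comp = c(9, 7, 5, 3, 1),
                 dev  = c(182, 184, 190, 210, 260),
                 h    = c(0, 1, 3, 10, 25))

# Complexity of the subtree with the smallest deviance; a natural value for best
least_dev_comp <- function(s) s$comp[which.min(s$dev)]

# Assumed rule: if the requested complexity is absent, fall back to the
# smallest available complexity that is not below it
resolve_best <- function(s, best) {
  candidates <- s$comp[s$comp >= best]
  if (length(candidates) == 0) max(s$comp) else min(candidates)
}

least_dev_comp(trim_seq)    # complexity with least deviance in the toy sequence
resolve_best(trim_seq, 4)   # no subtree of complexity 4, so a larger one is used
```

When the sequence is evaluated on held-out data through newdata, the same lookup on dev gives the complexity with the lowest test deviance.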

Author(s)

A. Truong

References

Truong, A. (2009) Fast Growing and Interpretable Oblique Trees via Probabilistic Models.

See Also

oblique.tree.

Examples

#load the package, then grow a tree on the Pima Indian dataset
library(oblique.tree)
data(Pima.tr, package = "MASS")
ob.tree <- oblique.tree(formula			= type~.,
			data			= Pima.tr,
			oblique.splits		= "only")
plot(ob.tree);text(ob.tree);title(main="Full Oblique Tree")

#partially trimming
#examine the tree sequence
trim.seq <- trim.oblique.tree(	tree		= ob.tree)	
print(trim.seq);plot(trim.seq)

#examine test error over the trim sequence
data(Pima.te, package = "MASS")
trim.seq <- trim.oblique.tree(	tree		= ob.tree,
				newdata		= Pima.te)
print(trim.seq);plot(trim.seq)

#deviance is least when best = 7
p.trimmed <- trim.oblique.tree(	tree		= ob.tree,
				best		= 7)
plot(p.trimmed);text(p.trimmed);title(main="Partially Trimmed Tree")

#complete trimming
#examine the tree sequence
trim.seq <- trim.oblique.tree(	tree		= ob.tree,
				trim.depth	= "complete")	
print(trim.seq);plot(trim.seq)

#examine test error over the trim sequence
data(Pima.te, package = "MASS")
trim.seq <- trim.oblique.tree(	tree		= ob.tree,
				trim.depth	= "complete",
				newdata	= Pima.te)
print(trim.seq);plot(trim.seq)

#deviance is least when best = 9
c.trimmed <- trim.oblique.tree(	tree		= ob.tree,
				trim.depth	= "complete",
				best		= 9)
plot(c.trimmed);text(c.trimmed);title(main="Completely Trimmed Tree")

oblique.tree documentation built on April 15, 2017, 4:38 a.m.