Calculate the proportion of variables used in tree splits, and average summary stats of tree heights and leaf sizes

Description

Calculates the proportion of particles that use each input variable to make a tree split, and the proportion of all splits in each particle's tree that are made on each input variable; also provides summary information on tree heights and leaf sizes.

Usage

## S3 method for class 'dynaTree'
varpropuse(object)
## S3 method for class 'dynaTree'
varproptotal(object)
## S3 method for class 'dynaTree'
treestats(object)

Arguments

object

a "dynaTree"-class object built by dynaTree

Details

varpropuse gives the proportion of particles that use each input variable in a tree split; varproptotal gives, for each input variable, its share of all the splits in each particle's tree (i.e., normalized by the total number of splits used in the tree).
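
The difference between the two proportions can be sketched in base R on a made-up table of split counts. This is only an illustration under stated assumptions, not the package's internal implementation: the matrix splits and the names prop_use and prop_total are hypothetical, every particle is assumed to have at least one split, and averaging the per-particle shares over particles is one reading of the description above.

```r
## Toy illustration only: rows are particles, columns are input variables,
## and each entry is a hypothetical count of splits on that variable
## in the particle's tree.
splits <- matrix(c(2, 1, 0,
                   1, 0, 0,
                   3, 1, 1,
                   0, 2, 0),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(NULL, c("x1", "x2", "x3")))

## "use" style: share of particles that split on each variable at all
prop_use <- colMeans(splits > 0)

## "total" style: each variable's share of the splits within a particle,
## averaged over particles (assumes no particle has zero splits)
prop_total <- colMeans(splits / rowSums(splits))

print(prop_use)
print(prop_total)
```

In this toy table x1 and x2 are each used by three of the four particles, so they look equally important under the first measure, while the second measure separates them because x1 accounts for a larger share of the splits.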

Usually, varpropuse returns a vector of (nearly) all ones unless there are variables which are not useful in predicting the response. Using model = "linear" is not recommended for this sort of variable selection.

treestats returns the average tree height and the average leaf size, both active and retired.

Value

For varprop*, a vector of proportions of length ncol(object$X) is returned; for treestats, a 1-row, 4-column data.frame is returned.

Author(s)

Robert B. Gramacy rbgramacy@chicagobooth.edu,
Matt Taddy taddy@chicagobooth.edu, and
Christoforos Anagnostopoulos christoforos.anagnostopoulos06@imperial.ac.uk

References

Gramacy, R.B., Taddy, M.A., and S. Wild (2011). “Variable Selection and Sensitivity Analysis via Dynamic Trees with an Application to Computer Code Performance Tuning” arXiv:1108.4739

http://bobby.gramacy.com/r_packages/dynaTree/

See Also

dynaTree, sens.dynaTree, relevance.dynaTree

Examples

## load the package and fit a dynaTree model to the Ozone data
library(dynaTree)
X <- airquality[,2:4]
y <- airquality$Ozone
na <- apply(is.na(X), 1, any) | is.na(y)
out <- dynaTree(X=X[!na,], y=y[!na])

## obtain variable usage proportions
varpropuse(out)
varproptotal(out)

## gather relevance statistics which are more meaningful
out <- relevance(out)
boxplot(out$relevance)
abline(h=0, col=2, lty=2)

## obtain tree statistics
treestats(out)

## clean up
deletecloud(out)