Calculate the proportion of variables used in tree splits, and average summary stats of tree heights and leaf sizes
Calculates the proportion of particles which use each input to make a tree split and the proportion of all splits in trees of each particle that correspond to each input variable; also provides tree height and leaf size summary information
varpropuse gives the proportion of times a particle
uses each input variable in a tree split;
varproptotal gives the proportion of total uses by the tree in each particle (i.e.,
averaged over the total number of splits used in the tree).
varpropuse returns a vector of (nearly) all ones
unless there are variables which are not useful in predicting
the response. Using
model = "linear" is not recommended
for this sort of variable selection.
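The difference between the two proportions can be illustrated on toy data. The sketch below is not the package's implementation: the matrix counts is a hypothetical table of split counts (one row per particle, one column per input), and prop_use and prop_total are illustrative names computed in base R from the definitions given above. It assumes every particle's tree has at least one split.

```r
## Hypothetical split counts: rows are particles, columns are inputs.
## Entry [i, j] is the number of splits on input j in particle i's tree.
counts <- matrix(c(2, 1, 0,
                   1, 1, 0,
                   3, 0, 0,
                   1, 2, 1),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(paste0("particle", 1:4),
                                 c("x1", "x2", "x3")))

## Analogue of varpropuse: share of particles that split on each
## input at least once.
prop_use <- colMeans(counts > 0)

## Analogue of varproptotal: within each particle, the share of its
## splits that fall on each input, then averaged over particles.
## Dividing a matrix by rowSums() recycles down the columns, so each
## row is divided by its own total.
prop_total <- colMeans(counts / rowSums(counts))

print(prop_use)
print(prop_total)
```

In this toy table x1 is used by every particle, so its usage proportion is one, while x3 appears in a single tree and scores low on both measures. Under this construction the total-use proportions sum to one across inputs, whereas the usage proportions need not.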
treestats returns the average tree height and the average
leaf size, both active and retired.
For varpropuse and varproptotal, a vector of proportions of length
ncol(object$X) is returned;
for treestats, a 1-row, 4-column result is returned.
Gramacy, R.B., Taddy, M.A., and S. Wild (2011). “Variable Selection and Sensitivity Analysis via Dynamic Trees with an Application to Computer Code Performance Tuning” arXiv:1108.4739
library(dynaTree)

## fit a dynaTree model to the Ozone data
X <- airquality[,2:4]
y <- airquality$Ozone
## drop rows with a missing input or response
na <- apply(is.na(X), 1, any) | is.na(y)
out <- dynaTree(X=X[!na,], y=y[!na])

## obtain variable usage proportions
varpropuse(out)
varproptotal(out)

## gather relevance statistics which are more meaningful
out <- relevance(out)
boxplot(out$relevance)
abline(h=0, col=2, lty=2)

## obtain tree statistics
treestats(out)

## clean up
deletecloud(out)