Description Usage Arguments Details Author(s) Examples
Method to plot an object of forestFloor-class. Plot partial feature contributions of the most important variables. Colour gradients can be applied two show possible interactions.
1 2 3 4 5 6 7 8 9 10 |
x |
foretFloor-object, also abbrivated ff..
Computed topology of randomForest-model, the output from the forestFloor function |
plot_seq |
a numeric vector describing which variables and in what sequence to plot, ordered by importance as default, order_by_importance = F then by feature/coloumn order of training data. |
limitY |
TRUE/FLASE, constrain all Yaxis to same limits to ensure relevance of low importance features is not overinterpreted |
order_by_importance |
TRUE / FALSE should plotting and plot_seq be ordered after importance. Most important feature plot first(TRUE) |
cropXaxes |
a vector of indice numbers of which zooming of x.axis should look away from outliers |
crop_limit |
a number often between 1.5 and 5, referring limit in std.devs from the mean defining outliers if limit = 2, above selected plots will zoom to +/- 2 std.dev of the respective features. |
plot_GOF |
Booleen TRUE/FALSE. Should the goodness of fit be plotted as a line? |
GOF_col |
Color of plotted GOF line |
... |
... other arguments passed to generic plot functions |
The method plot.forestFloor visualizes partial plots of the most important variables first. Partial dependence plots are available in the randomForest package. But such plots are single lines(1d-slices) and do not answer the question:
Is this partial function(PF) a fair generalization or subject to global or local interactions.
Soren Havelund Welling
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | ## Not run:
#simulate data
obs=1000
vars = 6
X = data.frame(replicate(vars,rnorm(obs)))
Y = with(X, X1^2 + sin(X2*pi) + 2 * X3 * X4 + 0.5 * rnorm(obs))
#grow a forest, remeber to include inbag
rfo=randomForest::randomForest(X,Y,keep.inbag=TRUE)
#compute topology
ff = forestFloor(rfo,X)
#print forestFloor
print(ff)
#plot partial functions of most important variables first
plot(ff,order_by_importance=TRUE)
#Non interacting functions are well displayed, whereas X3 and X4 are not
#by applying different colourgradient, interactions reveal themself
#also a k-nearest neighbor fit is applied to evaluate goodness of fit
Col=fcol(ff,3,orderByImportance=FALSE)
plot(ff,col=Col,plot_GOF=TRUE)
#if needed, k-nearest neighbor parameters for goodness-of-fit can be access through convolute_ff
#a new fit will be calculated and added to forstFloor object as ff$FCfit
ff = convolute_ff(ff,userArgs.kknn=alist(kernel="epanechnikov",kmax=5))
plot(ff,col=Col,plot_GOF=TRUE)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.