forestFloor: visualize the randomForest topology
forestFloor visualizes cross-validated topology maps of random forests (RF). The package helps users understand a non-linear regression problem or a binary classification problem through RF. Overall, it is intended to give a fast overview of the dynamics within a given system of interest, so that the user can decide on appropriate further modeling, perhaps within a classical statistical framework, or stay within RF modeling and look deeper into the alluring topology of correlations and local interactions.
Soren Havelund Welling
Interpretation of QSAR Models Based on Random Forest Methods, http://dx.doi.org/10.1002/minf.201000173
Interpreting random forest classification models using a feature contribution method, http://arxiv.org/abs/1312.1121
## Not run:
rm(list=ls())
library(forestFloorStable)

#simulate data
obs = 2500
vars = 6
X = data.frame(replicate(vars,rnorm(obs)))
Y = with(X, X1^2 + sin(X2*pi) + 2 * X3 * X4 + 1 * rnorm(obs))

#grow a forest, remember to include inbag
rfo = randomForest(X,Y,keep.inbag = TRUE,sampsize=1500,ntree=500)

#compute topology
ff = forestFloor(rfo,X)

#print forestFloor
print(ff)

#plot partial functions of most important variables first
plot(ff)

#Non-interacting functions are well displayed, whereas X3 and X4 are not
#by applying a different colour gradient, interactions reveal themselves
Col = fcol(ff,3,orderByImportance=FALSE)
plot(ff,col=Col)

#in 3D the interaction between X3 and X4 reveals itself completely
show3d_new(ff,3:4,col=Col,plot.rgl=list(size=5))

#although there is no interaction, X1 and X2 have a joint additive effect
#colour by FC-components FC1 and FC2 summed
Col = fcol(ff,1:2,X.matrix=FALSE,RGB=TRUE,orderByImportance=FALSE)
plot(ff,col=Col)
show3d_new(ff,1:2,col=Col,plot.rgl=list(size=5))

#...or a two-way gradient is formed from FC-components X1 and X2
Col = fcol(ff,1:2,X.matrix=TRUE,alpha=0.8,orderByImportance=FALSE)
plot(ff,col=Col)
show3d_new(ff,1:2,col=Col,plot.rgl=list(size=5))
## End(Not run)
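The example notes that X3 and X4 are not well displayed in one-way plots. The sketch below, which uses base R only and makes no forestFloor or randomForest calls, is one way to see why: it takes the interaction term 2 * X3 * X4 from the simulated Y and compares the slope on X3 alone with the slopes after splitting by the sign of X4. The variable names y34, slope_marginal, slope_pos and slope_neg, and the seed, are illustrative assumptions and not part of the package.

```r
# Base R only; illustrative names, not part of forestFloor.
set.seed(1)
obs <- 2500
X3 <- rnorm(obs)
X4 <- rnorm(obs)
y34 <- 2 * X3 * X4   # the pure interaction term from the simulated Y

# Marginal view: the slope of y34 on X3 alone is close to zero,
# because the effect of X3 changes sign with X4.
slope_marginal <- unname(coef(lm(y34 ~ X3))[2])

# Conditional view: split by the sign of X4 and the slopes separate.
slope_pos <- unname(coef(lm(y34 ~ X3, subset = X4 > 0))[2])
slope_neg <- unname(coef(lm(y34 ~ X3, subset = X4 < 0))[2])

print(c(marginal = slope_marginal,
        X4_positive = slope_pos,
        X4_negative = slope_neg))
```

The marginal slope should be near zero while the two conditional slopes have opposite signs. This is the same pattern that colouring the plots by X3 with fcol is meant to expose in the example above.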