show3d: make 3d-plot of forestFloor topology


Description

Plots two features (horizontal XY-plane) against one combined feature contribution (vertical Z-axis). A response surface layer is estimated (Gaussian-weighted kNN, via the kknn package) and plotted alongside the data points. The 3D graphics device is rgl. show3d dispatches to the method show3d.forestFloor_regression for regression and show3d.forestFloor_multiClass for classification.

Usage

## S3 method for class 'forestFloor_regression'
show3d(
      x,
      Xi  = 1:2,
      FCi = NULL,
      col = "#12345678",
      sortByImportance = TRUE,
      surface = TRUE,
      combineFC = sum,
      zoom = 1.2,
      grid.lines = 30,
      limit = 3,
      kknnGrid.args = alist(),
      plot.rgl.args = alist(),
      surf.rgl.args = alist(),
      user.gof.args = alist(),
      compute_GOF = TRUE,
      ...)

## S3 method for class 'forestFloor_multiClass'
show3d(
      x,
      Xi,
      FCi = NULL,
      label.seq = NULL,
      kknnGrid.args = list(NULL),
      plot.rgl.args = list(),
      compute_GOF = FALSE,
      user.gof.args = list(NULL),
      ...)

Arguments

x

"forestFloor" class object

Xi

integer vector of length 2; indices of the feature columns

FCi

integer vector of length 1 to p, where p is the number of variables; indices of the feature contribution columns

col

colour vector; point colours or a colour palette. Can also be passed as a promise in plot.rgl.args

sortByImportance

boolean; should indices count in 'variable importance' order (TRUE) or in matrix/data.frame column order (FALSE)?

surface

boolean; should a surface also be plotted?

combineFC

function; how the selected feature contributions should be combined (default is sum)

zoom

the grid can be expanded in all directions by the factor zoom

grid.lines

how many grid lines should be used

limit

sizing of the grid does not consider outliers, i.e. points outside a limit of e.g. 3 standard deviations, assessed univariately

kknnGrid.args

argument list; any possible arguments to kknn::kknn.
The following default wrapper arguments can be overwritten:
wrapper = alist( formula=fc~., # do not change
train=Data, # do not change
k=k, # integer < n_observations. k>100 may run slowly.
kernel="gaussian", # distance kernel; another option is e.g. kernel="triangular"
test=gridX # do not change
)
See kknn::kknn to understand the parameters. By default, k is set automatically to half the square root of the number of observations, which often gives a reasonable balance between robustness and adaptiveness. The number of neighbours k and the distance kernel can be changed by passing kknnGrid.args = alist(k=5,kernel="triangular",scale=FALSE); this overwrites the default k and the default kernel. Moreover, because the scale argument is not specified by the wrapper, there is no conflict and the argument is simply appended.

plot.rgl.args

argument list passed to rgl::plot3d; can override any argument of this wrapper. This call defines the plotting space and plots the points. See plot3d for documentation of graphical arguments.

wrapper_arg = alist( x=xaxis, # do not change, x coordinates
y=yaxis, # do not change, y coordinates
z=zaxis, # do not change, z coordinates
col=col, # colouring evaluated within this wrapper function
xlab=names(X)[1], # xlab, label for x axis
ylab=names(X)[2], # ylab, label for y axis
zlab=paste(names(X[,FCi]),collapse=" - "), # zlab, label for z axis
alpha=.4, # point transparency
size=3, # point size
scale=.7, # z axis scaling
avoidFreeType = TRUE, # disable freeType=TRUE plug-in (Postscript labels)
add=FALSE # do not change, should graphics be added to another rgl plot?
)

surf.rgl.args

argument list passed to rgl::persp3d, which draws the surface. These are the default wrapper arguments:

wrapper_arg = alist( x=unique(grid[,2]), # do not change, values of x-axis
y=unique(grid[,3]), # do not change, values of y-axis
z=grid[,1], # do not change, response surface values
add=TRUE, # do not change, surface added to plotted points
alpha=0.4 # transparency of surface, [0;1]
)
See rgl::persp3d for other graphical arguments. Note that the surface is added onto the existing plot of points, so e.g. axis labels cannot be changed from here.

label.seq

a numeric vector describing which classes to plot and in what sequence. NULL plots all classes, ordered as the levels of x$Y in the forestFloor_multiClass object x.

user.gof.args

argument list passed to the internal function ff2, which can modify how goodness-of-fit is computed. The number of neighbours and the kernel can be set manually with e.g. list(kmax=40,kernel="gaussian"). The default parameters should already work in most cases. Function ff2 computes leave-one-out CV predictions of the feature contributions from the chosen context of the visualization.

compute_GOF

Boolean TRUE/FALSE. Should the goodness-of-fit be computed and plotted in the main title of the 3D plot? If FALSE, the other GOF arguments have no effect.

...

not used at the moment
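The override rule stated for kknnGrid.args (entries with a matching name replace the wrapper default, other entries are appended) can be illustrated in base R. This is only a sketch of the rule, not the package's code: merge_args is a hypothetical helper, modifyList is used as a stand-in for whatever merging show3d does internally, and rounding the default k to an integer is an assumption.

```r
# Illustrative sketch only: merge_args() is a hypothetical helper,
# not a function exported by forestFloor.
merge_args <- function(defaults, user) {
  # Same-named entries in `user` replace the defaults; new names are appended.
  modifyList(defaults, user)
}

n_obs <- 2500
# Default k as described for kknnGrid.args: half the square root of the
# number of observations. Rounding to an integer is an assumption here.
k_default <- round(0.5 * sqrt(n_obs))

defaults <- list(k = k_default, kernel = "gaussian")
user     <- list(k = 5, kernel = "triangular", scale = FALSE)

merged <- merge_args(defaults, user)
str(merged)
```

In this sketch k and kernel are replaced by the user's values, while scale, which has no default in the wrapper, is simply added to the end of the list.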

Details

show3d plots one or more combined feature contributions in the context of two features, with points representing each data point. The input object must be a "forestFloor_regression" or "forestFloor_multiClass" S3 class object, and should at least contain $X, the data.frame of training data, and $FCmatrix, the feature contributions matrix. Usually this object is formed with the function forestFloor, with a random forest model fit as input. The actual visualization differs for each class.
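To make the vertical axis concrete: with the default combineFC = sum, the Z value of each point can be thought of as the row-wise sum of the feature contribution columns selected by FCi. The following minimal sketch uses a made-up matrix. The object FC and the use of apply are illustrative assumptions, not the package's internal code.

```r
# Toy feature contribution matrix: 4 observations, 3 features (made-up numbers).
FC <- matrix(c( 0.5, -0.2,  0.1,
               -0.3,  0.4,  0.0,
                0.2,  0.2, -0.1,
                0.0, -0.1,  0.3),
             nrow = 4, byrow = TRUE,
             dimnames = list(NULL, c("X1", "X2", "X3")))

FCi       <- 1:2   # combine the contributions of X1 and X2
combineFC <- sum   # same default as in the usage section

# One combined contribution per observation: the value shown on the Z-axis
# under the assumption stated above.
z <- apply(FC[, FCi, drop = FALSE], 1, combineFC)
print(z)
```

Any function that reduces a numeric vector to a single number could take the place of sum in this sketch, for example mean.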

Value

no value

Author(s)

Soren Havelund Welling

Examples

## Not run: 
rm(list=ls())
library(forestFloor)
library(randomForest) #provides randomForest()

#simulate data
obs = 2500
vars = 6

X = data.frame(replicate(vars,rnorm(obs)))
Y = with(X, X1^2 + sin(X2*pi) + 2 * X3 * X4 + 1 * rnorm(obs))

#grow a forest, remember to include inbag
rfo = randomForest(X,Y,keep.inbag = TRUE,sampsize=1500,ntree=500)

#compute topology
ff = forestFloor(rfo,X)

#print forestFloor
print(ff)

#plot partial functions of most important variables first
plot(ff)

#non-interacting functions are well displayed, whereas X3 and X4 are not
#by applying a different colour gradient, interactions reveal themselves
Col = fcol(ff,3)
plot(ff,col=Col)

#in 3D the interaction between X3 and X4 reveals itself completely
show3d(ff,3:4,col=Col,plot.rgl.args=list(size=5))

#although there is no interaction, X1 and X2 have a joint additive effect
Col = fcol(ff,1:2,X.matrix=FALSE,RGB=TRUE) #colour by FC-components FC1 and FC2 summed
plot(ff,col=Col)
show3d(ff,1:2,col=Col,plot.rgl.args=list(size=5))

#...or a two-way gradient is formed from the feature values of X1 and X2
Col = fcol(ff,1:2,X.matrix=TRUE,alpha=0.8)
plot(ff,col=Col)
show3d(ff,1:2,col=Col,plot.rgl.args=list(size=5))

## End(Not run)

forestFloor documentation built on May 2, 2019, 4:46 p.m.