Convolute a specific set of feature contributions by their corresponding features with the kknn package.

Description

Performs only one convolution, but of any kind. By contrast, convolute_ff performs batches of first-order convolutions.

Usage

convolute_ff2(ff,
              Xvars,
              FCvars = NULL,
              k.fun = function() round(sqrt(n.obs)/2),
              userArgs.kknn = alist(kernel="gaussian"))

Arguments

ff

forestFloor object (class="forestFloor") consisting of at least ff$X and ff$FCmatrix, two matrices of equal size

Xvars

integer vector of column indices of ff$X to convolute by

FCvars

integer vector of column indices of ff$FCmatrix. These feature contributions are combined (summed) and then convoluted.

k.fun

function defining the number of neighbors k to consider. n.obs is a constant equal to the number of observations in ff$X, so k is defined as a function k.fun of n.obs. To set k to a constant, use e.g. k.fun = function() 10. k can also be overridden with userArgs.kknn = alist(kernel="gaussian",kmax=10).

userArgs.kknn

argument list passed to the train.kknn function for each convolution; see kknn.args. Arguments in this list take priority over any passed by default by this wrapper function. See the argument merger append.overwrite.alists.
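
As a rough illustration of the two arguments above, the sketch below evaluates the default k.fun rule for a hypothetical n.obs and shows user-supplied arguments overwriting defaults. It uses base R's modifyList as a stand-in for append.overwrite.alists, whose exact behavior is not reproduced here; defaultArgs and mergedArgs are hypothetical names, and the default list is an assumption about what a wrapper might pass.

```r
# Hypothetical number of observations (rows of ff$X)
n.obs <- 2500

# Default rule from the Usage section: half the square root of n.obs
k.default <- round(sqrt(n.obs) / 2)
print(k.default)

# Assumed defaults a wrapper might pass (illustrative only)
defaultArgs <- list(kernel = "gaussian", kmax = k.default)

# User-supplied arguments, as in the k.fun description
userArgs.kknn <- alist(kernel = "gaussian", kmax = 10)

# Stand-in for the argument merger: user entries overwrite defaults
mergedArgs <- modifyList(defaultArgs, userArgs.kknn)
str(mergedArgs)
```

With 2500 observations the default rule gives the same neighborhood size as the k=25 used in the Examples section.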

Details

convolute_ff2 is a wrapper of train.kknn from the kknn package, used to convolute feature contributions by their corresponding features. The output depends on the parameters passed.
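
To give a feel for what convoluting feature contributions by their features means, here is a minimal base R sketch of a Gaussian-weighted k-nearest-neighbor smoother. It is a simplified illustration only, not the algorithm used by train.kknn or by convolute_ff2; the function knn_gauss_smooth, the simulated data, and the bandwidth choice are all assumptions made for this example.

```r
# Simplified kNN smoother with Gaussian weights (illustrative only)
knn_gauss_smooth <- function(X, y, k) {
  X <- scale(as.matrix(X))               # standardize the features
  D <- as.matrix(dist(X))                # pairwise Euclidean distances
  n <- nrow(X)
  sapply(seq_len(n), function(i) {
    others <- setdiff(seq_len(n), i)     # leave the observation itself out
    nn <- others[order(D[i, others])[seq_len(k)]]  # k nearest neighbors
    d <- D[i, nn]
    h <- max(d) + 1e-12                  # bandwidth: distance to k-th neighbor
    w <- exp(-0.5 * (d / h)^2)           # Gaussian weights, always positive
    sum(w * y[nn]) / sum(w)              # weighted mean of neighbor values
  })
}

# Simulated stand-ins for ff$X[, Xvars] and ff$FCmatrix[, FCvars]
set.seed(1)
n <- 300
X <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
FC <- cbind(X$X1 * X$X2 / 2 + rnorm(n),
            X$X1 * X$X2 / 2 + rnorm(n))

FCsum <- rowSums(FC)                     # combine (sum) the contributions
k <- round(sqrt(n) / 2)                  # same rule as the default k.fun
smoothed <- knn_gauss_smooth(X, FCsum, k)
```

The result holds one smoothed value per observation, which is the same shape as the vector described under Value.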

Value

a numeric vector with the convoluted value for each observation

Author(s)

Soren Havelund Welling

Examples

## Not run: 
library(forestFloor)
library(randomForest)
library(rgl)  #provides plot3d

#simulate data
obs = 2500
vars = 6
X = data.frame(replicate(vars,rnorm(obs)))
Y = with(X, X1^2 + 2*sin(X2*pi) + 8 * X3 * X4)
Yerror = 15 * rnorm(obs)
cor(Y,Y+Yerror)^2  #relatively noisy system
Y = Y+Yerror

#grow a forest, remember to include inbag
rfo = randomForest(X,Y,keep.inbag=TRUE,ntree=1000,sampsize=800)

#obtain the forestFloor object with feature contributions
ff = forestFloor(rfo,X)

#convolute the interacting feature contributions by their features to understand the relationship
fc34_convoluted = convolute_ff2(ff,Xvars=3:4,FCvars=3:4,  #arguments for the wrapper
                  userArgs.kknn = alist(kernel="gaussian",k=25)) #arguments for train.kknn

#plot the joint convolution
plot3d(ff$X[,3],ff$X[,4],fc34_convoluted,
       main="convolution of two feature contributions by their own variables",
       #add some colour gradients to ease visualization
       #box.outliers squeezes all observations into a 2 std.dev box
       #univariately for a vector or matrix and normalizes to [0;1]
       col=rgb(.7*box.outliers(fc34_convoluted),
               .7*box.outliers(ff$X[,3]),
               .7*box.outliers(ff$X[,4]))
       )

## End(Not run)