Description Usage Arguments Details Value Author(s) Examples

View source: R/forestFloor_source.R

This colour module colour observations by selected variables. PCA decomposes a selection more than three variables. Space can be inflated by random forest variable importance, to focus colouring on influential variables. Outliers(>3std.dev) are automatically supressed. Any colouring can be modified.

1 2 3 4 5 |

`ff` |
a obejct of class "forestFloor" or a matrix or a data.frame. No missing values. No factors(for now). |

`cols` |
vector of indices of columns to colour by, will refer to ff$X if X.matrix=T and else ff$FCmatrix. If ff itself is a matrix or data.frame, indices will refer to these coloums |

`orderByImportance` |
logical, should cols refer to X column order or columns sorted by variable importance. Input must be of forestFloor -class to use this. Set to FALSE if no importance sorting is wanted. Otherwise leave as is. |

`X.matrix` |
logical, true will use feature matrix false will use feature contribution matrix. Only relvant if input is forestFloor object. |

`hue` |
value within [0,1], hue=1 will be exactly as hue = 0 colour wheel settings, will skew the colour of all observations without changing the contrast between any two given observations. |

`saturation` |
value within [0,1], mean saturation of colours, 0 is greytone and 1 is maximal colourfull. |

`brightness` |
value within [0,1], mean brightness of colours, 0 is black and 1 is lightly colours. |

`hue.range` |
value within [0,1], ratio of colour wheel, small value is small slice of colour whell those little variation in colours. 1 is any possible colour except for RGB colour system. |

`sat.range` |
value within [0,1], for colouring of 2 or more variables, a range of saturation is needed to obtain more degrees of freedom in the colour system. But as saturation of is preferred to be >.75 the range of saturation cannot here exceed .5. If NULL sat.range will set widest possible without exceeding range. |

`bri.range` |
value within [0,1], for colouring of 3 or more variables, a range of brightness is needed to obtain more degrees of freedom in the colour system. But as brightness of is preferred to be >.75 the range of saturation cannot here exceed .5. If NULL bri.range will set widest possible without exceeding range. |

`alpha` |
value within [0;1] transparency of colours. |

`RGB` |
logical TRUE/FALSE, |

`max.df` |
integer 1, 2, or 3 only. Only for true-colour-system, the maximal allowed degrees of freedom in a colour scale. If more variables selected than max.df, PCA decompose to request degrees of freedom. max.df = 1 will give more simple colour gradients |

`imp.weight` |
Logical?, Should importance from a forestFloor object be used to weight selected variables? obviously not possible if input ff is a matrix or data.frame. If randomForest(importance=TRUE) during training, variable importance will be used. Otherwise the more unreliable gini_importance coefficient. |

`imp.exp` |
exponent to modify influence of imp.weight. 0 is not influence. -1 is counter influence. 1 is linear influence. .5 is square root influence etc.. |

`outlier.lim` |
number from 0 to Inf. Any observation which univariately exceed this limit will be suppressed, as if it actually where on this limit. Normal limit is 3 standard deviations. Extreme outliers can otherwise reserve alone a very large part of a given linear colour gradient. This leeds to visulization where outlier have one colour and any other observation another but same colour. |

`RGB.exp` |
value between ]1;>1]. Defines steepness of the gradient of the RGB colour system Close to one green midle area is missing. For values higher than 2, green area is dominating |

fcol produces colours for any observation. These are used plotting.

a character vector specifying the colour of any observations. Each elements is something like "#F1A24340", where F1 is the hexadecimal of the red colour, then A2 is the green, then 43 is blue and 40 is transparency.

Soren Havelund Welling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | ```
## Not run:
library(forestFloorStable)
obs=4000
vars = 6
X = data.frame(replicate(vars,rnorm(obs)))
Y = with(X, X1^2 + sin(X2*pi) + 2 * X3 * X4 + 0.5 * rnorm(obs))
#grow a forest, remeber to include inbag
rfo=randomForest::randomForest(X,Y,keep.inbag=TRUE,
importance=TRUE,sampsize=700)
#compute topology
ff = forestFloor(rfo,X)
#print forestFloor
print(ff)
#plot partial functions of most important variables first
Colours1=fcol(ff,1)
plot(ff,plot_seq=NULL,col=Colours1)
#try to colour by first four variables, uses PCA du reduce system to 3-way gradient
# (2.5 way more exactly as saturation and brightness by default have very limited ranges
# to avoid gray - or overexposed color tones).
Colours2=fcol(ff,1:4)
plot(ff,plot_seq=NULL,external.col=Colours2)
## End(Not run)
``` |

forestFloorStable documentation built on May 31, 2017, 3:19 a.m.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.