Plot variable importances from random permutations of class labels and the variable importances from the original data set.
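As a minimal illustration of what "random permutations of class labels" means, the base R sketch below shuffles a label vector with `sample()`. The object names (`cl`, `cl_perm`) are made up for the example, and the sketch only conveys the general idea; the package's internals may differ.

```r
set.seed(1)

# Hypothetical class labels: 20 samples of class A, 25 of class B
cl <- factor(c(rep("A", 20), rep("B", 25)))

# Shuffle the labels: class sizes are kept, but the link between
# each sample and its label is broken
cl_perm <- sample(cl)

# Class counts are the same before and after the permutation
table(cl)
table(cl_perm)
```

Importances computed from a forest fitted to such shuffled labels give a reference for what importances look like when no variable is truly related to the class.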
randomImportances |
A list with a structure such as the object
returned by |
.
forest |
A random forest fitted to the original data. This forest
must have been fitted with |
whichImp |
The importance measure to use. One (only one) of
|
nvars |
If NULL, the plot shows the complete range of variables. If an integer, only the nvars most important variables are plotted. |
show.var.names |
If TRUE, show the variable names in the plot. Unless you are plotting few variables, it probably won't be of any use. |
vars.highlight |
A vector indicating the variables to highlight in the plot with a vertical blue segment. You need to pass here a vector of variable names, not variable positions. |
main |
The title for the plot. |
screeRandom |
If TRUE, order all the variable importances (i.e., those from both the original and the permuted class labels data sets) from largest to smallest before plotting. The plot will thus resemble a usual "scree plot". |
lwdBlack |
The width of the line to use for the importances from the original data set. |
lwdRed |
The width of the line to use for the average of the importances for the permuted data sets. |
lwdLightblue |
The width of the line for the importances for the individual permuted data sets. |
cexPoint |
|
overlayTrue |
If TRUE, the variable importances from the original data set will be plotted last, so you can see the black line even if it is buried in the middle of many green lines. |
xlab |
The title for the x-axis (see |
ylab |
The title for the y-axis (see |
... |
Additional arguments to plot. |
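The ordering described for screeRandom and the truncation described for nvars can be sketched in base R. The importance values below are invented for illustration (they are not output from varSelRF), and the sketch assumes that each set of importances is sorted separately, which is one reading of the description above.

```r
# Hypothetical importances for 6 variables (not produced by varSelRF)
imp_orig <- c(V1 = 0.9, V2 = 0.1, V3 = 0.5, V4 = 0.05, V5 = 0.3, V6 = 0.2)
imp_perm <- c(V1 = 0.02, V2 = 0.15, V3 = 0.01, V4 = 0.08, V5 = 0.04, V6 = 0.11)

# Scree-style ordering (assumption: each set is sorted on its own),
# largest importance first
scree_orig <- sort(imp_orig, decreasing = TRUE)
scree_perm <- sort(imp_perm, decreasing = TRUE)

# Keeping only the nvars most important variables of the original data
nvars <- 3
top_orig <- head(scree_orig, nvars)
names(top_orig)
```

After sorting, position k on the x-axis refers to the k-th largest importance within each data set, so the same position can correspond to different variables in the original and the permuted data.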
Only used for its side effects of producing plots. In particular, you will see lines of three colors:
black |
Connects the variable importances from the original simulated data. |
green |
Connects the variable importances from the data sets with permuted class labels; there will be as many lines as |
red |
Connects the average of the importances from the permuted data sets. |
Additionally, if you used a valid set of values for vars.highlight, these will be shown with a vertical blue segment.
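To give a rough idea of how the three kinds of lines relate to each other, the following base R sketch draws a similar picture from simulated numbers. Everything here is an assumption made for illustration: the values come from `rexp()`, not from a random forest, and the names `imp_orig`, `imp_perm`, and `imp_mean` are hypothetical.

```r
set.seed(3)
p <- 10      # number of variables (hypothetical)
nperm <- 5   # number of permuted data sets (hypothetical)

# Simulated importances, each set sorted from largest to smallest
imp_orig <- sort(rexp(p), decreasing = TRUE)
imp_perm <- replicate(nperm, sort(rexp(p, rate = 4), decreasing = TRUE))
imp_mean <- rowMeans(imp_perm)  # average over the permuted data sets

pdf(NULL)  # null device; remove this line to see the plot on screen
matplot(imp_perm, type = "l", lty = 1, col = "green",
        ylim = range(imp_orig, imp_perm),
        xlab = "Variable (ordered)", ylab = "Importance")
lines(imp_mean, col = "red", lwd = 2)      # average of permuted importances
lines(imp_orig, col = "black", lwd = 1.5)  # importances from original data
invisible(dev.off())
```

Each column of `imp_perm` becomes one green line, so the number of green lines equals the number of permuted data sets in this sketch.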
These plots resemble the scree plots commonly used with principal component analysis, and the actual choice of colors was taken from the importance spectrum plots of Friedman & Meulman.
Ramon Diaz-Uriarte rdiaz02@gmail.com
Breiman, L. (2001) Random forests. Machine Learning, 45, 5–32.
Diaz-Uriarte, R., Alvarez de Andres, S. (2005) Variable selection from random forests: application to gene expression data. Tech. report. http://ligarto.org/rdiaz/Papers/rfVS/randomForestVarSel.html
Friedman, J., Meulman, J. (2005) Clustering objects on subsets of attributes (with discussion). J. Royal Statistical Society, Series B, 66, 815–850.
randomForest, varSelRF, varSelRFBoot, varSelImpSpecRF, randomVarImpsRF
library(varSelRF)

## Simulated data: 45 samples, 30 variables; the first two variables
## are shifted for the first 20 samples (class "A")
x <- matrix(rnorm(45 * 30), ncol = 30)
x[1:20, 1:2] <- x[1:20, 1:2] + 2
colnames(x) <- paste0("V", seq.int(ncol(x)))
cl <- factor(c(rep("A", 20), rep("B", 25)))

## Forest fitted to the original data
rf <- randomForest(x, cl, ntree = 200, importance = TRUE)

## Importances from 20 data sets with permuted class labels
rf.rvi <- randomVarImpsRF(x, cl,
                          rf,
                          numrandom = 20,
                          usingCluster = FALSE)

randomVarImpsRFplot(rf.rvi, rf)

## Same plot with variable names, drawn perpendicular to the axis
op <- par(las = 2)
randomVarImpsRFplot(rf.rvi, rf, show.var.names = TRUE)
par(op)
## Not run:
## identical, but using a cluster
## make a small cluster, for the sake of illustration
psockCL <- makeCluster(2, "PSOCK")
clusterSetRNGStream(psockCL, iseed = 789)
clusterEvalQ(psockCL, library(varSelRF))
rf.rvi <- randomVarImpsRF(x, cl,
                          rf,
                          numrandom = 20,
                          usingCluster = TRUE,
                          TheCluster = psockCL)
randomVarImpsRFplot(rf.rvi, rf)
stopCluster(psockCL)
## End(Not run)