predict.randomForest | R Documentation |

Prediction of test data using random forest.

    ## S3 method for class 'randomForest'
    predict(object, newdata, type="response", norm.votes=TRUE,
            predict.all=FALSE, proximity=FALSE, nodes=FALSE, cutoff, ...)

`object` |
an object of class `randomForest`, as created by the function `randomForest`. |

`newdata` |
a data frame or matrix containing new data. (Note: If
not given, the out-of-bag prediction in `object` is returned.) |

`type` |
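one of `response`, `prob`, or `vote`, indicating the type of output: predicted values, matrix of class probabilities, or matrix of vote counts. |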

`norm.votes` |
Should the vote counts be normalized (i.e.,
expressed as fractions)? Ignored if `object$type` is `regression`. |

`predict.all` |
Should the predictions of all trees be kept? |

`proximity` |
Should proximity measures be computed? An error is
issued if `object$type` is `regression`. |

`nodes` |
Should the terminal node indicators (an n by ntree matrix) be returned? If so, they are stored in the “nodes” attribute of the returned object. |

`cutoff` |
(Classification only) A vector of length equal to
number of classes. The ‘winning’ class for an observation is the
one with the maximum ratio of proportion of votes to cutoff.
Default is taken from the `forest$cutoff` component of `object` (i.e., the setting used when running `randomForest`). |

`...` |
not used currently. |
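
As a hedged illustration of the `cutoff` argument, the sketch below fits a forest on `iris` and compares predictions under the default cutoff with predictions under an arbitrary, purely illustrative cutoff vector. The values `c(0.2, 0.2, 0.5)` are an assumption chosen for demonstration; they are taken to follow the order of `levels(iris$Species)`.

```r
library(randomForest)

set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])
test <- iris[ind == 2, ]

# Predictions with the cutoff stored in the fitted forest.
pred_default <- predict(iris.rf, test)

# Illustrative cutoff (one entry per class, in the order of the class levels).
# A larger cutoff for a class shrinks its votes/cutoff ratio, so that class
# needs a larger share of the votes to win.
shifted_cutoff <- c(0.2, 0.2, 0.5)
pred_shifted <- predict(iris.rf, test, cutoff = shifted_cutoff)

# Cross-tabulate to see which observations, if any, change class.
table(default = pred_default, shifted = pred_shifted)
```

Whether any observation actually changes class depends on how close its vote proportions are; on well-separated data the two columns may agree entirely.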

If `object$type` is `regression`, a vector of predicted
values is returned. If `predict.all=TRUE`, then the returned
object is a list of two components: `aggregate`, which is the
vector of predicted values by the forest, and `individual`, which
is a matrix where each column contains the prediction by a tree in the
forest.
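
A minimal sketch of the regression case, assuming the built-in `mtcars` data and a small forest (`ntree = 50` is an arbitrary choice to keep the output compact). It inspects the two components returned when `predict.all=TRUE`. For a forest fitted with default settings, the aggregate prediction is expected to match the row means of the per-tree predictions, which the last line checks rather than assumes.

```r
library(randomForest)

set.seed(1)
mtcars.rf <- randomForest(mpg ~ ., data = mtcars, ntree = 50)
new_cars <- mtcars[1:5, ]

# Plain call: a numeric vector with one prediction per row of newdata.
pred <- predict(mtcars.rf, new_cars)

# predict.all = TRUE: a list with components `aggregate` and `individual`.
pred_all <- predict(mtcars.rf, new_cars, predict.all = TRUE)
names(pred_all)
dim(pred_all$individual)   # rows = observations, columns = trees

# Compare the forest-level prediction with the average over trees.
all.equal(unname(pred_all$aggregate), unname(rowMeans(pred_all$individual)))
```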

If `object$type` is `classification`, the object returned
depends on the argument `type`:

`response` |
predicted classes (the classes with majority vote). |

`prob` |
matrix of class probabilities (one column for each class and one row for each input). |

`vote` |
matrix of vote counts (one column for each class
and one row for each new input); either in raw counts or in fractions
(if `norm.votes=TRUE`). |

If `predict.all=TRUE`, then the `individual` component of the
returned object is a character matrix where each column contains the
predicted class by a tree in the forest.
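
The return types described above can be compared side by side. The following is a sketch on `iris`, not a definitive recipe; the train/test split mirrors the one used in the examples at the end of this page, and variable names such as `votes_raw` are introduced here only for illustration.

```r
library(randomForest)

set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])
test <- iris[ind == 2, ]

# type = "response": a factor of predicted classes.
cls <- predict(iris.rf, test, type = "response")

# type = "prob": one row per input, one column per class.
prob <- predict(iris.rf, test, type = "prob")

# type = "vote": fractions by default, raw counts with norm.votes = FALSE.
votes_frac <- predict(iris.rf, test, type = "vote")
votes_raw  <- predict(iris.rf, test, type = "vote", norm.votes = FALSE)

# Each row of raw counts should add up to the number of trees.
range(rowSums(votes_raw))
iris.rf$ntree

# predict.all = TRUE: per-tree class labels in the `individual` component.
all_trees <- predict(iris.rf, test, predict.all = TRUE)
dim(all_trees$individual)
```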

If `proximity=TRUE`, the returned object is a list with two
components: `pred` is the prediction (as described above) and
`proximity` is the proximity matrix. An error is issued if
`object$type` is `regression`.

If `nodes=TRUE`, the returned object has a “nodes” attribute,
which is an n by ntree matrix, each column containing the node number
that the cases fall in for that tree.

NOTE: If the `object` inherits from `randomForest.formula`,
then any data with `NA` are silently omitted from the prediction.
The returned value will contain `NA` correspondingly in the
aggregated and individual tree predictions (if requested), but not in
the proximity or node matrices.
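
The `NA` behaviour in the note above can be observed with a small sketch. It assumes a forest fitted through the formula interface on `iris` and deliberately injects a single missing value into the first test row; the object names are illustrative only.

```r
library(randomForest)

set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])  # formula interface

test <- iris[ind == 2, ]
test$Sepal.Length[1] <- NA   # make the first test row incomplete

pred <- predict(iris.rf, test)

# The result keeps one entry per input row; the incomplete row yields NA.
length(pred) == nrow(test)
is.na(pred[1])

# The node matrix, by contrast, is described as covering only complete rows.
node_mat <- attr(predict(iris.rf, test, nodes = TRUE), "nodes")
dim(node_mat)
```

Because no warning is raised, checking `anyNA(pred)` after prediction is a cheap way to detect dropped rows.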

NOTE2: Any ties are broken at random, so if this is undesirable, avoid it by
using an odd number for `ntree` in `randomForest()`.

Andy Liaw andy_liaw@merck.com and Matthew Wiener matthew_wiener@merck.com, based on original Fortran code by Leo Breiman and Adele Cutler.

Breiman, L. (2001), *Random Forests*, Machine Learning 45(1),
5-32.

`randomForest`

    library(randomForest)
    data(iris)
    set.seed(111)
    ind <- sample(2, nrow(iris), replace = TRUE, prob=c(0.8, 0.2))
    iris.rf <- randomForest(Species ~ ., data=iris[ind == 1,])
    iris.pred <- predict(iris.rf, iris[ind == 2,])
    table(observed = iris[ind==2, "Species"], predicted = iris.pred)
    ## Get prediction for all trees.
    predict(iris.rf, iris[ind == 2,], predict.all=TRUE)
    ## Proximities.
    predict(iris.rf, iris[ind == 2,], proximity=TRUE)
    ## Nodes matrix.
    str(attr(predict(iris.rf, iris[ind == 2,], nodes=TRUE), "nodes"))
