predict.rfArb | R Documentation |
Prediction and test using Rborist.
## S3 method for class 'rfArb'
predict(object, newdata, sampler, yTest=NULL,
keyedFrame = FALSE, quantVec=numeric(0), quantiles = length(quantVec) > 0,
ctgCensus = "votes", indexing = FALSE, trapUnobserved = FALSE,
bagging = FALSE, nThread = 0, verbose = FALSE, ...)
object |
an object of class |
newdata |
a design frame or matrix containing new data, with the same signature of predictors as in the training command. |
sampler |
an object of class |
yTest |
a response vector against which to test the new predictions. |
keyedFrame |
whether the columns of |
quantVec |
a vector of quantiles to predict. |
quantiles |
whether to predict quantiles. |
ctgCensus |
whether/how to summarize per-category predictions. "votes" specifies the number of trees predicting a given class. "prob" specifies a normalized, probabilistic summary. "probSample" specifies sample-weighted probabilities, similar to quantile histogramming. |
indexing |
whether to record the final node index, typically terminal, of tree traversal. |
trapUnobserved |
reports score for nonterminal upon encountering values not observed during training, such as missing data. |
bagging |
whether prediction is restricted to out-of-bag samples. |
nThread |
suggests ans OpenMP-style thread count. Zero denotes default processor setting. |
verbose |
whether to output progress of prediction. |
... |
not currently used. |
an object of one of two classes:
SummaryReg
summarizing regression, consisting of:
prediction
an object of class PredictReg
consisting of:
yPred
the estimated numerical response.
qPred
quantiles of prediction, if requested.
qEst
quantile of the estimate, if quantiles requested.
indices
final index of prediction, if requested.
validation
if validation requested, an object of class ValidReg
consisting of:
mse
the mean-squared error of the estimate.
rsq
the r-squared statistic of the estimate.
mae
the mean absolute error of the estimate.
importance
if permution importance requested, an object of class importanceReg
, containing multiple instances of:
names
the predictor names.
mse
the per-predictor mean-squared error, under permutation.
SummaryCtg
summarizing classification, consisting of:
PredictCtg
consisting of:
yPred
estimated categorical response.
census
factor-valued matrix of the estimate, by category, if requested.
prob
matrix of estimate probabilities, by category, if requested.
indices
final index of prediction, if requested.
validation
if validation requested, an object of class ValidCtg
consisting of:
confusion
the confusion matrix.
misprediction
the misprediction rate.
oobError
the out-of-bag error.
importance
if permution importance requested, an object of class importanceCtg
, consisting of:
mispred
the misprediction rate, by predictor.
oobErr
the out-of-bag error, by predictor.
Mark Seligman at Suiji.
rfTrain
## Not run:
# Regression example:
nRow <- 5000
x <- data.frame(replicate(6, rnorm(nRow)))
y <- with(x, X1^2 + sin(X2) + X3 * X4) # courtesy of S. Welling.
pf <- preformat(x)
sp <- presample(y)
rb <- rfArb(pf, sp, y)
# Performs separate prediction on new data:
xx <- data.frame(replace(6, rnorm(nRow)))
pred <- predict(rb, xx)
yPred <- pred$yPred
rb <- Rborist(x,y)
# Performs separate prediction on new data:
xx <- data.frame(replacate(6, rnorm(nRow)))
pred <- predict(rb, xx)
yPred <- pred$yPred
# As above, but also records final indices of each tree walk:
#
pred <- predict(rb, xx, indexing=TRUE)
print(pred$indices[c(1:2), ])
# As above, but predicts over \code{newdata} with unobserved values.
# In the case of numerical data, only missing values are considered
# unobserved. Missing values are encoded as \code{NaN}, which are
# incomparable, precipitating \code{false} on every test. Prediction
# therefore takes the \code{false} branch when encountering missing
# values:
#
xxMissing <- xx
xxMissing[6, c(15, 32, 87, 101)] <- NA
pred <- predict(rb, xxMissing)
# As above, but returns a nonterminal score upon encountering
# unobserved values. Neither the true nor the false branch from the
# testing node is taken. Instead, the score returned is derived
# from all leaf nodes (terminals) reached by the testing
# (nonterminal) node.
#
pred <- predict(rb, xxMissing, trapUnobserved = TRUE)
# Performs separate prediction, using original response as test
# vector:
pred <- predict(rb, xx, y)
mse <- pred$mse
rsq <- pred$rsq
# Performs separate prediction with (default) quantiles:
pred <- predict(rb, xx, quantiles="TRUE")
qPred <- pred$qPred
# Performs separate prediction with deciles:
pred <- predict(rb, xx, quantVec = seq(0.1, 1.0, by = 0.10))
qPred <- pred$qPred
# Classification examples:
data(iris)
rb <- Rborist(iris[-5], iris[5])
# Generic prediction using training set.
# Census as (default) votes:
pred <- predict(rb, iris[-5])
yPred <- pred$yPred
census <- pred$census
# Using the \code{keyedFrame} option allows the columns of
# \code{newdata} to appear in arbitrary order, so long as the
# columns present during training appear as a subset:
#
pred <- predict(rb, iris[c(2, 4, 3, 1)], keyedFrame=TRUE)
# As above, but validation census to report class probabilities:
pred <- predict(rb, iris[-5], ctgCensus="prob")
prob <- pred$prob
# As above, but with training reponse as test vector:
pred <- predict(rb, iris[-5], iris[5], ctgCensus = "prob")
prob <- pred$prob
conf <- pred$confusion
misPred <- pred$misPred
# As above, but predicts nonterminal when encountering categories
# not observed during training. That is, prediction returns a score
# derived from all terminal nodes (leaves) reached from the
# (nonterminal) testing node.
#
# In this case, "unobserved" refers to categories not present in
# the subpartition over which a splitting is performed. As training
# partitions the data into smaller and smaller regions, a given
# category becomes less likely to appear in a region.
#
# More generally, unobserved data can include missing predictors as
# well as categories appearing in \code{newdata} which were not
# present during training.
#
pred <- predict(rb, trapUnobserved=TRUE)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.