rfArb | R Documentation |
Accelerated implementation of the Random Forest (trademarked name) algorithm. Tuned for multicore and GPU hardware. Bindable with most numerical front-end languages in addition to R. Invocation is similar to that provided by the "randomForest" package.
## Default S3 method:
rfArb(x,
y,
autoCompress = 0.25,
ctgCensus = "votes",
classWeight = NULL,
impPermute = 0,
indexing = FALSE,
maxLeaf = 0,
minInfo = 0.01,
minNode = if (is.factor(y)) 2 else 3,
nLevel = 0,
nSamp = 0,
nThread = 0,
nTree = 500,
noValidate = FALSE,
predFixed = 0,
predProb = 0.0,
predWeight = NULL,
quantVec = NULL,
quantiles = !is.null(quantVec),
regMono = NULL,
rowWeight = NULL,
splitQuant = NULL,
thinLeaves = is.factor(y) && !indexing,
trapUnobserved = FALSE,
treeBlock = 1,
verbose = FALSE,
withRepl = TRUE,
...)
x |
the design matrix expressed as a |
y |
the response (outcome) vector, either numerical or categorical. Row count must conform with |
autoCompress |
plurality above which to compress predictor values. |
ctgCensus |
report categorical validation by vote or by probability. |
classWeight |
proportional weighting of classification categories. |
impPermute |
number of importance permutations: 0 or 1. |
indexing |
whether to report final index, typically terminal, of tree traversal. |
maxLeaf |
maximum number of leaves in a tree. Zero denotes no limit. |
minInfo |
information ratio with parent below which node does not split. |
minNode |
minimum number of distinct row references to split a node. |
nLevel |
maximum number of tree levels to train. Zero denotes no limit. |
nSamp |
number of rows to sample, per tree. |
nThread |
suggests an OpenMP-style thread count. Zero denotes the default processor setting. |
nTree |
the number of trees to train. |
noValidate |
whether to train without validation. |
predFixed |
number of trial predictors for a split ( |
predProb |
probability of selecting individual predictor as trial splitter. |
predWeight |
relative weighting of individual predictors as trial splitters. |
quantVec |
quantile levels to validate. |
quantiles |
whether to report quantiles at validation. |
regMono |
signed probability constraint for monotonic regression. |
rowWeight |
row weighting for initial sampling of tree. |
splitQuant |
(sub)quantile at which to place cut point for numerical splits. |
thinLeaves |
bypasses creation of leaf state in order to reduce memory footprint. |
trapUnobserved |
reports score for nonterminal upon encountering values not observed during training, such as missing data. |
treeBlock |
maximum number of trees to train during a single level (e.g., coprocessor computing). |
verbose |
indicates whether to output progress of training. |
withRepl |
whether row sampling is by replacement. |
... |
not currently used. |
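The arguments `predFixed` and `predProb` describe two different ways of choosing trial predictors at a split, with `predWeight` biasing the choice. The base-R sketch below only illustrates that distinction. It is not the package's internal code, the helper names `trial_fixed` and `trial_bernoulli` are hypothetical, and how the package combines `predWeight` with `predProb` is not shown.

```r
# Illustrative only: two ways of choosing trial predictors for one split.
# The helper names are hypothetical and not part of the package.

# Fixed-count selection: draw exactly `predFixed` distinct predictors
# (or all of them, when fewer are available), optionally weighted.
trial_fixed <- function(nPred, predFixed, predWeight = rep(1, nPred)) {
  k <- min(predFixed, nPred)
  sort(sample.int(nPred, size = k, replace = FALSE, prob = predWeight))
}

# Bernoulli selection: each predictor is included independently with
# probability `predProb`, so the number of candidates varies by split.
trial_bernoulli <- function(nPred, predProb) {
  which(runif(nPred) < predProb)
}

set.seed(1)
fixed <- trial_fixed(nPred = 6, predFixed = 2)
bern  <- replicate(1000, length(trial_bernoulli(nPred = 6, predProb = 0.3)))

length(fixed)  # always 2
mean(bern)     # near 6 * 0.3 = 1.8 on average
```

With a fixed count, every split considers the same number of candidates. With Bernoulli selection the count varies by split and can be zero.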
an object of class rfArb, a list containing the following items:
sampler |
An object of class |
leaf |
An object of class |
forest |
An object of class |
predMap |
A vector of integers mapping internal to front-end predictor indices. |
signature |
An object of class |
training |
A list summarizing the training task, consisting of the following fields: |
prediction |
An object of class |
validation |
An object of class |
importance |
An object of class |
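Which of the components listed above are present, and their classes, can be checked directly on a trained object. The sketch below assumes the Rborist package is installed and exports `rfArb`; it skips training otherwise. The variable `expectedFields` is a hypothetical name that simply holds the component names from the list above.

```r
# Component names as given in the list above.
expectedFields <- c("sampler", "leaf", "forest", "predMap", "signature",
                    "training", "prediction", "validation", "importance")

# Train only when the package is installed and exports rfArb.
if (requireNamespace("Rborist", quietly = TRUE) &&
    "rfArb" %in% getNamespaceExports("Rborist")) {
  set.seed(1)
  x <- data.frame(replicate(6, rnorm(200)))
  y <- with(x, X1^2 + sin(X2) + X3 * X4)
  rb <- Rborist::rfArb(x, y, nTree = 50)

  # Which documented components are present, and of what class?
  present <- intersect(expectedFields, names(rb))
  print(vapply(unclass(rb)[present],
               function(obj) class(obj)[1], character(1)))

  # Components that are documented but absent for this call, if any.
  print(setdiff(expectedFields, names(rb)))
}
```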
Mark Seligman at Suiji.
Rborist
## Not run:
# Regression example:
nRow <- 5000
x <- data.frame(replicate(6, rnorm(nRow)))
y <- with(x, X1^2 + sin(X2) + X3 * X4) # courtesy of S. Welling.
# Classification example:
data(iris)
# Generic invocation:
rb <- rfArb(x, y)
# Causes 300 trees to be trained:
rb <- rfArb(x, y, nTree = 300)
# Causes rows to be sampled without replacement:
rb <- rfArb(x, y, withRepl=FALSE)
# Causes validation census to report class probabilities:
rb <- rfArb(iris[-5], iris[[5]], ctgCensus="prob")
# Applies table-weighting to classification categories:
rb <- rfArb(iris[-5], iris[[5]], classWeight = "balance")
# Weights first category twice as heavily as remaining two:
rb <- rfArb(iris[-5], iris[[5]], classWeight = c(2.0, 1.0, 1.0))
# Does not split nodes when doing so yields less than a 2% gain in
# information over the parent node:
rb <- rfArb(x, y, minInfo=0.02)
# Does not split nodes representing fewer than 10 unique samples:
rb <- rfArb(x, y, minNode=10)
# Trains a maximum of 20 levels:
rb <- rfArb(x, y, nLevel = 20)
# Trains, but does not perform subsequent validation:
rb <- rfArb(x, y, noValidate=TRUE)
# Chooses 500 rows (with replacement) to root each tree.
rb <- rfArb(x, y, nSamp=500)
# Chooses 2 predictors as splitting candidates at each node (or
# fewer, when choices exhausted):
rb <- rfArb(x, y, predFixed = 2)
# Causes each predictor to be selected as a splitting candidate with
# distribution Bernoulli(0.3):
rb <- rfArb(x, y, predProb = 0.3)
# Causes first three predictors to be selected as splitting candidates
# twice as often as the other three:
rb <- rfArb(x, y, predWeight=c(2.0, 2.0, 2.0, 1.0, 1.0, 1.0))
# Causes (default) quantiles to be computed at validation:
rb <- rfArb(x, y, quantiles=TRUE)
qPred <- rb$validation$qPred
# Causes specified quantiles (deciles) to be computed at validation:
rb <- rfArb(x, y, quantVec = seq(0.1, 1.0, by = 0.10))
qPred <- rb$validation$qPred
# Constrains modelled response to be increasing with respect to X1
# and decreasing with respect to X5.
rb <- rfArb(x, y, regMono=c(1.0, 0, 0, 0, -1.0, 0))
# Causes rows to be sampled with random weighting:
rb <- rfArb(x, y, rowWeight=runif(nRow))
# Suppresses creation of detailed leaf information needed for
# quantile prediction and external tools.
rb <- rfArb(x, y, thinLeaves = TRUE)
# Directs prediction to take a random branch on encountering
# values not observed during training, such as NA or an
# unrecognized category.
predict(rb, x, trapUnobserved = FALSE)
# Directs prediction to silently trap unobserved values, reporting a
# score associated with the current nonterminal tree node.
predict(rb, x, trapUnobserved = TRUE)
# Sets splitting position for the first predictor to far left and the
# second predictor to far right, others to default (median) position.
spq <- rep(0.5, ncol(x))
spq[1] <- 0.0
spq[2] <- 1.0
rb <- rfArb(x, y, splitQuant = spq)
## End(Not run)
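The last example sets `splitQuant` per predictor. One way to picture the setting, assuming the cut point is interpolated between the two observed values that bound a numerical split, is the base-R sketch below. The helper `cut_point` is hypothetical and is not the package's internal code.

```r
# Illustrative only: interpolate a cut point between the two observed
# values that bound a numerical split. `cut_point` is a hypothetical helper.
cut_point <- function(left, right, q = 0.5) {
  stopifnot(left <= right, q >= 0, q <= 1)
  left + q * (right - left)
}

cut_point(2, 4, q = 0.0)  # 2: far left
cut_point(2, 4, q = 0.5)  # 3: halfway between the bounding values
cut_point(2, 4, q = 1.0)  # 4: far right
```

Under this reading, a value of 0 places the cut at the lower bounding value and a value of 1 at the upper one, so the choice only matters for new observations that fall between the two.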