tune.rfsrc: Tune Random Forest for the optimal mtry and nodesize...
In randomForestSRC: Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)

tune.rfsrc

R Documentation

Tune Random Forest for the optimal mtry and nodesize parameters

Description

Finds the optimal mtry and nodesize tuning parameters for a random forest using out-of-sample error. Applies to all families.

Usage


## S3 method for class 'rfsrc'
tune(formula, data,
  mtryStart = ncol(data) / 2,
  nodesizeTry = c(1:9, seq(10, 100, by = 5)), ntreeTry = 100,
  sampsize = function(x){min(x * .632, max(150, x ^ (3/4)))},
  nsplit = 1, stepFactor = 1.25, improve = 1e-3, strikeout = 3, maxIter = 25,
  trace = FALSE, doBest = TRUE, ...)

## S3 method for class 'rfsrc'
tune.nodesize(formula, data,
  nodesizeTry = c(1:9, seq(10, 150, by = 5)), ntreeTry = 100,
  sampsize = function(x){min(x * .632, max(150, x ^ (4/5)))},
  nsplit = 1, trace = TRUE, ...)
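
The default sampsize rules in the two signatures above differ only in the exponent (3/4 versus 4/5). As a rough illustration of what they imply, the sketch below evaluates both at a few sample sizes. The helper names sampsize_tune and sampsize_nodesize and the values of n are hypothetical and chosen only for this example; the function bodies are copied from the usage.

```r
## default subsample-size rules, copied from the usage signatures above
## (the helper names are hypothetical, not part of the package)
sampsize_tune     <- function(x) min(x * .632, max(150, x ^ (3/4)))
sampsize_nodesize <- function(x) min(x * .632, max(150, x ^ (4/5)))

## hypothetical sample sizes, for illustration only
n <- c(100, 1000, 10000)

## subsample size each rule would request at each n
sizes <- data.frame(n = n,
                    tune = sapply(n, sampsize_tune),
                    tune.nodesize = sapply(n, sampsize_nodesize))
print(sizes)
```

For n = 100 the 63.2 percent cap applies under both rules (63.2 cases). For n = 10000 the power term takes over, giving about 1000 cases for tune and about 1585 for tune.nodesize.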

Arguments

`formula`	A symbolic formula describing the model to be fit.
`data`	A data frame containing the response variable and predictor variables.
`mtryStart`	Initial value of `mtry` used to start the tuning search.
`nodesizeTry`	Vector of `nodesize` values over which tuning is performed.
`ntreeTry`	Number of trees used during the tuning step.
`sampsize`	Function specifying the size of the subsample. Can also be a numeric value.
`nsplit`	Number of random split points considered when splitting a node.
`stepFactor`	Multiplicative factor used to adjust `mtry` at each iteration.
`improve`	Minimum relative improvement in out-of-sample error required to continue the search.
`strikeout`	Number of consecutive non-improving steps (negative improvement) allowed before stopping the search. Increase to allow a more exhaustive search.
`maxIter`	Maximum number of iterations allowed for the `mtry` bisection search.
`trace`	If `TRUE`, prints progress during the search.
`doBest`	If `TRUE`, fits and returns a forest using the optimal `mtry` and `nodesize`.
`...`	Additional arguments passed to `rfsrc.fast`.
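
To give a feel for how stepFactor, improve, strikeout and maxIter could interact, here is a minimal sketch of a one-dimensional multiplicative step search. It is not the package's implementation: toy_error is a hypothetical stand-in for the out-of-sample error, search_direction and p (the number of predictors) are names invented for this example, and the exact stopping logic inside tune may differ.

```r
## Toy stand-in for the out-of-sample error at a given mtry (hypothetical;
## a real search would fit a forest here). Its minimum is at mtry = 7.
toy_error <- function(mtry) (mtry - 7)^2 / 100 + 0.20

## Search in one direction from mtry_start, scaling mtry by step_factor.
## Assumed rule: a step counts as a strike unless it improves the best
## error so far by more than `improve` (relative); stop after `strikeout`
## consecutive strikes, at a boundary, or after max_iter steps.
search_direction <- function(mtry_start, p, direction = c("up", "down"),
                             step_factor = 1.25, improve = 1e-3,
                             strikeout = 3, max_iter = 25) {
  direction <- match.arg(direction)
  best_mtry <- mtry_start
  best_err <- toy_error(mtry_start)
  mtry <- mtry_start
  strikes <- 0
  for (iter in seq_len(max_iter)) {
    ## propose the next mtry, forcing a move of at least one unit
    nxt <- if (direction == "up") {
      min(p, max(mtry + 1, ceiling(mtry * step_factor)))
    } else {
      max(1, min(mtry - 1, floor(mtry / step_factor)))
    }
    if (nxt == mtry) break            # reached the boundary (1 or p)
    mtry <- nxt
    err <- toy_error(mtry)
    rel_gain <- (best_err - err) / best_err
    if (rel_gain > improve) {
      best_mtry <- mtry
      best_err <- err
      strikes <- 0
    } else {
      strikes <- strikes + 1
      if (strikes >= strikeout) break
    }
  }
  c(mtry = best_mtry, err = best_err)
}

## search both directions from the starting value and keep the better one
up   <- search_direction(3, p = 12, direction = "up")
down <- search_direction(3, p = 12, direction = "down")
best <- if (up["err"] <= down["err"]) up else down
print(best)
```

In this sketch a larger strikeout lets the search step past a few non-improving values before giving up, which matches the advice above to increase it for a more exhaustive search.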

Details

tune returns a list. Its results component is a matrix with three columns: the first and second columns contain the nodesize and mtry values evaluated during the tuning process, and the third column contains the corresponding out-of-sample error.

The error is standardized. For multivariate forests, it is averaged over the outcomes; for competing risks, it is averaged over the event types.

If doBest = TRUE, the returned object also includes a forest, in its rf component, fit using the optimal mtry and nodesize values.
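
As a small illustration of reading the three-column layout described above, the sketch below picks the row with the lowest error from a made-up matrix. The matrix values and column names are hypothetical and stand in for the results of a real tuning run; columns are indexed by position, as in the plotting example further down.

```r
## Hypothetical results matrix in the layout described above:
## column 1 = nodesize, column 2 = mtry, column 3 = out-of-sample error.
## Values and column names are made up for illustration only.
results <- cbind(nodesize = c(1, 1, 5, 5),
                 mtry     = c(2, 4, 2, 4),
                 err      = c(0.31, 0.28, 0.27, 0.30))

## row with the smallest error
best <- results[which.min(results[, 3]), ]
best_nodesize <- best[[1]]
best_mtry     <- best[[2]]
print(best)
```

With a real tuning run, the same indexing would be applied to the three-column matrix returned by tune.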

All tuning calculations, including the final optimized forest, are performed using the fast forest interface rfsrc.fast, which relies on subsampling. This makes the procedure computationally efficient but approximate. Users seeking more accurate tuning results may wish to:

- Increase sampsize, which controls the size of the subsample used for tuning.
- Increase ntreeTry, which defaults to 100 for speed.

It is also helpful to visualize the out-of-sample error surface as a function of mtry and nodesize using a contour plot (see example below) to identify regions of low error.

The function tune.nodesize performs a simplified search by optimizing only over nodesize.

Author(s)

Hemant Ishwaran and Udaya B. Kogalur

Examples


## ------------------------------------------------------------
## White wine classification example
## ------------------------------------------------------------

## load the data
data(wine, package = "randomForestSRC")
wine$quality <- factor(wine$quality)

## set the sample size manually
o <- tune(quality ~ ., wine, sampsize = 100)

## here is the optimized forest 
print(o$rf)

## visualize the nodesize/mtry OOB surface
if (library("interp", logical.return = TRUE)) {

  ## nice little wrapper for plotting results
  plot.tune <- function(o, linear = TRUE) {
    x <- o$results[,1]
    y <- o$results[,2]
    z <- o$results[,3]
    so <- interp(x=x, y=y, z=z, linear = linear)
    idx <- which.min(z)
    x0 <- x[idx]
    y0 <- y[idx]
    filled.contour(x = so$x,
                   y = so$y,
                   z = so$z,
                   xlim = range(so$x, finite = TRUE) + c(-2, 2),
                   ylim = range(so$y, finite = TRUE) + c(-2, 2),
                   color.palette =
                     colorRampPalette(c("yellow", "red")),
                   xlab = "nodesize",
                   ylab = "mtry",
                   main = "error rate for nodesize and mtry",
                   key.title = title(main = "OOB error", cex.main = 1),
                   plot.axes = {axis(1);axis(2);points(x0,y0,pch="x",cex=1,font=2);
                                points(x,y,pch=16,cex=.25)})
  }

  ## plot the surface
  plot.tune(o)

}

## ------------------------------------------------------------
## tuning for class imbalanced data problem
## - see imbalanced function for details
## - use rfq and perf.type = "gmean" 
## ------------------------------------------------------------

data(breast, package = "randomForestSRC")
breast <- na.omit(breast)
o <- tune(status ~ ., data = breast, rfq = TRUE, perf.type = "gmean")
print(o)


## ------------------------------------------------------------
## tune nodesize for competing risk - wihs data 
## ------------------------------------------------------------

data(wihs, package = "randomForestSRC")
plot(tune.nodesize(Surv(time, status) ~ ., wihs, trace = TRUE)$err)

randomForestSRC documentation built on June 8, 2025, 1:12 p.m.
