tune: Tune random forest parameters

tune R Documentation

Tune random forest parameters

Description

Tune the mtry and ntree random forest parameters using a grid search approach.

Usage

tune(
  x,
  cls = "class",
  mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2, mtry(x, cls = cls) +
    mtry(x, cls = cls)/2, length.out = 4)),
  ntree_range = 1000,
  seed = 1234
)

## S4 method for signature 'AnalysisData'
tune(
  x,
  cls = "class",
  mtry_range = floor(seq(mtry(x, cls = cls) - mtry(x, cls = cls)/2, mtry(x, cls = cls) +
    mtry(x, cls = cls)/2, length.out = 4)),
  ntree_range = 1000,
  seed = 1234
)

Arguments

x

S4 object of class AnalysisData

cls

name of the sample information column to use as the response

mtry_range

numeric vector of mtry values to search

ntree_range

numeric vector of ntree values to search

seed

random number seed

Details

Parameter tuning is performed by a grid search over all combinations of the values supplied in mtry_range and ntree_range. The optimal parameter values are selected using out-of-bag error estimates: the margin metric for classification and the root-mean-square error (rmse) metric for regression.
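
As a rough illustration of the search grid, the base R sketch below evaluates the default mtry_range expression for an assumed default mtry of 10 and crosses the result with an assumed ntree_range. The value 10, the ntree values, and the names default_mtry and grid are hypothetical and chosen only for illustration; in practice the default comes from mtry(x, cls = cls).

```r
# Hypothetical default mtry; in practice this comes from mtry(x, cls = cls)
default_mtry <- 10

# Same expression as the default mtry_range argument
mtry_range <- floor(seq(default_mtry - default_mtry / 2,
                        default_mtry + default_mtry / 2,
                        length.out = 4))
mtry_range
# values: 5 8 11 15

# Hypothetical ntree values to search
ntree_range <- c(500, 1000)

# Every mtry/ntree combination that a grid search would evaluate
grid <- expand.grid(mtry = mtry_range, ntree = ntree_range)
nrow(grid)
# 8 combinations
```

With the default ntree_range = 1000, the grid has one row per mtry value, so only mtry is effectively searched.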

Value

A list containing the optimal mtry and ntree parameter values. The list is suitable for use as the rf argument of the randomForest() method.

Examples

library(metabolyseR)
library(metaboData)

## Prepare some data
x <- analysisData(abr1$neg[, 200:300], abr1$fact) %>%
  occupancyMaximum(cls = 'day') %>%
  transformTICnorm()

## Tune the `mtry` parameter for the `day` response
tune(x, cls = 'day')

jasenfinch/metabolyseR documentation built on Sept. 18, 2023, 1:25 a.m.