rfMod: Launch a RandomForest with cross-validation and the...


Description

rfMod wraps the randomForest function of the randomForest package. It fits a random forest for classification or regression, using a user-supplied column of cross-validation fold assignments and a user-specified grid of hyperparameters.

Usage

rfMod(x, y, cvcol, ntree = 500, mtry = if (!is.null(y) && !is.factor(y))
  max(floor(ncol(x)/3), 1) else floor(sqrt(ncol(x))), maxnodes = NULL,
  nodesize = if (!is.null(y) && !is.factor(y)) 5 else 1, criterion = "RMSE",
  nbcore = NULL)
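
The default expressions for mtry and nodesize in the signature above depend on whether y is a factor. The following is a minimal sketch that evaluates those same expressions outside rfMod; the helper name default_rf_params is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of optiPlus): evaluates the default
# expressions for mtry and nodesize exactly as written in the Usage signature.
default_rf_params <- function(x, y) {
  is_regression <- !is.null(y) && !is.factor(y)
  list(
    mtry = if (is_regression) max(floor(ncol(x) / 3), 1) else floor(sqrt(ncol(x))),
    nodesize = if (is_regression) 5 else 1
  )
}

# Regression case: numeric response, 10 predictors
x_reg <- mtcars[, names(mtcars) != "mpg"]
reg <- default_rf_params(x_reg, mtcars$mpg)

# Classification case: factor response, 10 predictors
x_cls <- mtcars[, names(mtcars) != "am"]
cls <- default_rf_params(x_cls, factor(mtcars$am))

str(reg)
str(cls)
```

With 10 predictors, both cases give mtry = 3, while nodesize is 5 for regression and 1 for classification.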

Arguments

x

data.frame, Predictor variables.

y

vector, Response variable.

cvcol

vector, Column with cross-validation fold index assignment per observation.

ntree

numeric, Number of trees to grow. Default is 500.

mtry

numeric, Number of variables randomly sampled as candidates at each split. Note that the default values are different for classification (sqrt(p) where p is number of variables in x) and regression (p/3).

maxnodes

numeric, Maximum number of terminal nodes trees in the forest can have. If not given, trees are grown to the maximum possible (subject to limits by nodesize).

nodesize

numeric, Minimum size of terminal nodes. Setting this number larger causes smaller trees to be grown (and thus take less time). Default is 5 for regression and 1 for classification.

criterion

character, Criterion used to select the best model among the grid of hyperparameters. It can be "RMSE", "R2", "MAPE" or "AUC". Default is "RMSE".
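
This page does not show how rfMod computes each criterion. As an illustration only, the sketch below uses common textbook definitions of RMSE, R2 and MAPE on a small made-up vector; the function names rmse, r2 and mape are hypothetical, and the formulas are assumptions that may differ from the package's internals. AUC is omitted because it applies to classification and needs class scores.

```r
# Common definitions, assumed here for illustration (not taken from optiPlus).
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
r2   <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
mape <- function(obs, pred) mean(abs((obs - pred) / obs)) * 100  # in percent

# Made-up observed and predicted values
obs  <- c(10, 20, 30, 40)
pred <- c(12, 18, 33, 37)

rmse(obs, pred)  # square root of 6.5
r2(obs, pred)    # 1 - 26/500
mape(obs, pred)  # mean absolute percentage error
```

Under these definitions, lower RMSE and MAPE indicate a better fit, while higher R2 does.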

Value

A list containing :

Examples

data(mtcars)

# Create the cross-validation column: one fold index (1 to 8) per observation
set.seed(1234)
cv <- sample(1:8, nrow(mtcars), replace = TRUE)

# Split the data into predictors (x) and response (y)
ycolumnindex <- names(mtcars) == "mpg"
x <- mtcars[, !ycolumnindex]
y <- mtcars[, ycolumnindex]

# Regression on mpg over a grid of ntree, mtry and nodesize values
rfMod(x = x, y = y, cvcol = cv,
      ntree = c(50, 100), mtry = c(3, 4),
      nodesize = c(3, 4, 5), criterion = "RMSE")
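
Before running a call like the one above, it can help to check how large the hyperparameter grid is and how the folds are distributed. The sketch below uses base R only. It assumes that rfMod evaluates every combination of the supplied vectors, which this page does not confirm. The names grid and cv_balanced are illustrative, and balanced folds are an optional alternative to the random assignment used in the example.

```r
# All combinations of the hyperparameter values passed in the example
# (assumption: rfMod explores the full Cartesian product).
grid <- expand.grid(ntree = c(50, 100), mtry = c(3, 4), nodesize = c(3, 4, 5))
nrow(grid)  # 2 * 2 * 3 = 12 combinations

# Sampling fold indices with replacement can give uneven folds.
# Shuffling a repeated 1:8 sequence gives each fold the same size instead.
set.seed(1234)
cv_balanced <- sample(rep(1:8, length.out = nrow(mtcars)))
table(cv_balanced)  # 32 observations over 8 folds, 4 per fold
```

A vector such as cv_balanced could be passed as cvcol in place of cv.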

anaislaot/optiPlus documentation built on May 23, 2019, 6:03 a.m.