find.best.number.of.trees: Using the classification error rate for each number of trees,...


View source: R/classification.R

Description

Finds the plateau in the error curve that corresponds to the minimum error, using a sliding window that is 3 trees wide.

Usage

find.best.number.of.trees(error.oob)

Arguments

error.oob

A numeric vector of error rates, one value per number of trees. Typically the out-of-bag (OOB) column of the $err.rate matrix from a randomForest::randomForest classification object.

Details

Windows with the lowest mean error are selected first. Among these, the windows with the lowest standard deviation are kept, since low variation indicates a plateau. If multiple plateaus exist, the one with the fewest trees is chosen. The tree count at the center of that window is returned as the optimal number of trees.
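The selection steps described above can be sketched in a few lines of R. This is only an illustration written from the description, not the package's source code: the function name sketch.best.number.of.trees, the width argument, and the tolerance used to compare floating-point values are all assumptions made for this example.

```r
# Hypothetical re-implementation of the described procedure (not the package source).
sketch.best.number.of.trees <- function(error.oob, width = 3) {
  n <- length(error.oob)
  stopifnot(is.numeric(error.oob), n >= width)

  # Assumed tolerance for treating two floating-point values as tied
  tol <- sqrt(.Machine$double.eps)

  # Each window is identified by the index of its first tree
  starts <- seq_len(n - width + 1)
  win.mean <- sapply(starts, function(i) mean(error.oob[i:(i + width - 1)]))
  win.sd   <- sapply(starts, function(i) stats::sd(error.oob[i:(i + width - 1)]))

  # Step 1: keep the windows whose mean error is lowest
  lowest.mean <- starts[abs(win.mean - min(win.mean)) < tol]

  # Step 2: among those, keep the windows with the lowest standard deviation
  sds <- win.sd[lowest.mean]
  plateaus <- lowest.mean[abs(sds - min(sds)) < tol]

  # Step 3: if several plateaus remain, take the one with the fewest trees
  first <- min(plateaus)

  # Step 4: report the tree count at the center of that window
  first + (width - 1) %/% 2
}

# Toy error curve that flattens out at 0.2 from the fourth tree onward
toy.error <- c(0.5, 0.4, 0.3, 0.2, 0.2, 0.2, 0.2, 0.3)
sketch.result <- sketch.best.number.of.trees(toy.error)
```

For the toy curve, the two windows covering trees 4 to 6 and 5 to 7 tie on both mean and standard deviation, so the earlier window is chosen and its center, tree 5, is returned.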

Value

A numeric value specifying the optimal number of trees to use in the random forest.

See Also

Other Classification functions: CVPredictionsRandomForest(), CVRandomForestClassificationMatrixForPheatmap(), GenerateExampleDataMachinelearnr(), LOOCVPredictionsRandomForestAutomaticMtryAndNtree(), LOOCVRandomForestClassificationMatrixForPheatmap(), RandomForestAutomaticMtryAndNtree(), RandomForestClassificationGiniMatrixForPheatmap(), RandomForestClassificationPercentileMatrixForPheatmap(), eval.classification.results()

Examples

id = c("1a", "1b", "1c", "1d", "1e", "1f", "1g", "2a", "2b", "2c", "2d", "2e",
       "2f", "3a", "3b", "3c", "3d", "3e", "3f", "3g", "3h", "3i")

x = c(18, 21, 22, 24, 26, 26, 27, 30, 31, 35, 39, 35, 30, 40, 41, 42, 44, 46,
      47, 48, 49, 54)

y = c(10, 11, 22, 15, 12, 13, 14, 33, 39, 37, 44, 40, 45, 27, 29, 20, 28, 21,
      30, 31, 23, 24)

a = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

b = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

actual = as.factor(c("1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2",
                     "2", "2", "3", "3", "3", "3", "3", "3", "3", "3", "3"))

example.data <- data.frame(id, x, y, a, b, actual)

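# Fit a random forest with a fixed seed for reproducibility
set.seed(1)
rf.result <- randomForest::randomForest(x = example.data[, c("x", "y", "a", "b")],
                                        y = example.data[, "actual"],
                                        proximity = TRUE, ntree = 50)

# Out-of-bag error rate after each tree (the "OOB" column of err.rate)
error.oob <- rf.result$err.rate[, "OOB"]

best.tree <- find.best.number.of.trees(error.oob)

trees <- seq_along(error.oob)

# Error rate as a function of the number of trees
plot(trees, error.oob, type = "l")

#dev.new()
# Scatter plot of the example data, labeled by id
plot(example.data$x, example.data$y)
text(example.data$x, example.data$y, labels = example.data$id)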

yhhc2/machinelearnr documentation built on Dec. 23, 2021, 7:19 p.m.