power_ranger: Cascade Forest's implementation in R


Description

This function attempts to replicate Cascade Forest as described in the original paper, using the ranger package.

Usage

power_ranger(x = NULL, y, training_frame, validation_frame = NULL,
  num.trees = 200, pmtry = NULL, n_forest = 4, random_forest = 2,
  eta = FALSE, num.threads = 30, work.dir = getwd(), early.stop = 10,
  continue = 0, write.forest = TRUE, save.memory = FALSE, id = NULL,
  k = NULL)

Arguments

x

A vector containing the names or indices of the predictor variables to use in building the model. If x is missing, then all columns except y are used.

y

The name of the response variable in the model. If the data does not contain a header, this is the column index, counted from left to right (the example below passes y = 1 to select the first column). The response must be either an integer or a categorical variable.

training_frame

Training data of class data.frame or matrix.

validation_frame

Validation data.

num.trees

Number of trees.

pmtry

Fraction of variables to consider as split candidates at each node. Default is the (rounded down) square root of the number of variables, divided by the total number of variables.
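
As a rough illustration of the default described above, the sketch below computes that fraction for a given number of predictors. The helper name default_pmtry is hypothetical and is not part of the package; it only restates the formula in code.

```r
# Hypothetical helper (not part of deepForest): default fraction of
# variables tried at each split, following the description above.
default_pmtry <- function(p) {
  floor(sqrt(p)) / p
}

# With 784 predictors (for example, MNIST pixels) the fraction is 28 / 784.
default_pmtry(784)
```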

n_forest

Total number of forests in each layer.

random_forest

Number of random forests (out of n_forest) in each layer.

num.threads

Number of threads.

work.dir

Type: character. When out.of.memory == TRUE, the models of each layer are saved to disk in file df1:early.stop.RData. If you don't provide a working directory, the models are saved in the current working directory (getwd()).

early.stop

Number of layers.

continue

If n_forest == continue, the function is used for prediction. If n_forest > continue, layers are added to the Cascade Forest.

write.forest

Save ranger.forest object, required for prediction. Set to FALSE to reduce memory usage if no prediction intended.

save.memory

Use memory saving (but slower) splitting mode. Warning: This option slows down the tree growing, use only if you encounter memory problems.

id

Prefix of the saved files.

k

Number of folds for k-fold cross-validation. If NULL (default), out-of-bag (OOB) predictions are used.
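
To make the k-fold option concrete, here is a minimal sketch of how rows might be split into k folds so that every row is held out exactly once. It is not taken from the package source; make_folds is a hypothetical helper, and the model fitting step is left as a comment.

```r
# Illustrative only: assign n rows to k folds of (nearly) equal size.
make_folds <- function(n, k) {
  sample(rep(seq_len(k), length.out = n))
}

set.seed(1)
k <- 3
folds <- make_folds(n = 100, k = k)

for (f in seq_len(k)) {
  train_idx <- which(folds != f)  # rows used to fit the forests
  test_idx  <- which(folds == f)  # rows that receive held-out predictions
  # fit on train_idx and predict test_idx here
}

table(folds)  # fold sizes
```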

Details

For implementation details of Cascade Forest, see https://arxiv.org/pdf/1702.08835.pdf.
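
In the paper, each forest in a layer outputs a class-probability vector, these vectors are concatenated with the original features to form the input of the next layer, and the final prediction averages the class vectors of the last layer. The sketch below shows only that data flow in base R. All objects are made up for illustration (random matrices stand in for forest outputs), and it does not reproduce the internals of power_ranger.

```r
# Illustrative sketch of one cascade step; all objects are made up.
set.seed(42)
n <- 5; p <- 4; n_class <- 3; n_forest <- 4
x <- matrix(rnorm(n * p), n, p)  # original features

# Stand-in for one forest's class-probability matrix (rows sum to 1).
fake_probs <- function() {
  m <- matrix(runif(n * n_class), n, n_class)
  m / rowSums(m)
}
class_vectors <- replicate(n_forest, fake_probs(), simplify = FALSE)

# Input to the next layer: original features plus every forest's class vector.
x_next <- cbind(x, do.call(cbind, class_vectors))
dim(x_next)  # n rows, p + n_forest * n_class columns

# Final prediction: average the class vectors, then take the largest entry.
avg <- Reduce(`+`, class_vectors) / n_forest
pred_class <- max.col(avg)
```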

Author(s)

Blanda Alessandro

Examples

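## Not run: 
rm(list = ls())
# Install and load the package
library(devtools)
install_github('ablanda/deepForest')
library(deepForest)
# Download the MNIST data from https://pjreddie.com/projects/mnist-in-csv/
dati <- read.csv('mnist_train.csv', header = FALSE)
dativ <- read.csv('mnist_test.csv', header = FALSE)

# The first column holds the digit label; the response must be categorical
dati[, 1] <- as.factor(dati[, 1])
dativ[, 1] <- as.factor(dativ[, 1])

# Fit on small subsets to keep the example fast
train <- dati[1:100, ]
valid <- dativ[1:100, ]
n_forest <- 8
m <- power_ranger(y = 1, training_frame = train, validation_frame = valid,
                  n_forest = n_forest, random_forest = 4, early.stop = 4, k = 3)

# Average, per column, the validation predictions of the forests
# stored in m$pred_val[[1]]; pred has one row per validation row
n_col <- nlevels(dati[, 1]) - 1
pred <- matrix(0, nrow(valid), n_col)
for (h in seq_len(n_col)) {
  pred_level <- sapply(seq_len(n_forest), function(j) m$pred_val[[1]][[j]][, h])
  pred[, h] <- rowMeans(pred_level)
}

## End(Not run)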

ablanda/deepForest documentation built on May 28, 2019, 3:22 p.m.