rndom2: Random forest regression model 2

View source: R/rndom2.R

Random forest regression model 2

Description

The function fits a random forest with the recursive feature elimination algorithm and performs 10-fold cross-validation.

Usage

rndom2(data, dp, Formula, N)

Arguments

data

Training data

dp

Dependent variable

Formula

A formula defining the model to fit

N

Number of trees
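
A minimal usage sketch (the call below is hypothetical: mtcars is a stand-in dataset, and passing the dependent variable name as a string to dp is an assumption inferred from the argument descriptions above, not verified against the source):

## Hypothetical call: predict mpg from all other columns with 500 trees.
fit <- rndom2(data = mtcars, dp = "mpg", Formula = mpg ~ ., N = 500)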

Details

The random forest algorithm is a collection of classification and regression trees, each grown from a bootstrap dataset drawn randomly from the original data. In each tree, variables are randomly selected for splitting at each node and the best split among those variables is chosen. The algorithm aggregates predictions from the trees by majority voting for classification or averaging for regression, and calculates an error rate on the out-of-bag (OOB) data, i.e. the observations not drawn into the bootstrap samples (Liaw and Wiener, 2002).
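
For illustration, a small sketch of fitting a single forest and reading its OOB estimates with ranger (the mtcars example is a stand-in, not part of this package):

library(ranger)
set.seed(1)
## Each tree is grown on a bootstrap sample; rows left out of a tree's
## sample are its out-of-bag (OOB) cases and give an internal error estimate.
rf <- ranger(mpg ~ ., data = mtcars, num.trees = 500,
             importance = "permutation")
rf$prediction.error   # OOB mean squared error
rf$r.squared          # OOB R^2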

The rndom2 function combines random forest (a fast implementation of random forests for high-dimensional data via the ranger package; Wright and Ziegler, 2017) with the recursive feature elimination (RFE) algorithm to mitigate collinearity and remove less relevant predictors (Gregorutti et al., 2013). The RFE algorithm consists of (1) training the random forest, (2) calculating the permutation importance scores of the variables and the coefficient of determination (R^2) on the OOB data, and (3) removing the least important variable. Steps (1) to (3) are repeated until no variable remains.

The random forest is first trained with all predictors, setting the number of predictors randomly sampled for splitting at each node (mtry) to one third of the total number of predictors. The RFE algorithm then iteratively removes predictors and records the corresponding OOB R^2, and the set of predictors with the highest OOB R^2 is retained. Once the predictor set is fixed, mtry is tuned from 1 to the total number of retained predictors, and the value with the highest OOB R^2 defines the final model. Lastly, rndom2 validates model performance by 10-fold cross-validation and outputs a dataset combining observed and predicted values.
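
A minimal sketch of this selection loop, written against the ranger API for illustration (this is not the rndom2 source; the response name mpg and the mtcars data are stand-ins):

library(ranger)
set.seed(1)
dat  <- mtcars
vars <- setdiff(names(dat), "mpg")   # candidate predictors
r2   <- numeric(0)
sets <- list()
## Steps (1)-(3): fit, score variables by permutation importance,
## drop the least important one, and record the OOB R^2 at each step.
while (length(vars) >= 1) {
  rf <- ranger(reformulate(vars, response = "mpg"), data = dat,
               num.trees = 500,
               mtry = max(1, floor(length(vars) / 3)),
               importance = "permutation")
  r2   <- c(r2, rf$r.squared)
  sets <- c(sets, list(vars))
  vars <- setdiff(vars, names(which.min(rf$variable.importance)))
}
best_vn <- sets[[which.max(r2)]]     # predictor set with the highest OOB R^2
## Tune mtry over 1..length(best_vn) on the retained predictors.
cand <- sapply(seq_along(best_vn), function(m)
  ranger(reformulate(best_vn, response = "mpg"), data = dat,
         num.trees = 500, mtry = m)$r.squared)
best_mtry <- which.max(cand)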

Value

The function returns a list that includes:

$best_vn

Variables selected by the recursive feature elimination algorithm

$best_mtry

The mtry value yielding the highest OOB R^2

$rf.af

The fitted random forest model

$rp

Relative importance (%) derived from permutation importance

$cv.r1

Dataset combining observed and predicted values from the 10-fold cross-validation.
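
A hedged sketch of inspecting the returned list (the column names "obs" and "pred" in $cv.r1 are assumptions for illustration, not documented output):

res <- rndom2(data = mtcars, dp = "mpg", Formula = mpg ~ ., N = 500)
res$best_vn     # predictors retained by RFE
res$best_mtry   # tuned mtry
res$rp          # relative importance (%)
## Cross-validated R^2 from the observed/predicted pairs; column names assumed.
with(res$cv.r1, cor(obs, pred)^2)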

Author(s)

Jung Chau-Ren

References

Liaw, A., Wiener, M., 2002. Classification and Regression by randomForest. R News 2/3, 18–22.

Wright, M.N., Ziegler, A., 2017. ranger: A fast implementation of random forests for high dimensional data in C++ and R. Journal of Statistical Software 77(1), 1–17.

Gregorutti, B., Michel, B., Saint-Pierre, P., 2013. Correlation and variable importance in random forests. Statistics and Computing 27, 1–31. DOI: 10.1007/s11222-016-9646-1

