Run_RF_Regression_Pipeline: Run_RF_Regression_Pipeline

View source: R/RF_Utilities.R

Description

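Run_RF_Regression_Pipeline runs a random forest regression pipeline over repeated random data splits. For each split it reports the cross validation RMSE for the best mtry value and the RMSE on the test data, and it saves the results to the given path.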

Usage

Run_RF_Regression_Pipeline(
  feature_table,
  actual,
  sampling = NULL,
  repeats,
  path,
  list_of_seeds
)

Arguments

feature_table

A feature table containing the samples (rows) and the features (columns) to run random forest regression on. Note that this table should not include the value being predicted.

actual

A vector containing the actual values of the variable being predicted.

sampling

The sampling technique to use during cross validation. Defaults to NULL.

repeats

The number of times the data should be randomly split into testing data and cross validation data.

path

The path that the output should be saved to.

list_of_seeds

A list of seeds, with length equal to repeats. One seed is used for each random data split.
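The arguments above can be assembled as in the following minimal sketch. The data are randomly generated placeholders, not real measurements. The sketch assumes the function is exported from a package named RandomForestUtils, that an integer vector is accepted for list_of_seeds, and that path may point into a temporary directory; none of these details is confirmed by this page. The call is skipped when the package is not installed.

```r
# Made-up example data: 30 samples (rows) by 5 features (columns).
set.seed(1)
feature_table <- as.data.frame(matrix(rnorm(30 * 5), nrow = 30, ncol = 5))
colnames(feature_table) <- paste0("feature_", 1:5)

# The value being predicted is kept out of the feature table and passed
# separately, one entry per sample.
actual <- rnorm(30)

# One seed per data split, so length(list_of_seeds) equals repeats.
# Assumption: an integer vector works here; the page only says "list".
repeats <- 3
list_of_seeds <- c(101L, 202L, 303L)

# Basic consistency checks before running anything expensive.
stopifnot(
  nrow(feature_table) == length(actual),
  length(list_of_seeds) == repeats
)

# Assumption: the function is exported from the RandomForestUtils package,
# and `path` can be a location inside a temporary directory.
if (requireNamespace("RandomForestUtils", quietly = TRUE)) {
  results <- RandomForestUtils::Run_RF_Regression_Pipeline(
    feature_table = feature_table,
    actual = actual,
    sampling = NULL,
    repeats = repeats,
    path = file.path(tempdir(), "rf_regression"),
    list_of_seeds = list_of_seeds
  )
}
```

Fixing the seeds up front means the same data splits can be reproduced on a later run.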

Value

Returns a list containing the following:

Object[[1]] contains all the median cross validation RMSE values from each data split using the best mtry value.

Object[[2]] contains all the test RMSE values from each data split.

Object[[3]] contains all the tested mtry values and the median RMSE for each, from each data split.

Object[[4]] contains the list of important features from the best model selected from each data split.

Object[[5]] contains each caret random forest model from each data split.

This function also writes a csv with the cross validation RMSE and test RMSE to the given path, as well as an RDS file that contains the resulting object from this function.
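As a rough illustration of working with the returned list, the sketch below uses a hand-built stand-in object instead of real output. The numbers are invented placeholders, and the assumption that elements 1 and 2 are plain numeric vectors (one value per data split) is not confirmed by this page. The rmse helper shows the usual definition of root mean squared error and is not taken from the package.

```r
# Usual definition of RMSE: square root of the mean squared difference
# between actual and predicted values.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Hypothetical stand-in for the returned list, for 3 data splits.
# All values are invented placeholders, not real results.
results <- list(
  c(1.10, 1.25, 1.05),  # [[1]] median cross validation RMSE per split
  c(1.20, 1.30, 1.15),  # [[2]] test RMSE per split
  NULL,                 # [[3]] tested mtry values (omitted here)
  NULL,                 # [[4]] important features (omitted here)
  NULL                  # [[5]] caret models (omitted here)
)

# Pull out the per-split errors by position.
cv_rmse <- results[[1]]
test_rmse <- results[[2]]

# Put cross validation and test error side by side, one row per split.
summary_table <- data.frame(
  split = seq_along(cv_rmse),
  cv_rmse = cv_rmse,
  test_rmse = test_rmse
)
print(summary_table)

# Average test error across splits.
mean_test_rmse <- mean(test_rmse)
```

Comparing the two columns per split gives a quick check on whether the cross validation error is a reasonable guide to the test error.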


nearinj/RandomForestUtils documentation built on July 30, 2020, 9:51 a.m.