rf_standard | R Documentation |
Convenience wrapper function that runs a Random Forest pipeline. The pipeline includes 5-fold cross-validation and hyperparameter tuning for mtry and ntree. The number of train/test splits and the train/validation ratio can be customized by the user.
rf_standard(
rf.type,
vpred,
df,
dir,
nsets = 10,
split.param = c(train.ratio = 0.8),
mtry = NA,
ntree = (1:10) * 500,
top.variables = NA,
rf.param = c(dataframe.name, predictors.name),
varimp.param = c(selection_type = NA, metric = NA, xlab = NA),
density.param = c(scale = 0.8, ncol = 2, tsize = 10, xlab = NA),
extract.names.df = NA,
experiment.note = NA
)
rf.type |
One of: |
vpred |
String. Name of response variable. |
df |
Input dataframe. |
dir |
Directory path for export. String. |
nsets |
(Optional) Number of train/validation sets to generate, Default: 10. |
split.param |
(Optional) Proportion of samples assigned to the training set, between 0 and 1. For example, 0.8 assigns 80% of samples to the training set for an 80:20 train:test split, Default: c(train.ratio = 0.8). |
mtry |
(Optional) Range of mtry values to try during Random Forest hyperparameter tuning. If NA, mtry.guide is used to select the optimal mtry based on tree type and number of predictor variables, Default: NA. |
ntree |
(Optional) Number of trees to try during Random Forest hyperparameter tuning, Default: (1:10) * 500. |
top.variables |
(Optional) Number of top important predictor variables to plot, Default: NA. |
rf.param |
Named character vector with two elements: dataframe.name, a name for the dataframe as a string, and predictors.name, a name (any description) for the predictor variables as a string. |
varimp.param |
(Optional) Parameters to pass to variable importance plot. Default: c(selection_type = NA, metric = NA, xlab = NA). |
density.param |
(Optional) Parameters to pass to density plot, Default: c(scale = 0.8, ncol = 2, tsize = 10, xlab = NA). |
extract.names.df |
(Optional) Dataframe. A df must be provided if |
experiment.note |
(Optional) Human-readable note supplied by the user that is written to the output log. May be used to record what is being run and why, Default: NA. |
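The interaction between nsets and split.param can be pictured with a small base R sketch. This is an illustration only, not the package's implementation: make_splits is a hypothetical helper, and the assumption that each set is an independent random draw of row indices is ours.

```r
# Hypothetical helper (not part of the package): draws `nsets` random
# train/validation splits of row indices, using `train.ratio` as the
# proportion of rows assigned to the training set.
make_splits <- function(n, nsets = 10, train.ratio = 0.8, seed = 1) {
  set.seed(seed)                       # fixed seed so the splits are reproducible
  n_train <- floor(train.ratio * n)    # number of rows in each training set
  lapply(seq_len(nsets), function(i) {
    train <- sort(sample.int(n, n_train))
    list(train = train, validation = setdiff(seq_len(n), train))
  })
}

# 50 rows with the documented defaults: 10 sets, each split 40/10
splits <- make_splits(n = 50)
length(splits)
length(splits[[1]]$train)
length(splits[[1]]$validation)
```

With the defaults, a 50-row dataframe would yield 10 sets of 40 training rows and 10 validation rows each.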
DETAILS
Exports Random Forest results to sub-folders within dir.
library(dplyr)  # provides %>% and select()

# Classify mouse genotype from immune populations, dropping metadata columns
rf_standard(
  rf.type = "class",
  vpred = "Genotype",
  df = blood %>% select(-c(Sex, AnimalID)),
  dir = "/Users/Documents/experiment",
  experiment.note = "Predict mouse genotype from immune populations. No genotype excluded from dataframe. Exclude sex metadata.",
  rf.param = c(dataframe.name = "blood-allgenotypes", predictors.name = "immune")
)
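To get a sense of how much work the tuning step may involve, the candidate values for mtry and ntree can be laid out as a grid before calling rf_standard. The sketch below is an assumption-laden illustration: the mtry values are chosen by hand, and whether rf_standard evaluates the full cross product of mtry and ntree is not stated in this documentation.

```r
# Illustrative values only: mtry candidates are picked by hand here;
# rf_standard selects them via mtry.guide when mtry = NA.
mtry_values  <- c(2, 3, 4, 6)
ntree_values <- (1:10) * 500   # same as the documented default for `ntree`

# Every combination of the two hyperparameters, one row per candidate
grid <- expand.grid(mtry = mtry_values, ntree = ntree_values)

nrow(grid)          # number of candidate combinations
range(grid$ntree)   # smallest and largest forest size in the grid
```

Under these assumptions the grid has 40 combinations, with forests ranging from 500 to 5000 trees, so narrowing mtry or ntree is a simple way to shorten a run.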