rf_classification_pipeline: Random Forest Classification pipeline

View source: R/RF_Utilities.R

Description

This function is not intended to be called directly. Use the Run_RF_pipeline function to run the main pipeline of this package.

Usage

rf_classification_pipeline(
  feature_table,
  classes,
  metric = "ROC",
  ntree = 999,
  nmtry = 7,
  sampling = NULL,
  nfolds = 5,
  ncrossrepeats = 10,
  pro = 0.8,
  SEED = 1995
)

Arguments

feature_table

The feature table that contains the information to be input into the random forest classifier. Note that this table should not include information about the classes that are being predicted.

classes

A vector giving the class of each sample (row) in the feature table. This can be coded as Case (level 1 factor) and Control (level 2 factor). Make sure the factor levels are correct when using AUPRC; otherwise the results may be incorrect.
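As a minimal sketch of the level ordering described above, the following base R snippet builds a class vector with Case as level 1 and Control as level 2. The sample labels are made up for illustration and are not taken from the package.

```r
# Hypothetical labels for six samples (illustrative data only).
labels <- c("Control", "Case", "Case", "Control", "Case", "Control")

# Setting `levels` explicitly fixes the order: "Case" becomes level 1
# and "Control" becomes level 2, regardless of the order in `labels`.
classes <- factor(labels, levels = c("Case", "Control"))

# Inspect the level order before passing `classes` to the pipeline.
levels(classes)
table(classes)
```

Checking levels(classes) before running with metric="PR" is a cheap way to confirm that the intended class sits at level 1.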

metric

A string that indicates whether the pipeline should use AUROC or AUPRC. For AUROC set metric="ROC". For AUPRC set metric="PR". Defaults to "ROC".

ntree

An integer that represents the number of trees to use during random forest construction. Defaults to 999.

nmtry

An integer representing the number of different mtry values to test during cross validation. The values of mtry to test are calculated as follows: mtry <- round(seq(1, number_of_features/3, length=nmtry)). Defaults to 7.
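To make the formula above concrete, here is a small worked example in base R. The value of number_of_features is an assumed, illustrative feature count; length.out is the full name of the argument that the formula abbreviates as length.

```r
# Assumed feature count, chosen only for illustration.
number_of_features <- 57
nmtry <- 7

# Same calculation as in the argument description:
# nmtry evenly spaced values from 1 to number_of_features / 3, rounded.
mtry <- round(seq(1, number_of_features / 3, length.out = nmtry))

mtry  # 1 4 7 10 13 16 19
```

With few features, rounding can yield duplicate mtry values, so the number of distinct values tested may be smaller than nmtry.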

sampling

A string indicating the type of sampling that should be done in case of imbalanced class designs. Options include: "up", "down", "SMOTE" and NULL. Defaults to NULL.

nfolds

An integer that represents the number of folds to use during cross validation. Defaults to 5.

ncrossrepeats

An integer that represents the number of times to run cross validation on k folds. Defaults to 10.

pro

The proportion of samples that should be used for training versus testing during cross validation. Defaults to 0.8.

SEED

The random seed used to split the samples during cross validation. Defaults to 1995.
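The roles of pro and SEED can be illustrated with a simple seeded split in base R. This is only a sketch under stated assumptions: it uses a plain random split on a hypothetical sample count (n_samples), and the package itself may split differently, for example stratified by class.

```r
pro  <- 0.8    # proportion of samples used for training
SEED <- 1995   # seed that makes the split reproducible

# Hypothetical number of samples (rows of the feature table).
n_samples <- 50

# Fix the random number generator, then draw the training indices.
set.seed(SEED)
train_idx <- sort(sample(seq_len(n_samples), size = round(pro * n_samples)))

# Every sample not drawn for training is held out for testing.
test_idx <- setdiff(seq_len(n_samples), train_idx)

length(train_idx)  # 40
length(test_idx)   # 10
```

Re-running the snippet with the same SEED reproduces the same split; changing SEED gives a different partition of the samples.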


nearinj/RandomForestUtils documentation built on July 30, 2020, 9:51 a.m.