Random_Gene_Pipeline: Random_gene_pipeline
In nearinj/RandomForestUtils: A Pipeline for Running Cross Validated Random Forest Models

View source: R/RF_Utilities.R

Random_gene_pipeline

Random_Gene_Pipeline(
  feature_table,
  classes,
  metric = "ROC",
  sampling = NULL,
  repeats = 10,
  path,
  nmtry = 6,
  ntree = 1001,
  nfolds = 3,
  ncrossrepeats = 10,
  pro = 0.8,
  list_of_seeds,
  list_of_random_gene_seeds
)

`feature_table`	The feature table that contains the information to be input into the random forest classifier. Note that this table should not include information about the classes that are being predicted.
`classes`	A vector that gives the class of each sample (row) in the feature table. Classes can be coded as Case (factor level 1) and Control (factor level 2). Make sure the factor levels are correct when using AUPRC, or the results may be incorrect.
`metric`	A string that indicates whether the pipeline should use AUROC or AUPRC. For AUROC set metric="ROC". For AUPRC set metric="PR". Defaults to "ROC".
`sampling`	A string indicating the type of sampling to use in case of imbalanced class designs. Options include "up", "down", "SMOTE", and NULL. Defaults to NULL.
`repeats`	The number of times data should be split into testing and cross-validation datasets.
`path`	A string giving the path where output files should be saved.
`nmtry`	An integer representing the number of different mtry values to test during cross validation. The values of mtry to test are calculated as follows: mtry <- round(seq(1, number_of_features/3, length=nmtry)). Defaults to 6.
`ntree`	An integer that represents the number of trees to use during random forest construction. Defaults to 1001.
`nfolds`	An integer that represents the number of folds to use during cross validation. Defaults to 3.
`ncrossrepeats`	An integer that represents the number of times to run cross validation on k folds. Defaults to 10.
`pro`	The proportion of samples that should be used for training versus testing during cross validation. Defaults to 0.8.
`list_of_seeds`	A vector containing a number of seeds that should be equal to the number of repeats.
`list_of_random_gene_seeds`	A matrix containing rows that correspond to the column number of the gene you want included for each repeat.
`SEED`	The random seed used to split the samples during cross validation. Defaults to 1995. Note that `SEED` does not appear in the usage signature above.
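
The argument descriptions above can be made concrete with a small sketch of how the inputs might be prepared. Everything here is illustrative: the toy data, the sizes, the variable `n_random_genes`, and the wrapper `run_pipeline` are assumptions, not part of the package. In particular, the shape chosen for `list_of_random_gene_seeds` (one row per repeat, each row holding feature column numbers) is only one reading of its description, so check R/RF_Utilities.R before relying on it.

```r
# Toy inputs; all names and sizes below are illustrative assumptions.
set.seed(1)
n_samples  <- 40
n_features <- 30

# Feature table: samples in rows, features in columns, no class column.
feature_table <- as.data.frame(
  matrix(rnorm(n_samples * n_features), nrow = n_samples)
)

# Classes: "Case" as factor level 1 and "Control" as factor level 2.
classes <- factor(
  rep(c("Case", "Control"), each = n_samples / 2),
  levels = c("Case", "Control")
)

# mtry grid, following the formula given in the nmtry description.
nmtry <- 6
mtry_grid <- round(seq(1, n_features / 3, length.out = nmtry))
print(mtry_grid)  # 1 3 5 6 8 10

# One seed per repeat.
repeats <- 10
list_of_seeds <- sample.int(10000, repeats)

# ASSUMPTION: one row per repeat, each row holding the column numbers
# of the features (genes) to include in that repeat.
n_random_genes <- 5  # hypothetical number of genes per repeat
list_of_random_gene_seeds <- t(
  replicate(repeats, sample.int(n_features, n_random_genes))
)

# Hypothetical wrapper: nothing is run or written to disk until it is called.
# Calling it requires the RandomForestUtils package to be installed and
# assumes that Random_Gene_Pipeline is exported.
run_pipeline <- function(path) {
  RandomForestUtils::Random_Gene_Pipeline(
    feature_table = feature_table,
    classes = classes,
    metric = "ROC",
    sampling = NULL,
    repeats = repeats,
    path = path,
    nmtry = nmtry,
    list_of_seeds = list_of_seeds,
    list_of_random_gene_seeds = list_of_random_gene_seeds
  )
}
```

Setting `levels` explicitly avoids depending on alphabetical ordering of the class labels, which matters when metric = "PR".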

This function returns a list with the following elements:
`Object[[1]]`	All the median cross-validation AUCs from each data split, using the best mtry value.
`Object[[2]]`	All the test AUC values from each data split.
`Object[[3]]`	All the tested mtry values and the median ROC for each, from each data split.
`Object[[4]]`	The list of important features from the best model selected from each data split.
`Object[[5]]`	Each caret random forest model from each data split.

This function also writes a CSV file with the cross-validation AUCs and test AUCs to the given path, as well as an RDS file that contains the object returned by this function.
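
As a rough illustration of how the returned list might be handled, the sketch below builds a placeholder object with the same five-element layout and indexes into it. The placeholder values (NA and NULL), the element types, and the temporary file name are assumptions; the real element classes and the real output file names are determined by the package.

```r
repeats <- 10  # matches the default in the usage signature

# Placeholder object with the documented five-element layout.
# NA and NULL stand in for real values; element types are assumptions.
result <- list(
  rep(NA_real_, repeats),   # [[1]] median cross-validation AUC per data split
  rep(NA_real_, repeats),   # [[2]] test AUC per data split
  vector("list", repeats),  # [[3]] tested mtry values and median ROC per split
  vector("list", repeats),  # [[4]] important features of the best model per split
  vector("list", repeats)   # [[5]] caret random forest model per split
)

# Elements are accessed by position.
cv_auc   <- result[[1]]
test_auc <- result[[2]]

# An object saved as RDS can be read back with readRDS().
rds_file <- tempfile(fileext = ".rds")  # hypothetical file name
saveRDS(result, rds_file)
restored <- readRDS(rds_file)
```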

nearinj/RandomForestUtils documentation built on July 30, 2020, 9:51 a.m.
