compare_two_datasets: Compare two datasets

View source: R/compare_two_datasets.R

Description

Runs a complete train and test cycle using the provided features. Both data frames must have the same features. Used to compare two datasets with the same comparison made for the insects, leaving one dataset out. The required column is Class, and its values must be "E" or "NE". Assign the result to an object to save the models, ROC results, and p-values. Plots are written to a file in the working directory.
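The input requirements above can be checked before starting a long run. The following is a minimal sketch in base R; `check_sets` and the toy data are hypothetical and not part of GeneEssentiality, and the check assumes that "same features" means identical column names.

```r
# Hypothetical helper (not part of GeneEssentiality): checks two feature
# data frames against the requirements stated in the description.
check_sets <- function(set1, set2) {
  # Assumption: "same features" means the same set of column names.
  if (!setequal(names(set1), names(set2))) {
    stop("set1 and set2 must have the same columns")
  }
  for (set in list(set1, set2)) {
    # The Class column is required.
    if (!"Class" %in% names(set)) {
      stop("missing required column 'Class'")
    }
    # Class values must be "E" or "NE".
    if (!all(as.character(set$Class) %in% c("E", "NE"))) {
      stop("Class values must be \"E\" or \"NE\"")
    }
  }
  invisible(TRUE)
}

# Toy data with made-up feature names, for illustration only.
toy1 <- data.frame(Class = c("E", "NE"), feat_a = c(0.1, 0.9))
toy2 <- data.frame(Class = c("NE", "E"), feat_a = c(0.4, 0.3))
check_sets(toy1, toy2)
```

The helper stops with an error on the first violated requirement and returns TRUE invisibly otherwise.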

Usage

compare_two_datasets(
  set1 = GeneEssentiality::drosophila_features,
  set2 = GeneEssentiality::tribolium_features,
  CPU = 20,
  seeds = GeneEssentiality::seed,
  plot_prefix = "Compared_",
  set1.name = "Set1",
  set2.name = "Set2",
  XGBT = T,
  RF = T
)
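
As an illustration of saving the output, the call below assigns the result to an object. It is a sketch only: the thread count, the set names, and the choice to disable XGBT are arbitrary, and the structure of the returned object is not documented here, so it is inspected with `str()` rather than assumed.

```r
# Runs only if the GeneEssentiality package is installed.
pkg_available <- requireNamespace("GeneEssentiality", quietly = TRUE)
if (pkg_available) {
  results <- GeneEssentiality::compare_two_datasets(
    set1 = GeneEssentiality::drosophila_features,
    set2 = GeneEssentiality::tribolium_features,
    CPU = 2,                      # lowered from the default of 20
    seeds = GeneEssentiality::seed,
    plot_prefix = "Compared_",
    set1.name = "Drosophila",
    set2.name = "Tribolium",
    XGBT = FALSE,                 # skip the slow XGBT model
    RF = TRUE
  )
  str(results, max.level = 1)     # inspect the top-level structure
}
```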

Arguments

set1

Data frame with the features of the first set. This will be used to train a model and as a test for the model trained from the second set. Default is the Drosophila melanogaster features.

set2

Data frame with the features of the second set. This will be used to train a model and as a test for the model trained from the first set. Default is the Tribolium castaneum features.

CPU

Number of threads to use

seeds

List of seeds: 30 vectors with 6 elements each, followed by a last element with a single number for the final model.
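
A custom seed list could be built as sketched below, assuming the layout is a list of 31 elements (30 integer vectors of length 6, then one single integer). The name `my_seeds`, the seed range, and the value passed to `set.seed` are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary value, used only to make this sketch reproducible

# 30 integer vectors of 6 seeds each...
my_seeds <- lapply(1:30, function(i) sample.int(100000, 6))

# ...followed by a single seed for the final model.
my_seeds[[31]] <- sample.int(100000, 1)

# Confirm the layout: 31 elements, lengths 6 (x30) and 1.
length(my_seeds)
table(lengths(my_seeds))
```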

plot_prefix

Prefix for the filenames of the plots. Default is "Compared_"

set1.name

Name to be used for the title of the plots for the first set. Default is "Set1"

set2.name

Name to be used for the title of the plots for the second set. Default is "Set2"

XGBT

Use the eXtreme Gradient Boosting trees model (slow)

RF

Use the Random Forest model


g1o/GeneEssentiality documentation built on Jan. 3, 2022, 1:21 a.m.