trainClassifier: Train Classifier for MDT

Description

Trains a classifier that predicts phenotypic response for MDT objects.

Usage

trainClassifier(mdt, id, mtable_vars = "all", phen_vars = "none",
  fill = NA,
  partitions = caret::createMultiFolds(y = response(mdt), k = 10L, times = 1L),
  preprocess = function(x, y) function(x) x,
  method = "rf", permute = FALSE, validation = TRUE, parallel = FALSE,
  verbose = TRUE, .export = NULL, ...)

## S4 method for signature 'MDT'
trainClassifier(mdt, id, mtable_vars = "all", phen_vars = "none",
  fill = NA,
  partitions = caret::createMultiFolds(y = response(mdt), k = 10L, times = 1L),
  preprocess = function(x, y) function(x) x,
  method = "rf", permute = FALSE, validation = TRUE, parallel = FALSE,
  verbose = TRUE, .export = NULL, ...)

Arguments

mdt

MDT object. FeatureIDs and SampleIDs will correspond between the MDT input and the MLGWAS output object.

id

Name identifying the classifier (ClassifierID). By default, the name of the method is used, with '.permuted' appended if applicable.

mtable_vars

Variables in mtable to be included in the matrix. One of 'none', 'all', or a subset of the values in features(mdt).

phen_vars

Variables in phenotype to be included in the matrix. One of 'none', 'all', or a subset of the values in colnames(phenotype(mdt)). SampleID and Response are excluded.

fill

How missing values in the MDT should be handled. By default, does nothing: leaves them as NA.

partitions

A function that takes an MDT object as input and returns a list of integer vectors specifying the training set indexes. See createDataPartition for such functions. The names of the list define the PartitionID.
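
As an illustration of the kind of object this argument expects, here is a minimal base R sketch that builds a named list of training set indexes from a response vector. The helper make_partitions and the toy response y are hypothetical and not part of the package. Unlike the caret helpers referenced above, this sketch makes no attempt to balance the response across folds.

```r
# Hypothetical helper (not part of the package): build k sets of
# training indexes from a response vector.
make_partitions <- function(y, k = 5L, seed = 1L) {
  set.seed(seed)
  n <- length(y)
  # Randomly assign each sample to one of k folds of near-equal size.
  fold <- sample(rep(seq_len(k), length.out = n))
  # The training set for fold i is every sample NOT held out in fold i.
  parts <- lapply(seq_len(k), function(i) which(fold != i))
  # The list names are what would become the PartitionID.
  names(parts) <- sprintf("Fold%02d", seq_len(k))
  parts
}

y <- factor(rep(c("case", "control"), each = 10))  # toy response, 20 samples
partitions <- make_partitions(y, k = 5L)
str(partitions)
```

Each element holds the row indexes used for training in one partition; the remaining samples are the ones left out of that partition.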

preprocess

A function that takes the training matrix (features as columns, samples as rows) and the response (response(mdt)) as input, and returns a function that modifies a matrix with the same number of columns (features). Can be used for feature selection or preprocessing. The returned function is created internally using only the training partition, so the same feature selection can be applied to the test set without using any of its information.
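
The function-returning-a-function contract can be sketched in base R as follows. The name top_variance, the variance filter itself, and the toy matrices are assumptions made for illustration; they are not taken from the package.

```r
# Hypothetical preprocess factory: keep the n_keep highest-variance features.
top_variance <- function(n_keep) {
  function(x, y) {
    # Feature selection is computed on the training matrix only
    # (y is accepted to match the expected signature but unused here).
    vars <- apply(x, 2, var, na.rm = TRUE)
    keep <- order(vars, decreasing = TRUE)[seq_len(min(n_keep, ncol(x)))]
    # The returned function applies the same column selection to any
    # matrix that has the same columns as the training matrix.
    function(x) x[, keep, drop = FALSE]
  }
}

# Toy training and test matrices with the same four features.
train <- cbind(a = c(0, 0, 0, 0), b = c(1, 2, 3, 4),
               c = c(0, 10, 0, 10), d = c(1, 1, 1, 2))
test  <- cbind(a = c(5, 6), b = c(7, 8), c = c(9, 10), d = c(11, 12))
y_train <- factor(c("case", "control", "case", "control"))

transform <- top_variance(2)(train, y_train)  # fitted on training data only
selected  <- transform(test)                  # applied unchanged to test data
colnames(selected)
```

Under these assumptions, such a function would be supplied as preprocess = top_variance(2) in the call to trainClassifier.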

method

a string specifying which classification or regression model to use. Possible values are found using names(getModelInfo()). See http://topepo.github.io/caret/bytag.html. A list of functions can also be passed for a custom model function. See http://topepo.github.io/caret/custom_models.html for details.

permute

logical specifying if the MDT should be permuted. The distribution of performance scores will then correspond to a null distribution and can be used to test the significance of a classifier.
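
One common way to use such a null distribution is an empirical p-value. The sketch below is an assumption about how the scores might be compared, not a description of what the package does; the score values are made up, and extracting them from an MLGWAS object is not shown.

```r
# Hypothetical performance scores (e.g. accuracy); values are illustrative.
observed    <- 0.80                       # classifier trained on the real data
null_scores <- c(0.48, 0.52, 0.55, 0.50,  # classifiers trained with
                 0.47, 0.61, 0.53, 0.49,  # permute = TRUE
                 0.51)

# Empirical p-value: share of null scores at least as good as the observed
# score, with +1 in numerator and denominator so the estimate is never zero.
p_value <- (1 + sum(null_scores >= observed)) / (1 + length(null_scores))
p_value
```

More permutations give a finer-grained null distribution and therefore a smaller attainable p-value.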

validation

logical specifying if the test set should consist of the samples not in the partition.

parallel

logical specifying if parallel interface should be used. Depends on foreach. See https://cran.r-project.org/web/packages/doParallel/vignettes/gettingstartedParallel.pdf

verbose

logical specifying if function should be run in verbose mode.

.export

character vector of variables to export. This can be useful when accessing a variable that isn't defined in the current environment. The default value is NULL.

...

Further arguments to pass on to train.

Value

MLGWAS object.

See Also

trainClassifiers, train, createDataPartition


olivmrtn/MachineLearningGWAS documentation built on May 24, 2019, 12:52 p.m.