crossValidate | R Documentation |
This function has been designed to facilitate the comparison of classification
methods using cross-validation, particularly when there are multiple assays per biological unit.
A selection of typical comparisons is implemented. The train function is a convenience method
for training on one data set, and likewise predict is for predicting on an
independent validation data set.
## S4 method for signature 'DataFrame'
crossValidate(
measurements,
outcome,
nFeatures = 20,
selectionMethod = "auto",
selectionOptimisation = "Resubstitution",
performanceType = "auto",
classifier = "auto",
multiViewMethod = "none",
assayCombinations = "all",
nFolds = 5,
nRepeats = 20,
nCores = 1,
characteristicsLabel = NULL,
extraParams = NULL,
verbose = 0
)
## S4 method for signature 'MultiAssayExperimentOrList'
crossValidate(
measurements,
outcome,
nFeatures = 20,
selectionMethod = "auto",
selectionOptimisation = "Resubstitution",
performanceType = "auto",
classifier = "auto",
multiViewMethod = "none",
assayCombinations = "all",
nFolds = 5,
nRepeats = 20,
nCores = 1,
characteristicsLabel = NULL,
extraParams = NULL,
verbose = 0
)
## S4 method for signature 'data.frame'
crossValidate(
measurements,
outcome,
nFeatures = 20,
selectionMethod = "auto",
selectionOptimisation = "Resubstitution",
performanceType = "auto",
classifier = "auto",
multiViewMethod = "none",
assayCombinations = "all",
nFolds = 5,
nRepeats = 20,
nCores = 1,
characteristicsLabel = NULL,
extraParams = NULL,
verbose = 0
)
## S4 method for signature 'matrix'
crossValidate(
measurements,
outcome,
nFeatures = 20,
selectionMethod = "auto",
selectionOptimisation = "Resubstitution",
performanceType = "auto",
classifier = "auto",
multiViewMethod = "none",
assayCombinations = "all",
nFolds = 5,
nRepeats = 20,
nCores = 1,
characteristicsLabel = NULL,
extraParams = NULL,
verbose = 0
)
## S3 method for class 'matrix'
train(x, outcomeTrain, ...)
## S3 method for class 'data.frame'
train(x, outcomeTrain, ...)
## S3 method for class 'DataFrame'
train(
x,
outcomeTrain,
selectionMethod = "auto",
nFeatures = 20,
classifier = "auto",
performanceType = "auto",
multiViewMethod = "none",
assayIDs = "all",
extraParams = NULL,
verbose = 0,
...
)
## S3 method for class 'list'
train(x, outcomeTrain, ...)
## S3 method for class 'MultiAssayExperiment'
train(x, outcome, ...)
## S3 method for class 'trainedByClassifyR'
predict(object, newData, outcome, ...)
measurements |
Either a |
outcome |
A vector of class labels of class |
... |
For |
nFeatures |
The number of features to be used for classification. If this is a single number, the same number of features will be used for all comparisons
or assays. If a numeric vector these will be optimised over using |
selectionMethod |
Default: |
selectionOptimisation |
A character of "Resubstitution", "Nested CV" or "none" specifying the approach used to optimise |
performanceType |
Performance metric to optimise if classifier has any tuning parameters. |
classifier |
Default: |
multiViewMethod |
Default: |
assayCombinations |
A character vector or list of character vectors proposing the assays or, in the case of a list, combination of assays to use
with each element being a vector of assays to combine. Special value |
nFolds |
A numeric specifying the number of folds to use for cross-validation. |
nRepeats |
A numeric specifying the number of repeats or permutations to use for cross-validation. |
nCores |
A numeric specifying the number of cores used if the user wants to use parallelisation. |
characteristicsLabel |
A character specifying an additional label for the cross-validation run. |
extraParams |
A list of parameters that will be used to overwrite default settings of transformation, selection, or model-building functions or
parameters which will be passed into the data cleaning function. The names of the list must be one of |
verbose |
Default: 0. A number between 0 and 3 for the amount of progress messages to give. A higher number will produce more messages as more lower-level functions print messages. |
x |
Same as |
outcomeTrain |
For the |
assayIDs |
A character vector for assays to train with. Special value |
object |
A fitted model or a list of such models. |
newData |
For the |
classifier can be a keyword for any of the implemented approaches as shown by available().
selectionMethod can be a keyword for any of the implemented approaches as shown by available("selectionMethod").
multiViewMethod can be a keyword for any of the implemented approaches as shown by available("multiViewMethod").
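To make the roles of nFolds and nRepeats concrete, the following base R sketch partitions samples for repeated k-fold cross-validation. It is only an illustration of the general technique and is not how ClassifyR itself builds its partitions. The helper name makeRepeatedFolds and the sample count of 23 are hypothetical and chosen for the example.

```r
# Hypothetical helper, not part of ClassifyR: assign sample indices to folds
# for repeated k-fold cross-validation.
makeRepeatedFolds <- function(nSamples, nFolds = 5, nRepeats = 20) {
  lapply(seq_len(nRepeats), function(repeatIndex) {
    # Shuffle the sample indices, then deal them into nFolds nearly equal groups.
    shuffled <- sample(nSamples)
    foldLabels <- rep(seq_len(nFolds), length.out = nSamples)
    split(shuffled, foldLabels)
  })
}

set.seed(51773)
folds <- makeRepeatedFolds(nSamples = 23, nFolds = 5, nRepeats = 20)
length(folds)        # one list of folds per repeat
lengths(folds[[1]])  # fold sizes within the first repeat
```

Within each repeat every sample is held out exactly once, so increasing nRepeats adds more independent reshuffles of the same nFolds scheme rather than more folds.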
An object of class ClassifyResult
data(asthma)
# Compare randomForest and SVM classifiers.
result <- crossValidate(measurements, classes, classifier = c("randomForest", "SVM"))
performancePlot(result)
# Compare performance of different assays.
# First make a toy example assay with multiple data types. We'll randomly assign different features to be clinical, gene or protein.
# set.seed(51773)
# measurements <- DataFrame(measurements, check.names = FALSE)
# mcols(measurements)$assay <- c(rep("clinical",20),sample(c("gene", "protein"), ncol(measurements)-20, replace = TRUE))
# mcols(measurements)$feature <- colnames(measurements)
# We'll use different nFeatures for each assay. We'll also use repeated cross-validation with 5 repeats for speed in the example.
# set.seed(51773)
# result <- crossValidate(measurements, classes, nFeatures = c(clinical = 5, gene = 20, protein = 30), classifier = "randomForest", nRepeats = 5)
# performancePlot(result)
# Merge different assays. But we will only do this for two combinations. If assayCombinations is not specified it would attempt all combinations.
# set.seed(51773)
# resultMerge <- crossValidate(measurements, classes, assayCombinations = list(c("clinical", "protein"), c("clinical", "gene")), multiViewMethod = "merge", nRepeats = 5)
# performancePlot(resultMerge)
# performancePlot(c(result, resultMerge))
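The merge example above notes that leaving assayCombinations unspecified attempts all combinations. As a rough sketch of how quickly that grows, the base R code below lists every non-empty subset of the three toy assays. This is an assumption about what "all combinations" could cover, not ClassifyR's own code. In particular, whether single-assay subsets are included is not stated on this page.

```r
# Illustration only: list every non-empty subset of a set of assay names.
assays <- c("clinical", "gene", "protein")
allCombinations <- unlist(
  lapply(seq_along(assays), function(size) combn(assays, size, simplify = FALSE)),
  recursive = FALSE
)
length(allCombinations)  # 2^3 - 1 subsets for three assays
allCombinations[[4]]     # first two-assay combination
```

Because the number of subsets doubles with each extra assay, passing an explicit list to assayCombinations, as in the example above, keeps the run time manageable.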