An alternative implementation of the previously published easy-hard classifier that, for speed, does not do nested cross-validation. In the first stage, each numeric variable is split on all possible midpoints between consecutive ordered values, and the samples below the split and above the split are checked to see if they mostly belong to one class. Categorical variables are tabulated on factor levels and the count of samples in each class is determined. If any partitions of samples are pure for a class, based on a purity threshold, prediction rules are created. The samples not classified by any rule, or classified to two or more classes the same number of times, are left to be trained by the hard classifier.
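The rule search described above can be sketched in a few lines of base R. This is an illustration only, not the package's internal code: the helper names `findNumericRules` and `findCategoricalRules` are hypothetical, and the interpretation of the two thresholds (minimum partition size and minimum class proportion) is taken from the description of `easyClassifierParams`.

```r
# Hypothetical helper (not part of ClassifyR): search one numeric variable
# for partitions that are large enough and pure enough to become rules.
findNumericRules <- function(values, classes, minCardinality = 10, minPurity = 0.9) {
  ordered <- sort(unique(values))
  if (length(ordered) < 2) return(NULL)
  # Midpoints between consecutive ordered values.
  midpoints <- (head(ordered, -1) + tail(ordered, -1)) / 2
  rules <- list()
  for (threshold in midpoints) {
    for (side in c("below", "above")) {
      inPartition <- if (side == "below") values < threshold else values > threshold
      if (sum(inPartition) < minCardinality) next
      proportions <- table(classes[inPartition]) / sum(inPartition)
      if (max(proportions) >= minPurity)
        rules[[length(rules) + 1]] <- data.frame(threshold = threshold, side = side,
          predicted = names(which.max(proportions)), stringsAsFactors = FALSE)
    }
  }
  if (length(rules) == 0) NULL else do.call(rbind, rules)
}

# Hypothetical helper: tabulate a categorical variable by factor level and
# keep the levels whose samples mostly belong to one class.
findCategoricalRules <- function(values, classes, minCardinality = 10, minPurity = 0.9) {
  counts <- table(values, classes)
  totals <- rowSums(counts)
  purity <- apply(counts, 1, max) / totals
  keep <- totals >= minCardinality & purity >= minPurity
  if (!any(keep)) return(NULL)
  pureCounts <- counts[keep, , drop = FALSE]
  data.frame(level = rownames(counts)[keep],
             predicted = colnames(counts)[apply(pureCounts, 1, which.max)],
             stringsAsFactors = FALSE)
}

# The clinical variables from the Examples section of this page.
age <- c(31, 34, 32, 39, 33, 38, 34, 37, 35, 36)
gender <- factor(c("Male", "Male", rep("Female", 8)))
classes <- factor(rep(c("Poor", "Good"), each = 5))

numericRules <- findNumericRules(age, classes, minCardinality = 2, minPurity = 0.9)
categoricalRules <- findCategoricalRules(gender, classes, minCardinality = 2, minPurity = 0.9)
```

With these inputs the categorical search yields a single rule (the Male level predicts Poor), which matches the scenario noted in the Examples section.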
## S4 method for signature 'MultiAssayExperiment'
easyHardClassifierTrain(measurements, easyDatasetID = "clinical", hardDatasetID = names(measurements)[1],
featureSets = NULL, metaFeatures = NULL, minimumOverlapPercent = 80,
datasetName = NULL, classificationName = "Easy-Hard Classifier",
easyClassifierParams = list(minCardinality = 10, minPurity = 0.9),
hardClassifierParams = list(SelectParams(), TrainParams(), PredictParams()),
verbose = 3)
## S4 method for signature 'EasyHardClassifier,MultiAssayExperiment'
easyHardClassifierPredict(model, test, predictParams, verbose = 3)
measurements |
A |
.
easyDatasetID |
The name of a data set in |
hardDatasetID |
The name of a data set in |
featureSets |
An object of type |
metaFeatures |
Either |
minimumOverlapPercent |
If |
datasetName |
A name associated with the data set used. |
classificationName |
A name associated with the classification. |
easyClassifierParams |
A list of length 2 with names "minCardinality" and "minPurity". The first parameter specifies what the minimum number of samples after a split has to be and the second specifies the minimum proportion of samples in a partition belonging to a particular class. |
hardClassifierParams |
A list of objects defining the classification to do on the samples which were not predicted
by the easy classifier. Objects must be of class |
model |
A trained |
test |
A |
predictParams |
An object of class |
verbose |
Default: 3. A number between 0 and 3 for the amount of progress messages to give. This function only prints progress messages if the value is 3. |
The easy classifier may be NULL if there are no rules that predicted the sample well using the easy data set. The hard classifier may be NULL if all of the samples could be predicted with rules generated using the easy data set, or it will simply be a character if all or almost all of the remaining samples belong to one class.
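One way to picture how these cases could combine at prediction time is sketched below. It is a simplified illustration under stated assumptions, not the method used by `easyHardClassifierPredict`: the function `predictTwoStage` and its arguments are hypothetical, rule votes are supplied directly as character vectors (an easy classifier that is NULL corresponds to every sample having no votes), and a hard classifier stored as a character is assumed to mean that its class is predicted for every remaining sample.

```r
# Hypothetical sketch (not the ClassifyR implementation).
# easyVotes: a list with one character vector per sample, holding the class
#            predicted by each easy rule that matched the sample.
# hard:      NULL, a single class name, or a function that takes the indices
#            of the remaining samples and returns their predicted classes.
predictTwoStage <- function(easyVotes, hard = NULL) {
  predictions <- rep(NA_character_, length(easyVotes))
  for (i in seq_along(easyVotes)) {
    votes <- easyVotes[[i]]
    if (length(votes) == 0) next            # No rule matched this sample.
    tally <- table(votes)
    top <- names(tally)[tally == max(tally)]
    if (length(top) == 1) predictions[i] <- top  # Ties stay unresolved.
  }
  remaining <- is.na(predictions)
  if (any(remaining) && !is.null(hard)) {
    predictions[remaining] <- if (is.character(hard)) hard else hard(which(remaining))
  }
  factor(predictions)
}

# Sample 1: two agreeing votes. Sample 2: no votes. Sample 3: a tie.
# Sample 4: one vote.
votes <- list(c("Poor", "Poor"), character(0), c("Poor", "Good"), "Good")
withConstant <- predictTwoStage(votes, hard = "Good")
withoutHard <- predictTwoStage(votes, hard = NULL)
withModel <- predictTwoStage(votes, hard = function(indices) rep("Poor", length(indices)))
```

Samples 2 and 3 are the ones passed to the second stage, because one has no matching rule and the other is tied between classes.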
For easyHardClassifierTrain, the trained two-stage classifier. For easyHardClassifierPredict, a factor vector of predicted classes.
Dario Strbenac
Inspired by: Stepwise Classification of Cancer Samples Using Clinical and Molecular Data, Askar Obulkasim, Gerrit Meijer and Mark van de Wiel 2011, BMC Bioinformatics, Volume 12 article 422, https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-422.
library(ClassifyR)
genesMatrix <- matrix(c(rnorm(90, 9, 1),
9.5, 9.4, 5.2, 5.3, 5.4, 9.4, 9.6, 9.9, 9.1, 9.8),
ncol = 10, byrow = TRUE)
colnames(genesMatrix) <- paste("Sample", 1:10)
rownames(genesMatrix) <- paste("Gene", 1:10)
genders <- factor(c("Male", "Male", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female"))
# Scenario: Male gender can predict the hard-to-classify Sample 1 and Sample 2.
clinical <- DataFrame(age = c(31, 34, 32, 39, 33, 38, 34, 37, 35, 36),
gender = genders,
class = factor(rep(c("Poor", "Good"), each = 5)),
row.names = colnames(genesMatrix))
dataset <- MultiAssayExperiment(ExperimentList(RNA = genesMatrix), clinical)
selParams <- SelectParams(featureSelection = differentMeansSelection, selectionName = "Difference in Means",
resubstituteParams = ResubstituteParams(1:10, "balanced error", "lower"))
trained <- easyHardClassifierTrain(dataset, easyClassifierParams = list(minCardinality = 2, minPurity = 0.9),
hardClassifierParams = list(selParams, TrainParams(), PredictParams()))
predictions <- easyHardClassifierPredict(trained, dataset, PredictParams())