differentMeansSelection: Selection of Differentially Abundant Features

Description

Uses an ordinary t-test if the data set has two classes, or one-way ANOVA if it has three or more classes, to select differentially abundant features.

Usage

  ## S4 method for signature 'matrix'
differentMeansSelection(measurements, classes, ...)
  ## S4 method for signature 'DataFrame'
differentMeansSelection(measurements, classes, datasetName,
               trainParams, predictParams, resubstituteParams,
               selectionName = "Difference in Means", verbose = 3)
  ## S4 method for signature 'MultiAssayExperiment'
differentMeansSelection(measurements, targets = NULL, ...)

Arguments

measurements

Either a matrix, DataFrame or MultiAssayExperiment containing the training data. For a matrix, the rows are features, and the columns are samples.

classes

A factor vector of class labels, of the same length as the number of samples in measurements. Not used if measurements is a MultiAssayExperiment object.

targets

Names of data tables to be combined into a single table and used in the analysis.

...

Variables not used by the matrix or the MultiAssayExperiment method, which are passed into and used by the DataFrame method.

datasetName

A name for the data set used. Stored in the result.

trainParams

A container of class TrainParams describing the classifier to use for training.

predictParams

A container of class PredictParams describing how prediction is to be done.

resubstituteParams

An object of class ResubstituteParams describing the performance measure to consider and the numbers of top features to try for resubstitution classification.

selectionName

A name to identify this selection method by. Stored in the result.

verbose

Default: 3. A number between 0 and 3 for the amount of progress messages to give. This function only prints progress messages if the value is 3.

Details

This selection method looks for changes in means and uses rowttests to rank the features if there are two classes or rowFtests if there are three or more classes. The choice of features is based on the best resubstitution performance.
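The two steps described above (rank features by a test of means, then pick the number of top features with the best resubstitution performance) can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: it uses t.test with equal variances and anova in place of rowttests and rowFtests, and a simple nearest-centroid rule as a stand-in for whatever classifier trainParams and predictParams would specify. The variable names and simulated data are hypothetical.

```r
set.seed(1)
# Simulated data: 100 features (rows) by 50 samples (columns).
# Features 76 to 100 have a shifted mean in the second class.
measurements <- cbind(matrix(rnorm(100 * 25, 9, 2), nrow = 100),
                      rbind(matrix(rnorm(75 * 25, 9, 2), nrow = 75),
                            matrix(rnorm(25 * 25, 14, 2), nrow = 25)))
rownames(measurements) <- paste("Gene", 1:100)
classes <- factor(rep(c("Poor", "Good"), each = 25))

# Step 1: one p-value per feature. Equal-variance t-test for two classes,
# one-way ANOVA for three or more (stand-ins for rowttests and rowFtests).
pValues <- apply(measurements, 1, function(featureValues) {
  if (nlevels(classes) == 2) {
    t.test(featureValues ~ classes, var.equal = TRUE)$p.value
  } else {
    anova(lm(featureValues ~ classes))[["Pr(>F)"]][1]
  }
})
ranking <- order(pValues)  # Row indices, most significant first.

# Step 2: resubstitution. For each candidate size, train and predict on the
# same samples and record the balanced error rate.
nFeatures <- seq(10, 100, 10)
balancedError <- sapply(nFeatures, function(n) {
  topData <- measurements[ranking[1:n], , drop = FALSE]
  # Assumed classifier: nearest centroid (one mean profile per class).
  centroids <- sapply(levels(classes), function(cl)
    rowMeans(topData[, classes == cl, drop = FALSE]))
  predicted <- apply(topData, 2, function(sampleValues)
    levels(classes)[which.min(colSums((centroids - sampleValues)^2))])
  # Balanced error: average of the per-class error rates.
  mean(sapply(levels(classes), function(cl)
    mean(predicted[classes == cl] != cl)))
})

# Keep the top-ranked features for the size with the lowest error.
bestSize <- nFeatures[which.min(balancedError)]
chosen <- rownames(measurements)[ranking[1:bestSize]]
```

When several sizes tie on the performance measure, which.min keeps the smallest one; whether the package resolves ties the same way is not stated here.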

Value

An object of class SelectResult, or a list of such objects if the classifier used to determine the specified performance metric made a number of prediction varieties.

Author(s)

Dario Strbenac

Examples

  #if(require(sparsediscrim))
  #{
    # Genes 76 to 100 have differential expression.
    genesMatrix <- sapply(1:25, function(sample) c(rnorm(100, 9, 2)))
    genesMatrix <- cbind(genesMatrix, sapply(1:25, function(sample)
                                 c(rnorm(75, 9, 2), rnorm(25, 14, 2))))
    classes <- factor(rep(c("Poor", "Good"), each = 25))
    colnames(genesMatrix) <- paste("Sample", 1:ncol(genesMatrix))
    rownames(genesMatrix) <- paste("Gene", 1:nrow(genesMatrix))
    
    resubstituteParams <- ResubstituteParams(nFeatures = seq(10, 100, 10),
                            performanceType = "balanced error", better = "lower")
    selected <- differentMeansSelection(genesMatrix, classes, "Example",
                               trainParams = TrainParams(), predictParams = PredictParams(),
                               resubstituteParams = resubstituteParams)
                               
    selected@chosenFeatures
  #}

ClassifyR documentation built on Nov. 8, 2020, 6:53 p.m.