bartlettSelection: Selection of Differential Variability with Bartlett Statistic


Description

Ranks features from largest to smallest Bartlett statistic and selects the number of top-ranked features that gives the best resubstitution performance.

Usage

  ## S4 method for signature 'matrix'
bartlettSelection(measurements, classes, ...)
  ## S4 method for signature 'DataFrame'
bartlettSelection(measurements, classes, datasetName,
                  trainParams, predictParams, resubstituteParams,
                  selectionName = "Bartlett Test", verbose = 3)
  ## S4 method for signature 'MultiAssayExperiment'
bartlettSelection(measurements, targets, ...)

Arguments

measurements

Either a matrix, DataFrame or MultiAssayExperiment containing the training data. For a matrix, the rows are features, and the columns are samples.

classes

Either a factor of class labels with the same length as the number of samples in measurements or, if measurements is a DataFrame, a character vector of length 1 containing the name of the column in measurements that holds the classes. Not used if measurements is a MultiAssayExperiment object.

targets

If measurements is a MultiAssayExperiment, the names of the data tables to be used. "clinical" is also a valid value and specifies that numeric variables from the clinical data table will be used.

...

Variables used by neither the matrix method nor the MultiAssayExperiment method, which are passed into and used by the DataFrame method.

datasetName

A name for the data set used. Stored in the result.

trainParams

A container of class TrainParams describing the classifier to use for training.

predictParams

A container of class PredictParams describing how prediction is to be done.

resubstituteParams

An object of class ResubstituteParams describing the performance measure to consider and the numbers of top features to try for resubstitution classification.
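
The resubstitution step can be pictured with a short base R sketch. This is not ClassifyR's implementation: the nearest-centroid classifier, the helper names nearestCentroid and balancedError, the illustrative data and the pre-made ranking are all assumptions chosen to keep the example self-contained. Each candidate number of top features is used to train a classifier, the same training samples are predicted, and the candidate with the best performance is kept.

```r
set.seed(2)
# Illustrative data: 50 features (rows) by 40 samples (columns).
measurements <- matrix(rnorm(50 * 40), nrow = 50)
classes <- factor(rep(c("Poor", "Good"), each = 20))
# Shift the first 10 features in the "Good" class so that they are informative.
isGood <- classes == "Good"
measurements[1:10, isGood] <- measurements[1:10, isGood] + 2
# Assume the features have already been ranked, best first (hypothetical ranking).
ranking <- 1:50

# Stand-in classifier (an assumption): assign each sample to the nearest class centroid.
nearestCentroid <- function(train, trainClasses, test) {
  centroids <- sapply(levels(trainClasses), function(level)
    rowMeans(train[, trainClasses == level, drop = FALSE]))
  centroids <- matrix(centroids, nrow = nrow(train),
                      dimnames = list(NULL, levels(trainClasses)))
  # Squared Euclidean distance from every sample to every centroid.
  distances <- apply(test, 2, function(sampleValues)
    colSums((centroids - sampleValues)^2))
  factor(levels(trainClasses)[apply(distances, 2, which.min)],
         levels = levels(trainClasses))
}

# Balanced error, taken here as the average of the per-class error rates.
balancedError <- function(actual, predicted)
  mean(sapply(levels(actual), function(level)
    mean(predicted[actual == level] != level)))

# Try each candidate number of top-ranked features on the training samples themselves.
nFeatures <- seq(5, 50, 5)
errors <- sapply(nFeatures, function(topN) {
  train <- measurements[ranking[1:topN], , drop = FALSE]
  balancedError(classes, nearestCentroid(train, classes, train))
})

# With better = "lower", this sketch keeps the first candidate with the lowest error.
bestN <- nFeatures[which.min(errors)]
selected <- ranking[1:bestN]
```

Because the classifier is evaluated on the samples it was trained on, resubstitution error is optimistic; it serves here only to choose among the candidate feature set sizes.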

selectionName

A name to identify this selection method by. Stored in the result.

verbose

Default: 3. A number between 0 and 3 for the amount of progress messages to give. This function only prints progress messages if the value is 3.

Details

The calculation of the test statistic is performed by the bartlett.test function from the stats package.

Data tables which consist entirely of non-numeric data cannot be analysed. If measurements is an object of class MultiAssayExperiment, the factor of sample classes must be stored in the DataFrame accessible by the colData function with column name "class".
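
As a rough illustration of the ranking step, the sketch below applies bartlett.test from the stats package to each feature of a small simulated matrix and orders the features by the resulting statistic. It only approximates what the function does internally; the simulated data and the variable names bartlettStatistics and ranking are assumptions made for the example.

```r
set.seed(1)
# Features in rows and samples in columns, as for the matrix method.
measurements <- matrix(rnorm(100 * 50, mean = 9, sd = 1), nrow = 100)
# Make the first 20 features more variable in the second class (samples 26 to 50).
measurements[1:20, 26:50] <- rnorm(20 * 25, mean = 9, sd = 5)
rownames(measurements) <- paste("Gene", 1:100)
classes <- factor(rep(c("Poor", "Good"), each = 25))

# One Bartlett's K-squared statistic per feature.
bartlettStatistics <- apply(measurements, 1, function(featureValues)
  unname(bartlett.test(featureValues, classes)$statistic))

# Rank features from largest to smallest statistic.
ranking <- order(bartlettStatistics, decreasing = TRUE)
head(rownames(measurements)[ranking])
```

Features whose variance differs most between the classes receive the largest statistics and therefore appear first in the ranking.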

Value

An object of class SelectResult, or a list of such objects if the classifier used to determine the specified performance metric produced more than one variety of prediction.

Author(s)

Dario Strbenac

Examples

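  library(ClassifyR) # Provides bartlettSelection and the parameter classes used below.

  # Samples in one class have differential variability (DV) relative to the other class.
  # The first 20 genes are DV.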
  genesRNAmatrix <- sapply(1:25, function(sample) c(rnorm(100, 9, 1)))
  moreVariable <- sapply(1:25, function(sample) rnorm(20, 9, 5))
  genesRNAmatrix <- cbind(genesRNAmatrix, rbind(moreVariable,
                          sapply(1:25, function(sample) rnorm(80, 9, 1))))
  colnames(genesRNAmatrix) <- paste("Sample", 1:50)
  rownames(genesRNAmatrix) <- paste("Gene", 1:100)
  genesSNPmatrix <- matrix(sample(c("None", "Missense"), 250, replace = TRUE),
                           ncol = 50)
  colnames(genesSNPmatrix) <- paste("Sample", 1:50)
  rownames(genesSNPmatrix) <- paste("Gene", 1:5)
  classes <- factor(rep(c("Poor", "Good"), each = 25))
  names(classes) <- paste("Sample", 1:50)
  genesDataset <- MultiAssayExperiment(list(RNA = genesRNAmatrix, SNP = genesSNPmatrix),
                                       colData = DataFrame(class = classes))
  # Wait for update to MultiAssayExperiment wideFormat function.  
  trainIDs <- paste("Sample", c(1:20, 26:45))
  genesDataset <- subtractFromLocation(genesDataset, training = trainIDs,
                                       targets = "RNA") # Exclude SNP data.
                                         
  resubstituteParams <- ResubstituteParams(nFeatures = seq(10, 100, 10),
                                           performanceType = "balanced error",
                                           better = "lower")
  bartlettSelection(genesDataset, datasetName = "Example", targets = "RNA",
                    trainParams = TrainParams(fisherDiscriminant),
                    predictParams = PredictParams(NULL,
                                        getClasses = function(result) result),
                    resubstituteParams = resubstituteParams)

ClassifyR documentation built on July 8, 2018, 2 a.m.