elasticNetGLMinterface: An Interface for glmnet Package's glmnet Function


Description

An elastic net GLM classifier uses a penalty which is a combination of a lasso penalty and a ridge penalty, scaled by a lambda value, to fit a sparse linear model to the data.
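As a rough illustration of how the two penalties are combined, the sketch below computes the elastic net penalty in the parameterisation documented for glmnet, where alpha sets the mix between lasso and ridge and lambda scales the total. The helper name elasticNetPenalty is hypothetical and is not part of ClassifyR or glmnet. It shows the ungrouped form only; with a grouped penalty, the lasso term is applied to each feature's coefficients across classes jointly.

```r
# Hypothetical helper for illustration; not a ClassifyR or glmnet function.
# Penalty: lambda * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)
elasticNetPenalty <- function(coefficients, lambda, alpha)
{
  lassoPart <- sum(abs(coefficients))   # L1 norm, encourages sparsity
  ridgePart <- sum(coefficients^2) / 2  # half the squared L2 norm, shrinks smoothly
  lambda * (alpha * lassoPart + (1 - alpha) * ridgePart)
}

beta <- c(0, 2, -1, 0)
lassoOnly <- elasticNetPenalty(beta, lambda = 0.1, alpha = 1)   # pure lasso
ridgeOnly <- elasticNetPenalty(beta, lambda = 0.1, alpha = 0)   # pure ridge
mixed <- elasticNetPenalty(beta, lambda = 0.1, alpha = 0.5)     # equal mix
```

Zero coefficients contribute nothing to either term, which is why a larger lasso share favours sparse models.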

Usage

## S4 method for signature 'matrix'
elasticNetGLMtrainInterface(measurements, classes, ...)
## S4 method for signature 'DataFrame'
elasticNetGLMtrainInterface(measurements, classes, lambda = NULL,
                            ..., verbose = 3)
## S4 method for signature 'MultiAssayExperiment'
elasticNetGLMtrainInterface(measurements, targets = names(measurements), ...)
## S4 method for signature 'multnet,matrix'
elasticNetGLMpredictInterface(model, test, ...)
## S4 method for signature 'multnet,DataFrame'
elasticNetGLMpredictInterface(model, test, classes = NULL, lambda, ...,
                              returnType = c("class", "score", "both"),
                              verbose = 3)
## S4 method for signature 'multnet,MultiAssayExperiment'
elasticNetGLMpredictInterface(model, test, targets = names(test), ...)

Arguments

measurements

Either a matrix, DataFrame or MultiAssayExperiment containing the training data. For a matrix, the rows are features, and the columns are samples. If of type DataFrame, the data set is subset to only those features of type numeric.

classes

Either a factor vector of class labels with one element per sample in measurements or, if measurements is of class DataFrame, a character vector of length 1 containing the name of the column in measurements that holds the classes. Not used if measurements is a MultiAssayExperiment object.

lambda

The lambda value passed directly to glmnet if the training function is used or passed as s to predict.glmnet if the prediction function is used.

test

An object of the same class as measurements with no samples in common with measurements and the same number of features as it.

targets

If measurements is a MultiAssayExperiment, the names of the data tables to be used. "clinical" is also a valid value and specifies that numeric variables from the clinical data table will be used.

...

Variables used by neither the matrix nor the MultiAssayExperiment method, which are passed into and used by the DataFrame method (e.g. verbose) or, for the training function, options that are used by the glmnet function. For the prediction function, this variable simply contains any parameters passed to it from the classification framework which aren't used by glmnet's predict function.

model

A trained elastic net GLM, as created by the glmnet function.

returnType

Default: "class". Either "class", "score" or "both". Sets the return value from the prediction to either a vector of class labels, matrix of scores for each class, or both labels and scores in a data.frame.

verbose

Default: 3. A number between 0 and 3 for the amount of progress messages to give. This function only prints progress messages if the value is 3.

Details

If measurements is an object of class MultiAssayExperiment, the factor of sample classes must be stored in the DataFrame accessible by the colData function with column name "class".

The value of the family parameter is fixed to "multinomial" so that classification with more than 2 classes is possible and type.multinomial is fixed to "grouped" so that a grouped lasso penalty is used. During classifier training, if more than one lambda value is considered by specifying a vector of them as input or leaving the default value of NULL, then the chosen value is determined based on classifier resubstitution error rate.
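The selection step described above can be sketched directly with glmnet. This is a minimal illustration and not ClassifyR's own code: it assumes the plain misclassification rate as the resubstitution metric and picks the first lambda on the path with the lowest error, both of which are choices made for this example. The simulated data and variable names are also illustrative.

```r
if (requireNamespace("glmnet", quietly = TRUE))
{
  set.seed(1)
  # Features in rows and samples in columns, as for the matrix method.
  # The last 25 features are shifted upwards in the second 25 samples.
  genesMatrix <- cbind(matrix(rnorm(100 * 25, 9, 2), nrow = 100),
                       rbind(matrix(rnorm(75 * 25, 9, 2), nrow = 75),
                             matrix(rnorm(25 * 25, 14, 2), nrow = 25)))
  classes <- factor(rep(c("Poor", "Good"), each = 25))

  # glmnet expects samples in rows, so the matrix is transposed.
  fitted <- glmnet::glmnet(t(genesMatrix), classes, family = "multinomial",
                           type.multinomial = "grouped")

  # Predict the training samples again: one column of labels per lambda value.
  resubstituted <- predict(fitted, newx = t(genesMatrix), type = "class")

  # Resubstitution error for each lambda (assumed metric: misclassification rate).
  errorRates <- colMeans(resubstituted != as.character(classes))
  bestLambda <- fitted[["lambda"]][which.min(errorRates)]
  bestLambda
}
```

Because the error is measured on the training samples themselves, it is optimistic and tends to favour less heavily penalised models than cross-validation would.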

Value

For elasticNetGLMtrainInterface, an object of type glmnet. For elasticNetGLMpredictInterface, either a factor vector of predicted classes, a matrix of scores for each class, or a table of both the class labels and class scores, depending on the setting of returnType.
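To make the three kinds of prediction output concrete, the following sketch calls glmnet's predict method directly, with type = "class" for labels and type = "response" for per-class probabilities, and then joins them in a data.frame. It only approximates what the interface returns; the simulated data, the arbitrary choice of a lambda value from the middle of the path and the column layout of the combined table are assumptions made for illustration.

```r
if (requireNamespace("glmnet", quietly = TRUE))
{
  set.seed(2)
  # Samples in rows here, because glmnet is called directly.
  trainMatrix <- matrix(rnorm(40 * 20), nrow = 40)
  trainMatrix[21:40, 1:5] <- trainMatrix[21:40, 1:5] + 3
  trainClasses <- factor(rep(c("Poor", "Good"), each = 20))
  testMatrix <- matrix(rnorm(10 * 20), nrow = 10)

  model <- glmnet::glmnet(trainMatrix, trainClasses, family = "multinomial",
                          type.multinomial = "grouped")

  # Arbitrary lambda from the middle of the path, for illustration only.
  lambdas <- model[["lambda"]]
  lambda <- lambdas[ceiling(length(lambdas) / 2)]

  # Similar to returnType = "class": a factor of predicted labels.
  labels <- predict(model, newx = testMatrix, s = lambda, type = "class")
  labels <- factor(as.character(labels), levels = levels(trainClasses))

  # Similar to returnType = "score": one column of probabilities per class.
  scores <- predict(model, newx = testMatrix, s = lambda, type = "response")[, , 1]

  # Similar to returnType = "both": labels and scores side by side.
  both <- data.frame(class = labels, scores, check.names = FALSE)
  head(both)
}
```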

Author(s)

Dario Strbenac

Examples

  if(require(glmnet))
  {
    # Genes 76 to 100 have differential expression.
    genesMatrix <- sapply(1:25, function(sample) c(rnorm(100, 9, 2)))
    genesMatrix <- cbind(genesMatrix, sapply(1:25, function(sample)
                                      c(rnorm(75, 9, 2), rnorm(25, 14, 2))))
    classes <- factor(rep(c("Poor", "Good"), each = 25))
    colnames(genesMatrix) <- paste("Sample", 1:ncol(genesMatrix))
    rownames(genesMatrix) <- paste("Gene", 1:nrow(genesMatrix))
    
    resubstituteParams <- ResubstituteParams(nFeatures = seq(10, 100, 10),
                                             performanceType = "balanced error",
                                             better = "lower")
                                             
    # lambda is automatically tuned, based on glmnet defaults, if not user-specified.                 
    trainParams <- TrainParams(elasticNetGLMtrainInterface, nlambda = 500,
                               getFeatures = elasticNetFeatures)
    predictParams <- PredictParams(elasticNetGLMpredictInterface)                           
    classified <- runTests(genesMatrix, classes, datasetName = "Example",
                           classificationName = "Differential Expression",
                           validation = "fold",
                           params = list(trainParams, predictParams))
                           
    classified <- calcCVperformance(classified, "balanced error")
    head(tunedParameters(classified))
    performance(classified)
  }

ClassifyR documentation built on Nov. 8, 2020, 6:53 p.m.