tunePcaLda: Build a classifier with parameter tuning.

Description

Optimizes the number of principal components used in LDA by means of a cross-validation procedure.

Usage

 tunePcaLda(data, label, batch = NULL, nPC = 1:50, 
            optMerit = c("Accuracy", "Sensitivity")[2], 
            maximize = TRUE, 
            cv = c("CV", "BV")[2], 
            nPart = 10, ...)

Arguments

data

a data matrix, with samples saved in rows and features in columns.

label

a vector of response variables (i.e., group/concentration info), must be the same length as the number of samples.

batch

a vector of batch variables (i.e., batch/patient ID), must be given in case of cv='BV'. Ideally, this should be the identification of the samples at the highest hierarchy (e.g., the patient ID rather than the spectral ID). Ignored for cv='CV'.

nPC

a vector of integers, the candidate numbers of principal components to be used for LDA, out of which an optimal value will be selected.

optMerit

a character value, the name of the merit to be optimized. The mean sensitivity will be optimized if optMerit = "Sensitivity".

maximize

a boolean value, whether to maximize (TRUE) or minimize (FALSE) the merit.

cv

a character value specifying the type of cross-validation: "CV" for normal k-fold or "BV" for batch-wise cross-validation.

nPart

an integer, the number of partitions (folds) into which the data are split for cross-validation. Equivalent to nFold of crossValidation for cv='CV' and to nBatch for cv='BV'. (NOTE: use nPart=0 for leave-one-batch-out cross-validation.)

...

parameters for crossValidation
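
The difference between the two partitioning modes described for batch and nPart can be illustrated with a minimal sketch. The helper assign_folds below is hypothetical and not part of rModeling; it only assumes that batch-wise splitting keeps all samples of one batch in the same fold, and that nPart=0 means one fold per batch.

```r
# Hypothetical helper (not the package's implementation): assign samples to folds.
assign_folds <- function(n, batch = NULL, n_part = 10, seed = 1) {
  set.seed(seed)
  if (is.null(batch)) {
    # Normal k-fold: samples are shuffled individually.
    return(sample(rep(seq_len(n_part), length.out = n)))
  }
  batches <- unique(batch)
  # Leave-one-batch-out: one fold per batch when n_part is 0.
  if (n_part == 0) n_part <- length(batches)
  # Batch-wise: folds are drawn per batch, then copied to every sample of that batch.
  batch_fold <- sample(rep(seq_len(n_part), length.out = length(batches)))
  batch_fold[match(batch, batches)]
}

batch <- rep(c("p1", "p2", "p3", "p4"), each = 5)  # e.g., four patients
folds_cv   <- assign_folds(20, n_part = 5)          # samples of one patient may be split
folds_bv   <- assign_folds(20, batch, n_part = 2)   # each patient stays in one fold
folds_lobo <- assign_folds(20, batch, n_part = 0)   # one fold per patient
```

Keeping a patient's samples together avoids testing on data that are strongly correlated with the training data, which is the reason for supplying batch at the highest hierarchy.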

Details

Builds a classifier for each value in nPC and evaluates its performance with a normal k-fold or batch-wise cross-validation. The optimal number is the one giving the maximal (maximize=TRUE) or minimal (maximize=FALSE) merit.
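
The selection procedure can be sketched with base R and MASS as follows. This is an illustration under stated assumptions, not the package's code: the helpers mean_sensitivity and cv_merit are hypothetical, the iris data stand in for real spectra, and a plain 3-fold split is used.

```r
library(MASS)  # provides lda()

# Hypothetical helper: mean per-class sensitivity (recall).
mean_sensitivity <- function(truth, pred) {
  mean(sapply(levels(truth), function(cl) mean(pred[truth == cl] == cl)))
}

# Hypothetical helper: cross-validated merit of PCA-LDA for one candidate n_pc.
cv_merit <- function(x, y, n_pc, folds) {
  pred <- factor(rep(NA, length(y)), levels = levels(y))
  for (k in unique(folds)) {
    test <- folds == k
    # Fit PCA on the training part only, then project the test part.
    pca <- prcomp(x[!test, , drop = FALSE], center = TRUE, scale. = FALSE)
    train_scores <- pca$x[, seq_len(n_pc), drop = FALSE]
    test_scores <- predict(pca, x[test, , drop = FALSE])[, seq_len(n_pc), drop = FALSE]
    fit <- lda(train_scores, grouping = y[!test])
    pred[test] <- predict(fit, test_scores)$class
  }
  mean_sensitivity(y, pred)
}

x <- as.matrix(iris[, 1:4])
y <- iris$Species
set.seed(1)
folds <- sample(rep(1:3, length.out = nrow(x)))

candidates <- 1:4  # plays the role of nPC
merits <- sapply(candidates, function(n) cv_merit(x, y, n, folds))

maximize <- TRUE
best_n_pc <- if (maximize) candidates[which.max(merits)] else candidates[which.min(merits)]
```

Fitting the PCA inside each fold, rather than once on all data, keeps the test samples out of the model-building step.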

A two-layer cross-validation can be performed by using tunePcaLda as the method in crossValidation.

Value

A list of elements:

PCA

PCA model

LDA

LDA model built with the optimal number of principal components

nPC

the optimal number of principal components
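
A returned list of this shape could be applied to new samples roughly as shown below. The sketch assumes that PCA is a prcomp object and LDA is a MASS::lda object fitted on the first nPC scores, which this page suggests but does not state; the list res is a mock built by hand, not actual output of tunePcaLda.

```r
library(MASS)  # provides lda()

x <- as.matrix(iris[, 1:4])
y <- iris$Species
n_pc <- 2  # stands in for the tuned value

# Mock result using the documented element names (PCA, LDA, nPC).
pca <- prcomp(x, center = TRUE, scale. = FALSE)
res <- list(PCA = pca,
            LDA = lda(pca$x[, seq_len(n_pc), drop = FALSE], grouping = y),
            nPC = n_pc)

# Project new samples with the PCA model, keep the first nPC scores,
# then classify them with the LDA model.
new_x <- x[c(1, 51, 101), , drop = FALSE]
scores <- predict(res$PCA, new_x)[, seq_len(res$nPC), drop = FALSE]
pred <- predict(res$LDA, scores)$class
```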

Author(s)

Shuxia Guo, Thomas Bocklitz, Juergen Popp

References

S. Guo, T. Bocklitz, et al., Common mistakes in cross-validating classification models. Analytical Methods 2017, 9 (30): 4410-4417.

See Also

crossValidation, tunePcaLda, lda, prcomp

Examples

  library(rModeling)
  data(DATA)
  ### perform parameter tuning with a 3-fold cross-validation
  RES2 <- tunePcaLda(data = DATA$spec,
                     label = DATA$labels,
                     batch = DATA$batch,   # ignored here because cv = 'CV'
                     nPC = 2:4,
                     cv = c('CV', 'BV')[1],
                     nPart = 3,
                     optMerit = c('Accuracy', 'Sensitivity')[2],
                     center = TRUE,
                     scale = FALSE)

rModeling documentation built on March 26, 2020, 7:48 p.m.