MVA.test: Significance test based on cross (model) validation

View source: R/MVA.test.R

MVA.testR Documentation

Significance test based on cross (model) validation

Description

Performs a permutation significance test based on cross (model) validation with different PLS and/or discriminant analyses. See MVA.cv and MVA.cmv for more details about how cross (model) validation is performed.

Usage

MVA.test(X, Y, cmv = FALSE, ncomp = 8, kout = 7, kinn = 6, scale = TRUE,
  model = c("PLSR", "CPPLS", "PLS-DA", "PPLS-DA", "LDA", "QDA", "PLS-DA/LDA",
  "PLS-DA/QDA", "PPLS-DA/LDA","PPLS-DA/QDA"), Q2diff = 0.05, lower = 0.5,
  upper = 0.5, Y.add = NULL, weights = rep(1, nrow(X)), set.prior = FALSE,
  crit.DA = c("plug-in", "predictive", "debiased"), p.method = "fdr",
  nperm = 999, progress = TRUE, ...)

Arguments

X

a data frame of independent variables.

Y

the dependent variable(s): numeric vector, data frame of quantitative variables or factor.

cmv

a logical indicating if the values (Q2 or NMC) should be generated through cross-validation (classical K-fold process) or cross model validation (inner + outer loops).

ncomp

an integer giving the number of components to be used to generate all submodels (cross-validation) or the maximal number of components to be tested in the inner loop (cross model validation). Can be re-set internally if needed. Does not concern LDA and QDA.

kout

an integer giving the number of folds (cross-validation) or the number of folds in the outer loop (cross-model validation). Can be re-set internally if needed.

kinn

an integer giving the number of folds in the inner loop (cross model validation only). Can be re-set internally if needed. Cannot be > kout.

scale

logical indicating if data should be scaled. See help of MVA.cv and MVA.cmv.

model

the model to be fitted.

Q2diff

the threshold to be used if the number of components is chosen according to Q2 (cross model validation only).

lower

a vector of lower limits for power optimisation in CPPLS or PPLS-DA (see cppls.fit).

upper

a vector of upper limits for power optimisation in CPPLS or PPLS-DA (see cppls.fit).

Y.add

a vector or matrix of additional responses containing relevant information about the observations, in CPPLS or PPLS-DA (see cppls.fit).

weights

a vector of individual weights for the observations, in CPPLS or PPLS-DA (see cppls.fit).

set.prior

only used when a LDA or QDA is performed (coupled or not with a PLS model). If TRUE, the prior probabilities of class membership are defined according to the mean weight of individuals belonging to each class. If FALSE, prior probabilities are obtained from the data sets on which LDA/QDA models are built.

crit.DA

criterion used to predict class membership when a LDA or QDA is used. See predict.lda.

p.method

method for p-values correction. See help of p.adjust.

nperm

number of permutations.

progress

logical indicating if the progress bar should be displayed.

...

other arguments to pass to plsr (PLSR, PLS-DA) or cppls (CPPLS, PPLS-DA).

Details

When Y consists in quantitative response(s), the null hypothesis is that each response is not predicted better than what would happen by chance. In this case, Q2 is used as the test statistic. When Y contains several responses, a p-value is computed for each response and p-values are corrected for multiple testing.

When Y is a factor, the null hypothesis is that the factor has no discriminant ability. In this case, the classification error rate (NMC) is used as the test statistic.

Whatever the response, the reference value of the test statistics is obtained by averaging 20 values coming from independently performed cross (model) validation on the original data.

The function deals with the limitted floating point precision, which can bias calculation of p-values based on a discrete test statistic distribution.

Value

method

a character string indicating the name of the test.

data.name

a character string giving the name(s) of the data, plus additional information.

statistic

the value of the test statistics.

permutations

the number of permutations.

p.value

the p-value of the test.

p.adjust.method

a character string giving the method for p-values correction.

Author(s)

Maxime HERVE <maxime.herve@univ-rennes1.fr>

References

Westerhuis J, Hoefsloot HCJ, Smit S, Vis DJ, Smilde AK, van Velzen EJJ, van Duijnhoven JPM and van Dorsten FA (2008) Assessment of PLSDA cross validation. Metabolomics 4:81-89.

See Also

MVA.cv, MVA.cmv

Examples

require(pls)
require(MASS)

# PLSR
data(yarn)
## Not run: MVA.test(yarn$NIR,yarn$density,cmv=TRUE,model="PLSR")

# PPLS-DA coupled to LDA
data(mayonnaise)
## Not run: MVA.test(mayonnaise$NIR,factor(mayonnaise$oil.type),model="PPLS-DA/LDA")

RVAideMemoire documentation built on Nov. 6, 2023, 5:07 p.m.