selwik: Heuristic selection of the dimension of PLSR models with a...
In mlesnoff/rnirs: Dimension reduction, Regression and Discrimination for Chemometrics

selwik

R Documentation

Heuristic selection of the dimension of PLSR models with a permutation test on scores

Description

The function helps select the dimension (i.e. the number of components) of PLSR models.

The method was proposed by Wiklund et al. 2007 and Faber et al. 2007. For a given PLS score t, the principle is to compare the observed covariance Cov(Y, t) (where Y is the response) to the distribution H0 of simulated Cov(Y, t) values computed on randomly permuted data (in which the relation between Y and X is assumed to have been removed). An observed covariance that is significant relative to distribution H0 is expected to indicate a meaningful dimension.

The method can be time-consuming, especially for large datasets, since permutations are conditional on each component taken successively (successive one-dimension PLSRs). A one-dimension PLSR is first fitted, the Y data are randomly permuted (referred to as "Y-scrambling"), and distribution H0 is computed. Then the information contained in the first dimension is removed from the data by deflation, the next dimension is studied with a new one-dimension PLSR, and so on.
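The procedure described above can be sketched in base R for the univariate case (PLSR1). This is a minimal illustration under stated assumptions, not the code used by `selwik`: the helper name `perm_test_pls1` is hypothetical, observations are equally weighted, and the score is computed with a plain NIPALS-style step rather than with `pls_kernel`.

```r
# Hypothetical helper (not part of rnirs): permutation test on successive
# one-dimension PLS1 scores, with deflation between components.
perm_test_pls1 <- function(X, y, ncomp, nperm = 50, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  X <- scale(as.matrix(X), center = TRUE, scale = FALSE)  # column-center X
  y <- as.numeric(y)
  y <- y - mean(y)                                        # center y
  n <- nrow(X)

  # First PLS score of the current data for a given response vector
  first_score <- function(X, yy) {
    w <- crossprod(X, yy)        # weight vector, proportional to X'y
    w <- w / sqrt(sum(w^2))      # normalize to unit length
    as.numeric(X %*% w)
  }
  # Covariance between a response vector and its first PLS score
  cov_score <- function(X, yy) sum(yy * first_score(X, yy)) / (n - 1)

  pval <- numeric(ncomp)
  for (a in seq_len(ncomp)) {
    obs <- cov_score(X, y)                            # observed Cov(y, t)
    h0 <- replicate(nperm, cov_score(X, sample(y)))   # Y-scrambling
    pval[a] <- mean(h0 > obs)                         # one-sided p-value

    # Deflation: remove the information carried by the current score
    sc <- first_score(X, y)
    tt <- sum(sc^2)
    p <- crossprod(X, sc) / tt    # X-loadings
    q <- sum(y * sc) / tt         # y-loading
    X <- X - tcrossprod(sc, p)
    y <- y - sc * q
  }
  data.frame(ncomp = seq_len(ncomp), pval = pval)
}

# Usage on simulated data (assumed for illustration only)
set.seed(1)
X <- matrix(rnorm(60 * 10), 60, 10)
y <- as.numeric(X %*% rep(1, 10)) + rnorm(60, sd = 0.1)
res <- perm_test_pls1(X, y, ncomp = 3, nperm = 50, seed = 1)
res
```

Each component requires `nperm` additional one-dimension fits, which is why the cost grows quickly with `ncomp`, `nperm` and the size of the data.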

Wiklund et al. 2007 and Faber et al. 2007 presented the method for PLSR1 models only (univariate Y). The function extends the method to PLSR2 (multivariate Y).

The function returns the p-value of the one-sided test, i.e. the proportion of distribution H0 that is higher than the observed covariance.
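As a small worked example of this definition, the p-value is simply the share of permuted covariances that exceed the observed one. The numbers below are made up for illustration and do not come from the package.

```r
# Hypothetical values, for illustration only
h0  <- c(0.10, 0.50, 0.30, 0.90)  # covariances simulated under permutation
obs <- 0.40                       # observed covariance
pval <- mean(h0 > obs)            # proportion of H0 values above obs
pval
# → 0.5
```

With `nperm` permutations the p-value can only take values that are multiples of `1 / nperm`, so a small `nperm` gives a coarse test.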

Usage


selwik(
    X, Y, ncomp, 
    algo = NULL, weights = NULL,
    nperm = 50, seed = NULL, 
    print = TRUE, 
    ...
    )

Arguments

`X`	A `n x p` matrix or data frame of variables.
`Y`	A `n x q` matrix or data frame, or vector of length `n` for PLS1, of responses.
`ncomp`	The maximal number of scores (i.e. components = latent variables) to be calculated.
`algo`	A function implementing a PLS. Defaults to `NULL` (`pls_kernel` is used).
`weights`	A vector of length `n` defining a priori weights to apply to the observations. Internally, weights are "normalized" to sum to 1. Defaults to `NULL` (weights are set to `1 / n`).
`nperm`	Number of random permutations.
`seed`	An integer defining the seed for the random simulation, or `NULL` (default). See `set.seed`.
`print`	Logical. If `TRUE`, fitting information is printed.
`...`	Optional arguments to pass to the function defined in `algo`.

Value

A list of outputs; see the examples, which use its elements `ncomp` and `pval`.

References

Faber, N.M., Rajko, R., 2007. How to avoid over-fitting in multivariate calibration: The conventional validation approach and an alternative. Analytica Chimica Acta, Papers presented at the 10th International Conference on Chemometrics in Analytical Chemistry 595, 98-106. https://doi.org/10.1016/j.aca.2007.05.030

Wiklund, S., Nilsson, D., Eriksson, L., Sjöström, M., Wold, S., Faber, K., 2007. A randomization test for PLS component selection. Journal of Chemometrics 21, 427-439. https://doi.org/10.1002/cem.1086

Examples


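library(rnirs)

# Load the example dataset shipped with the package
data(datcass)
Xr <- datcass$Xr
yr <- datcass$yr

# Permutation test on the first 20 PLS scores (30 permutations per score)
z <- selwik(Xr, yr, ncomp = 20, nperm = 30)
names(z)

# p-value of each component, with the significance threshold as a grey line
plot(z$ncomp, z$pval,
     type = "b", pch = 16, col = "#045a8d",
     xlab = "Nb components", ylab = "p-value",
     main = "Wiklund et al. test")
alpha <- .10
abline(h = alpha, col = "grey")

# Keep the components that precede the first non-significant one;
# if every component is significant, keep them all
u <- which(z$pval >= alpha)
opt <- if (length(u) > 0) min(u) - 1 else length(z$pval)
opt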

mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.
