# kplsr: Non linear kernel PCR and PLSR Models In mlesnoff/rnirs: Dimension reduction, Regression and Discrimination for Chemometrics

 kplsr R Documentation

## Non linear kernel PCR and PLSR Models

### Description

KPCR and KPLSR

Function `kpcr` fits a KPCR model, i.e. a regression on latent variables (scores) of KPCA (Scholkopf et al. 1997, Scholkopf & Smola 2002, Tipping 2001), computed by `kpca`.
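The idea behind KPCR can be sketched in base R as follows. This is a minimal illustration, not the `kpcr` or `kpca` implementation: the helper names `rbf_kernel` and `center_gram`, the `gamma` parameter and the simulated data are all assumptions made for the example.

```r
# Gaussian (RBF) kernel between the rows of A and the rows of B.
# `rbf_kernel` and `gamma` are illustrative names, not rnirs functions.
rbf_kernel <- function(A, B, gamma = 1) {
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  exp(-gamma * pmax(d2, 0))
}

# Center a Gram matrix in feature space, using training means only.
# `center_gram` is a hypothetical helper.
center_gram <- function(K, Kr) {
  sweep(sweep(K, 2, colMeans(Kr)), 1, rowMeans(K)) + mean(Kr)
}

set.seed(1)
n <- 30; m <- 5
Xr <- matrix(rnorm(n * 2), ncol = 2)
yr <- sin(Xr[, 1]) + 0.1 * rnorm(n)
Xu <- matrix(rnorm(m * 2), ncol = 2)
ncomp <- 3

# Gram matrices: training vs training, and new vs training
Kr <- rbf_kernel(Xr, Xr)
Ku <- rbf_kernel(Xu, Xr)
Krc <- center_gram(Kr, Kr)
Kuc <- center_gram(Ku, Kr)

# KPCA: eigen-decomposition of the centered training Gram matrix
e <- eigen(Krc, symmetric = TRUE)
V <- e$vectors[, seq_len(ncomp), drop = FALSE]
lambda <- e$values[seq_len(ncomp)]

# Scores of the training and new observations
Tr <- Krc %*% sweep(V, 2, sqrt(lambda), "/")
Tu <- Kuc %*% sweep(V, 2, sqrt(lambda), "/")

# Regression of the centered response on the scores.
# The scores are orthogonal with squared norms lambda, so the
# least-squares coefficients reduce to an element-wise division.
b <- crossprod(Tr, yr - mean(yr)) / lambda
pred <- mean(yr) + drop(Tu %*% b)
```

Because the scores are mutually orthogonal, adding a component does not change the coefficients of the previous ones, which is why predictions for all numbers of components up to `ncomp` can be obtained from a single decomposition.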

Function `kplsr` fits KPLSR models with the NIPALS algorithm (implemented in `kpls_nipals`), as described in Rosipal & Trejo (2001).

The kernel Gram matrices `K` are internally centered before the analyses, but the data are not column-wise scaled (there is no argument `scale` in the function). If needed, the user has to do the scaling before using the function.
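Since scaling is left to the user, one possible way to do it is shown below with base R `scale`. It is only a sketch on simulated data: the training centers and standard deviations are reused for the new observations so that both sets are expressed in the same units. Centering the columns here is a choice made for the example, not a requirement stated by the package.

```r
set.seed(1)
# Simulated columns with very different spreads (illustrative data)
Xr <- sweep(matrix(rnorm(8 * 3), ncol = 3), 2, c(1, 5, 20), "*")
Xu <- sweep(matrix(rnorm(2 * 3), ncol = 3), 2, c(1, 5, 20), "*")

# Scale the training data and keep the statistics that were used
Xr_s <- scale(Xr, center = TRUE, scale = TRUE)
ctr <- attr(Xr_s, "scaled:center")
sdv <- attr(Xr_s, "scaled:scale")

# Apply the training statistics to the new observations
Xu_s <- scale(Xu, center = ctr, scale = sdv)

# Xr_s and Xu_s would then be passed in place of Xr and Xu
```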

Row observations can optionally be weighted with a priori weights (using argument `weights`).
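To make the role of the weights concrete, the sketch below normalizes a weight vector to sum to 1 and uses it to center a Gram matrix. The weighted centering formula is the standard feature-space one; that the package applies exactly this formula internally is an assumption. A linear kernel is used only because it allows the result to be checked against direct centering of `X`.

```r
set.seed(1)
n <- 8
X <- matrix(rnorm(n * 3), ncol = 3)

w <- seq_len(n)          # a priori weights, e.g. 1:nrow(X)
w <- w / sum(w)          # normalized to sum to 1

K <- tcrossprod(X)       # linear kernel, for illustration only

# Weighted double centering: Kc = (I - 1 w') K (I - w 1')
C <- diag(n) - matrix(1, n, 1) %*% t(w)
Kc <- C %*% K %*% t(C)

# For a linear kernel this equals centering X by its weighted column means
Xc <- sweep(X, 2, colSums(w * X))
```

With uniform weights `1 / n`, the same formula reduces to the usual double centering of the Gram matrix.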

DKPLSR

The true kernel algorithms above are computationally expensive when `n > 500`, especially KPLSR, because of the iterative deflation of the `n x n` training Gram matrix `K`. A much faster alternative to KPLSR is to run a "direct kernel PLSR" (DKPLSR) (Bennett & Embrechts 2003), i.e. to first build the kernel Gram matrices (as a pre-processing step on `X`), and then to run a usual PLSR algorithm on them. This is what function `dkplsr` does. See also the examples in function `kgram`.
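The DKPLSR principle can be sketched in base R: the Gram matrices replace `X`, and an ordinary linear PLSR is fitted on them. This is an illustration under stated assumptions, not the `dkplsr` code: `rbf_kernel`, `pls1_fit` and `pls1_predict` are hypothetical helpers written for the example, and the PLSR is a minimal single-response NIPALS.

```r
# Gaussian (RBF) kernel; `rbf_kernel` and `gamma` are illustrative names
rbf_kernel <- function(A, B, gamma = 1) {
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  exp(-gamma * pmax(d2, 0))
}

# Minimal single-response linear PLSR (NIPALS), hypothetical helper
pls1_fit <- function(X, y, ncomp) {
  xm <- colMeans(X); ym <- mean(y)
  Xc <- sweep(X, 2, xm); yc <- y - ym
  W <- P <- matrix(0, ncol(X), ncomp)
  q <- numeric(ncomp)
  for (a in seq_len(ncomp)) {
    w <- drop(crossprod(Xc, yc)); w <- w / sqrt(sum(w^2))  # weights
    sc <- drop(Xc %*% w)                                   # scores
    p <- drop(crossprod(Xc, sc)) / sum(sc^2)               # X-loadings
    q[a] <- sum(yc * sc) / sum(sc^2)                       # y-loading
    Xc <- Xc - tcrossprod(sc, p)                           # deflate X
    W[, a] <- w; P[, a] <- p
  }
  b <- W %*% solve(crossprod(P, W), q)   # regression coefficients
  list(b = drop(b), xm = xm, ym = ym)
}
pls1_predict <- function(fm, Xnew) {
  fm$ym + drop(sweep(Xnew, 2, fm$xm) %*% fm$b)
}

# Noisy sinc data, as in Rosipal & Trejo (2001)
set.seed(1)
x <- seq(-10, 10, by = 0.2)
x[x == 0] <- 1e-5
y <- sin(abs(x)) / abs(x) + rnorm(length(x), 0, 0.2)
Xr <- Xu <- matrix(x, ncol = 1)

# Step 1: build the Gram matrices (pre-processing of X)
Kr <- rbf_kernel(Xr, Xr)   # n x n, training vs training
Ku <- rbf_kernel(Xu, Xr)   # m x n, new vs training

# Step 2: usual PLSR on the Gram matrices
fm <- pls1_fit(Kr, y, ncomp = 3)
fit <- pls1_predict(fm, Ku)
```

The speed gain comes from step 2: the deflation operates inside a standard PLSR routine on a fixed predictor matrix, rather than through the kernel NIPALS iterations.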

See also the tuning facility with `splitpar`.

### Usage

``````
kpcr(Xr, Yr, Xu, Yu = NULL, ncomp,
kern = kpol, weights = NULL, print = TRUE, ...)

kplsr(Xr, Yr, Xu, Yu = NULL, ncomp,
kern = kpol, weights = NULL, print = TRUE, ...)

dkplsr(Xr, Yr, Xu, Yu = NULL, ncomp,
kern = kpol, weights = NULL, print = TRUE, ...)

``````

### Arguments

| Argument | Description |
|---|---|
| `Xr` | An `n x p` matrix or data frame of reference (= training) observations. |
| `Yr` | An `n x q` matrix or data frame, or a vector of length `n`, of reference (= training) responses. |
| `Xu` | An `m x p` matrix or data frame of new (= test) observations to predict. |
| `Yu` | An `m x q` matrix or data frame, or a vector of length `m`, of the true responses for `Xu`. Defaults to `NULL`. |
| `ncomp` | The number of scores (= components = latent variables) to consider. |
| `kern` | A function defining the kernel to use (defaults to `kpol`). See `kpol` for the syntax and for other available kernel functions. |
| `weights` | A vector of length `n` defining a priori weights to apply to the observations. Internally, weights are normalized to sum to 1. Defaults to `NULL` (all weights are set to `1 / n`). |
| `print` | Logical (default = `TRUE`). If `TRUE`, fitting information is printed. |
| `...` | Optional arguments passed to the kernel function defined in `kern`. Each kernel parameter (e.g. `degree` for `kpol`) can be given as a scalar or as a vector of several values. |

### Value

A list of outputs (see examples), such as:

| Output | Description |
|---|---|
| `y` | Responses for the test data. |
| `fit` | Predictions for the test data. |
| `r` | Residuals for the test data. |
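As a rough illustration of how these outputs relate to each other, the sketch below computes residuals and an RMSEP per number of components from mock `y` and `fit` tables. The layout (one row per observation and per value of `ncomp`, with a response column `y1`) and the sign convention `y - fit` are assumptions made for the example, and the numbers are invented.

```r
# Mock outputs with an assumed layout: one row per (ncomp, observation)
y <- data.frame(ncomp = rep(1:2, each = 3),
                y1 = rep(c(1, 2, 3), 2))
fit <- data.frame(ncomp = rep(1:2, each = 3),
                  y1 = c(1.5, 2.5, 2.0, 1.1, 2.1, 2.9))

# Residuals, assumed here to be observed minus predicted
r <- y
r$y1 <- y$y1 - fit$y1

# Root mean squared error of prediction for each number of components
rmsep <- aggregate(y1 ~ ncomp, data = r,
                   FUN = function(e) sqrt(mean(e^2)))
names(rmsep)[2] <- "rmsep"

# Number of components with the lowest RMSEP
best <- rmsep[which.min(rmsep$rmsep), ]
```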

### References

Bennett, K.P., Embrechts, M.J., 2003. An optimization perspective on kernel partial least squares regression, in: Advances in Learning Theory: Methods, Models and Applications, NATO Science Series III: Computer & Systems Sciences. IOS Press Amsterdam, pp. 227-250.

Rosipal, R., Trejo, L.J., 2001. Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research 2, 97-123.

Scholkopf, B., Smola, A., Müller, K.-R., 1997. Kernel principal component analysis, in: Gerstner, W., Germond, A., Hasler, M., Nicoud, J.-D. (Eds.), Artificial Neural Networks – ICANN 97, Lecture Notes in Computer Science. Springer, Berlin, Heidelberg, pp. 583-588. https://doi.org/10.1007/BFb0020217

Scholkopf, B., Smola, A.J., 2002. Learning with kernels: support vector machines, regularization, optimization, and beyond, Adaptive computation and machine learning. MIT Press, Cambridge, Mass.

Tipping, M.E., 2001. Sparse kernel principal component analysis. Advances in neural information processing systems, MIT Press. http://papers_nips.cc/paper/1791-sparse-kernel-principal-component-analysis.pdf

### Examples

``````
n <- 10
p <- 6
set.seed(1)
X <- matrix(rnorm(n * p, mean = 10), ncol = p)
y1 <- 100 * rnorm(n)
y2 <- 100 * rnorm(n)
Y <- cbind(y1, y2)
set.seed(NULL)

Xr <- X[1:8, ] ; Yr <- Y[1:8, ]
Xu <- X[9:10, ] ; Yu <- Y[9:10, ]

ncomp <- 3
fm <- kpcr(Xr, Yr, Xu, Yu, ncomp = ncomp, degree = 3)
#fm <- kplsr(Xr, Yr, Xu, Yu, ncomp = ncomp, degree = 3)   ## KPLSR
names(fm)
z <- mse(fm, ~ ncomp)
z[z$rmsep == min(z$rmsep), ]
plotmse(z)

## fictive weights
kplsr(Xr, Yr, Xu, Yu, ncomp = ncomp, weights = 1:nrow(Xr))
#dkplsr(Xr, Yr, Xu, Yu, ncomp = ncomp, weights = 1:nrow(Xr))   ## DKPLSR

####### Example of fitting the function sinc(x) (Rosipal & Trejo 2001 p. 105-106)

x <- seq(-10, 10, by = .2)
x[x == 0] <- 1e-5
n <- length(x)
zy <- sin(abs(x)) / abs(x)
y <- zy + rnorm(n, 0, .2)
plot(x, y, type = "p")
lines(x, zy, lty = 2)
Xu <- Xr <- matrix(x, ncol = 1)

ncomp <- 3
fm <- kplsr(Xr, y, Xu, ncomp = ncomp, kern = krbf)
#fm <- dkplsr(Xr, y, Xu, ncomp = ncomp, kern = krbf)   ## DKPLSR
fit <- fm$fit$y1[fm$fit$ncomp == ncomp]
plot(Xr, y, type = "p")
lines(Xr, zy, lty = 2)
lines(Xu, fit, col = "red")

``````

mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.