Cross-validation for penalized PLS based on Spline Transformations

Description

Computes the cross-validation-optimal nonlinear regression model for penalized PLS based on B-spline transformations.

Usage

ppls.splines.cv(X, y, lambda,
                ncomp, degree, order,
                nknot, k, kernel, scale,
                reduce.knots, select)

Arguments

X

matrix of input data

y

vector of response data

lambda

vector of candidate values of the penalty parameter lambda. Default value is 1.

ncomp

Number of PLS components. Default value is min(nrow(X)-1, ncol(Z)), where Z denotes the transformed data obtained from the function X2s.

degree

Degree of the splines. Default value is 3

order

Order of the differences to be computed for the penalty term. Default value is 2.

nknot

number of knots. Default value is 20 for all variables.

k

the number of splits in k-fold cross-validation. Default value is k=5.

kernel

Logical value. If kernel=TRUE, the kernelized version of penalized PLS is computed. Default value is kernel=FALSE.

scale

Logical value. If scale=TRUE, the X variables are standardized to have unit variance. Default value is FALSE.

reduce.knots

Logical value. If TRUE, the function ensures that the transformed data does not contain a constant column. Default value is FALSE.

select

Logical value. If select=TRUE, the function fits only one variable per iteration.
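
To make the role of the argument k concrete, the following is a minimal sketch of a generic k-fold split in base R. It only illustrates the idea of k-fold cross-validation; the variable names (fold.id, folds, test.idx, train.idx) are hypothetical, and the code is not taken from the package and may differ from how ppls.splines.cv forms its splits internally.

```r
set.seed(1)
n <- 100   # number of observations, i.e. nrow(X)
k <- 5     # number of folds

# Randomly assign each observation to one of k folds of (nearly) equal size
fold.id <- sample(rep(seq_len(k), length.out = n))
folds <- split(seq_len(n), fold.id)

# For fold i, folds[[i]] is the test set and the remaining rows form the training set
test.idx <- folds[[1]]
train.idx <- setdiff(seq_len(n), test.idx)
```

Each observation is used for testing exactly once, and the cross-validated error is the average of the k test errors.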

Details

This function computes the cross-validation-optimal nonlinear regression model with Penalized Partial Least Squares. In a nutshell, the algorithm works as follows. Starting with a generalized additive model for the columns of X, each additive component is expanded in terms of a generous number of B-spline basis functions. The basis functions are determined by degree and by nknot, the number of knots. To prevent overfitting, the additive model is estimated via penalized PLS, where the penalty term penalizes differences of the order given by the argument order. Consult Kraemer, Boulesteix, and Tutz (2008) for details.
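
The two ingredients described above, a B-spline expansion of one variable and a difference penalty, can be sketched with base R alone. This is an illustration under stated assumptions, not the package's implementation: it uses splines::bs rather than the package function X2s, places the interior knots equidistantly, and builds the penalty as the cross-product of a difference matrix, which is the standard P-spline construction. The names knots, Z, D and P are hypothetical.

```r
library(splines)  # part of the standard R distribution

set.seed(1)
x <- rnorm(100)   # one column of X
degree <- 3       # degree of the splines
nknot <- 20       # number of interior knots
order <- 2        # order of the differences in the penalty

# Equidistant interior knots over the range of x (an assumption of this sketch)
knots <- seq(min(x), max(x), length.out = nknot + 2)[-c(1, nknot + 2)]

# B-spline design matrix: one column per basis function
Z <- bs(x, knots = knots, degree = degree, intercept = TRUE)

# Difference matrix of the given order and the resulting penalty matrix
D <- diff(diag(ncol(Z)), differences = order)
P <- crossprod(D)  # t(D) %*% D
```

With order = 2, coefficient vectors that are constant or linear in the basis index incur no penalty, so a large lambda pushes the fit for each variable towards a smooth, nearly linear shape.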

A graphical tool for penalized PLS on splines-transformed data is provided by graphic.ppls.splines.

Value

error.cv

matrix of cross-validated errors. The rows correspond to the values of lambda, the columns correspond to the number of components.

lambda.opt

Optimal value of lambda

ncomp.opt

Optimal number of penalized PLS components

min.ppls

Cross-validated error for the optimal penalized PLS solution
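
The relation between error.cv and the optimal values can be sketched as follows. The error matrix here is a random stand-in, not real output, and the sketch assumes that column j corresponds to j components; the names idx and min.err are hypothetical.

```r
set.seed(1)
lambda <- c(0, 1, 10, 100, 1000)
ncomp <- 10

# Stand-in for error.cv: rows index lambda, columns index the number of components
error.cv <- matrix(runif(length(lambda) * ncomp), nrow = length(lambda))

# Position of the smallest cross-validated error
idx <- which(error.cv == min(error.cv), arr.ind = TRUE)[1, ]

lambda.opt <- lambda[idx[["row"]]]   # optimal penalty parameter
ncomp.opt <- idx[["col"]]            # optimal number of components
min.err <- error.cv[idx[["row"]], idx[["col"]]]
```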

Author(s)

Nicole Kraemer

References

N. Kraemer, A.-L. Boulesteix, and G. Tutz (2008). Penalized Partial Least Squares with Applications to B-Spline Transformations and Functional Data. Chemometrics and Intelligent Laboratory Systems, 94, 60-69. http://dx.doi.org/10.1016/j.chemolab.2008.06.009

See Also

penalized.pls, penalized.pls.cv, graphic.ppls.splines

Examples

# this example does not make much sense, it only illustrates
# how to use the functions properly

X <- matrix(rnorm(100 * 5), ncol = 5)
y <- sin(X[, 1]) + X[, 2]^2 + rnorm(100)
lambda <- c(0, 1, 10, 100, 1000)
cv.result <- ppls.splines.cv(X, y, ncomp = 10, k = 10, lambda = lambda)