The plsdof package provides Degrees of Freedom estimates for Partial Least Squares (PLS) Regression. Model selection for PLS is based on various information criteria (aic, bic, gmdl) or on cross-validation. Estimates for the mean and covariance of the PLS regression coefficients are available. They allow the construction of approximate confidence intervals and the application of test procedures. Further, cross-validation procedures for Ridge Regression and Principal Components Regression are available.
The plsdof package was fully coded and developed by Nicole Kraemer and Mikio L. Braun. It is mainly based on the article by N. Kraemer, M. Sugiyama (2011): "The Degrees of Freedom of Partial Least Squares Regression", Journal of the American Statistical Association, 106(494):697-705, doi:10.1198/jasa.2011.tm10107.
However, following repeated changes in CRAN policies, the package was archived from CRAN and orphaned after its former maintainer stopped updating it. Because the plsdof package is required by several packages of Frédéric Bertrand, he was selected as the new maintainer in late 2018.
This website and these examples were created by F. Bertrand.
You can install the released version of plsdof from CRAN with:
install.packages("plsdof")
You can install the development version of plsdof from GitHub with:
devtools::install_github("fbertran/plsdof")
The pls.model function computes the Partial Least Squares fit.
n<-50 # number of observations
p<-15 # number of variables
X<-matrix(rnorm(n*p),ncol=p)
y<-rnorm(n)

ntest<-200 # number of test observations
Xtest<-matrix(rnorm(ntest*p),ncol=p) # test data
ytest<-rnorm(ntest) # test data

library(plsdof)

# compute PLS + degrees of freedom + prediction on Xtest
first.object<-pls.model(X,y,compute.DoF=TRUE,Xtest=Xtest,ytest=NULL)

# compute PLS + test error
second.object<-pls.model(X,y,m=10,Xtest=Xtest,ytest=ytest)
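The fits can then be inspected directly. The following is a minimal sketch, assuming the element names DoF and mse documented for pls.model (DoF is only returned when compute.DoF=TRUE, mse only when ytest is supplied).

first.object$DoF # estimated Degrees of Freedom for 0,...,15 components (element name assumed)
second.object$mse # test error for 0,...,10 components (element name assumed)
which.min(second.object$mse)-1 # number of components with the smallest test error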
The pls.ic function computes the optimal model parameters using one of three different model selection criteria (aic, bic, gmdl) and based on two different Degrees of Freedom estimates for PLS.
n<-50 # number of observations
p<-5 # number of variables
X<-matrix(rnorm(n*p),ncol=p)
y<-rnorm(n)

# compute linear PLS
pls.object<-pls.ic(X,y,m=ncol(X))
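The selected model can then be examined. This sketch assumes the element names m.opt and coefficients documented for the object returned by pls.ic.

pls.object$m.opt # number of components selected by the information criterion (element name assumed)
pls.object$coefficients # regression coefficients of the selected model (element name assumed)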
Create the response vector and the matrix of predictors from the Boston housing data (MASS package).
library(MASS)
data(Boston)
X<-as.matrix(Boston[,-14])
y<-as.vector(Boston[,14])
Compute PLS coefficients for the first 5 components.
my.pls1<-pls.model(X,y,m=5,compute.DoF=TRUE)
my.pls1
Plot the Degrees of Freedom and add the naive estimate (number of components + 1).
plot(0:5,my.pls1$DoF,pch="*",cex=3,xlab="components",ylab="DoF",ylim=c(0,14))
lines(0:5,1:6,lwd=3)
Model selection with the Bayesian Information Criterion (BIC).
my.pls2<-pls.ic(X,y,criterion="bic")
my.pls2
Model selection based on cross-validation.
my.pls3<-pls.cv(X,y,compute.covariance=TRUE)
my.pls3
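The cross-validation object stores the cross-validated error and the selected number of components. A short sketch, assuming the element names cv.error and m.opt documented for pls.cv:

my.pls3$cv.error # cross-validated error for 0,...,13 components (element name assumed)
my.pls3$m.opt # number of components with the smallest cross-validated error (element name assumed)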
The vcov method returns the estimated covariance matrix of the regression coefficients.
my.vcov<-vcov(my.pls3)
my.vcov
Compute the standard deviations of the regression coefficients from the diagonal of the covariance matrix.
my.sd<-sqrt(diag(my.vcov))
my.sd
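As noted above, the mean and covariance estimates allow the construction of approximate confidence intervals. A minimal sketch, under a normal approximation and assuming that my.pls3$coefficients contains the regression coefficients of the selected model:

# approximate 95% confidence intervals: coefficient +/- 1.96 * standard deviation
# (my.pls3$coefficients is an assumed element name)
lower<-my.pls3$coefficients-qnorm(0.975)*my.sd
upper<-my.pls3$coefficients+qnorm(0.975)*my.sd
cbind(lower,upper)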