Description

Partial least squares regression with backward selection of predictors.

Usage

autopls (formula, data, testset, tselect, prep, val, scaling, stingy,
         verbose, backselect, jt.thresh, vip.thresh, jump, lower, method)

Arguments
formula      model formula

data         optional data frame with the data to fit the model

testset      optional vector defining a test set (row indices)

tselect      character string specifying the role of the test set in model
             selection ("none", "passive" or "active", see Details)

prep         character. Optional preprocessing (only one choice implemented:
             "bn", see Details)

val          character. Validation used ("LOO" or "CV")

scaling      logical. If TRUE, the predictors are scaled

stingy       logical. If TRUE, the number of latent vectors considered in
             model selection is restricted depending on the number of
             observations (see Details)

verbose      logical. If TRUE, progress is reported during model selection

backselect   one or more character strings defining the methods used in
             backward selection (see Details)

jt.thresh    threshold used in predictor selections that are based on
             jackknife testing (methods based on A1, see Details)

vip.thresh   threshold used in predictor selections that are based on VIP
             (methods based on A2, see Details). VIP is scaled to a maximum
             of 1.

jump         numeric. If a number is given, backward selection starts with a
             forced reduction of predictors to the given number (see A0 in
             Details). This reduction is based on significance in
             jackknifing. The argument can be useful in the case of large
             predictor matrices.

lower        numeric. Backward selection proceeds as long as R2 in
             validation reaches the given value (experimental; backward
             selection continues further if models improve in other respects
             such as decreasing numbers of latent vectors).

method       character string indicating what plsr method to use.
Details

The autopls function is a wrapper for plsr in the package pls written by
Bjørn-Helge Mevik, Ron Wehrens and Kristian Hovde Liland. For now, the
wrapper can be cited as Schmidtlein et al. (2012). autopls works only for
single target variables.

If val = "CV", 10-fold cross-validation is performed. If val = "LOO",
leave-one-out cross-validation is performed. Test set validation always
takes place if a test set has been defined.
tselect specifies how the test set is used in model selection. "none": just
use it for external validation; "passive": use the error in external
validation for model selection but do not use it for the determination of
the number of latent vectors; "active": use the error in external validation
for model selection and for the determination of the number of latent
vectors. With stingy = TRUE the errors that are used in the selection are
measured at a number of latent vectors that depends on the number of
observations (1/10 at maximum). Otherwise, the number of latent vectors is
chosen where errors approach a first minimum. In order to avoid minor local
minima the error values are first smoothed.
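For illustration, the test set options described above could be combined as
follows. This is a sketch using the murnau data from the examples; the index
vector idx is hypothetical and not part of the package.

```r
## Not run:
## hypothetical test set: every fourth row used as test observations
idx <- seq (4, 40, by = 4)

## test set used for external validation only
m1 <- autopls (murnau.Y ~ murnau.X, testset = idx, tselect = "none")

## test set error also drives model selection and the choice of latent vectors
m2 <- autopls (murnau.Y ~ murnau.X, testset = idx, tselect = "active")
## End(Not run)
```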
Large data matrices: examine the argument jump (forced reduction of
predictors in the first iteration). Large model objects can be shrunk using
the function slim, but some functionality (such as plotting or changing the
number of latent vectors) is lost. Shrunken models can still be used for
predictions.
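A sketch of this workflow, with the value 100 for jump chosen purely for
illustration:

```r
## Not run:
## start backward selection with a forced reduction to 100 predictors
m <- autopls (murnau.Y ~ murnau.X, jump = 100)

## shrink the model object; plotting and changing the number of latent
## vectors are lost, but predictions still work
m.small <- slim (m)
p <- predict (m.small)
## End(Not run)
```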
Preprocessing options: the only option currently implemented is "bn", a
brightness normalization according to Feilhauer et al. (2010).
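Assuming spectral predictors such as murnau.X, brightness normalization
would be requested as:

```r
## Not run: m <- autopls (murnau.Y ~ murnau.X, prep = "bn")
```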
Several methods for predictor selection are available. In default mode
(backselect = "auto") the selection follows an optimization procedure using
methods A1 and A3. However, apart from A0, any user-defined combination can
be selected using the backselect argument. Note that VIP-based methods (A2,
A3, B3 to B6) are meant to be used with the oscorespls method, and methods
B1 to B6 and C1 only make sense with sequences of spectral bands or similar
sequences of autocorrelated predictors. The methods are coded as follows:

A) Filtering based on thresholds: (A0 and A1) based on significance, A0 with
user-defined threshold (see argument jump); (A2) based on VIP; (A3) based on
combined significance and VIP; (A4) removal of the 10 % predictors with the
lowest significance; (A5) removal of the 25 % predictors with the lowest
significance.

B) Filtering followed by reduction of autocorrelation: (B1) filtering based
on significance, thinning starting with local maxima in weighted regression
coefficients; (B2) filtering based on significance, thinning starting with
local maxima in significance; (B3) filtering based on significance, thinning
starting with local maxima in VIP; (B4) filtering based on VIP, thinning
starting with local maxima in weighted regression coefficients; (B5)
filtering based on VIP, thinning starting with local maxima in significance;
(B6) filtering based on VIP, thinning starting with local maxima in VIP.

C) Just reduction of autocorrelation: (C1) reduction starting with local
maxima in regression coefficients.
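As a sketch, a user-defined combination might look like this; the threshold
value 0.1 is an arbitrary illustration, not a documented default:

```r
## Not run:
## default mode: optimization using A1 and A3
m1 <- autopls (murnau.Y ~ murnau.X, backselect = "auto")

## user-defined combination: jackknife-based filtering (A1) plus removal of
## the 25 % least significant predictors (A5)
m2 <- autopls (murnau.Y ~ murnau.X, backselect = c ("A1", "A5"),
               jt.thresh = 0.1)
## End(Not run)
```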
Value

An object of class autopls is returned. This equals a pls object with some
added items:

predictors   logical vector indicating the predictors that have or have not
             been used in the current model

metapls      outcomes of the backward selection process

iterations   models selected during the backward selection process
The $metapls item consists of the following:

current.iter    iteration of the backward selection procedure the current
                model is based upon

autopls.iter    iteration of the backward selection procedure originally
                selected by autopls

current.lv      number of latent vectors the current model is based upon

autopls.lv      number of latent vectors originally selected by autopls

lv.history      sequence of the numbers of latent vectors selected during
                iterations in backward selection

rmse.history    sequence of root mean squared errors obtained during
                iterations in backward selection. Errors are reported for
                calibration and validation. The validation errors are also
                reported for the number of latent vectors corresponding to

r2.history      sequence of R2 values obtained during iterations in backward
                selection

X               original predictors

Y               original target variable

X.testset       test set: predictors

Y.testset       test set: target variable

preprocessing   method used for preprocessing

scaling         logical: whether scaling was used

val             validation method used

call            the function call
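For instance, the components listed above could be inspected as follows, a
sketch based on the model fitted in the examples:

```r
## Not run:
model$predictors                ## which predictors remain in the current model
model$metapls$lv.history        ## latent vectors chosen in each iteration
model$metapls$rmse.history      ## RMSE per iteration
model$metapls$current.lv        ## latent vectors of the current model
## End(Not run)
```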
Author(s)

Sebastian Schmidtlein, with contributions from Carsten Oldenburg and Hannes
Feilhauer. The code for computing VIP is borrowed from Bjørn-Helge Mevik.

References

Feilhauer, H., Asner, G.P., Martin, R.E., Schmidtlein, S. (2010):
Brightness-normalized Partial Least Squares regression for hyperspectral
data. Journal of Quantitative Spectroscopy and Radiative Transfer 111:
1947–1957.

Schmidtlein, S., Feilhauer, H., Bruelheide, H. (2012): Mapping plant
strategy types using remote sensing. Journal of Vegetation Science 23:
395–405. Open Access.
See Also

pls, set.iter, set.lv, predict.autopls, plot.autopls
Examples

## load predictor and response data to the current environment
data (murnau.X)
data (murnau.Y)

## call autopls with the standard options
model <- autopls (murnau.Y ~ murnau.X)

## S3 plot method
## Not run: plot (model)
## Not run: plot (model, type = "rc")

## Loading and score plots
## Not run: plot (model$loadings, main = "Loadings")
## Not run: plot (model$loadings [,c(1,3)], main = "Loadings")
## Not run: plot (model$scores, main = "Scores")

Loading required package: pls
Attaching package: 'pls'
The following object is masked from 'package:stats':
loadings
autopls 1.3
1 Pred: 26 LV: 3 R2v: 0.74 RMSEv: 4.727
2 Pred: 23 LV: 3 R2v: 0.742 RMSEv: 4.705 Criterion: A1
3 Pred: 20 LV: 3 R2v: 0.749 RMSEv: 4.645 Criterion: A4
4 Pred: 18 LV: 3 R2v: 0.752 RMSEv: 4.611 Criterion: A4
5 Pred: 16 LV: 3 R2v: 0.752 RMSEv: 4.61 Criterion: A4
6 Pred: 13 LV: 3 R2v: 0.76 RMSEv: 4.537 Criterion: A1
7 Pred: 11 LV: 3 R2v: 0.768 RMSEv: 4.466 Criterion: A4
8 Pred: 9 LV: 3 R2v: 0.775 RMSEv: 4.397 Criterion: A4
Predictors: 9 Observations: 40 Latent vectors: 3 Run: 8
RMSE(CAL): 4.09 RMSE(LOO): 4.4
R2(CAL): 0.805 R2(LOO): 0.775