Classification by Ridge Iteratively Reweighted Least Squares followed by Adaptive Sparse PLS regression for binary response
Description
The function rirls.spls performs compression, variable selection and classification (with possible prediction)
using the RIRLS-SPLS algorithm of Durif et al. (2015).
Usage
rirls.spls(Xtrain, Ytrain, lambda.ridge, lambda.l1, ncomp,
           Xtest=NULL, adapt=TRUE, maxIter=100, svd.decompose=TRUE)

Arguments
Xtrain 
a (ntrain x p) data matrix of predictors. 
Ytrain 
a vector of length ntrain containing the binary responses. 
Xtest 
a (ntest x p) matrix containing the predictors for the test data set.

lambda.ridge 
a positive real value: the ridge hyperparameter. 
lambda.l1 
a positive real value, in [0,1]: the sparse hyperparameter. 
ncomp 
a positive integer: the number of components. 
adapt 
a boolean value indicating whether the sparse PLS selection step should be adaptive or not. 
maxIter 
a positive integer. 
svd.decompose 
a boolean parameter. 
Details
The columns of the data matrices Xtrain and Xtest need not be standardized, since the function rirls.spls standardizes them as a preliminary step before the algorithm is run.
The procedure described in Durif et al. (2015) is used to determine the latent components used for classification. When Xtest is not NULL, the procedure also predicts the labels of these new observations.
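As an illustration of what column standardization involves, the following base R sketch centers and scales a toy training matrix and applies the same statistics to a toy test matrix. The toy data, the use of training-set means and standard deviations for the test set, and the use of sd (with its n-1 denominator) are assumptions made for this sketch; the page does not state how rirls.spls performs the step internally.

```r
## Toy data: 6 training and 3 test observations on 4 predictors (hypothetical)
set.seed(1)
Xtrain <- matrix(rnorm(6 * 4, mean = 5, sd = 2), nrow = 6)
Xtest  <- matrix(rnorm(3 * 4, mean = 5, sd = 2), nrow = 3)

## Column means and standard deviations estimated on the training set only
mu    <- colMeans(Xtrain)
sigma <- apply(Xtrain, 2, sd)

## Center and scale both sets with the training statistics (assumed choice)
sXtrain <- scale(Xtrain, center = mu, scale = sigma)
sXtest  <- scale(Xtest,  center = mu, scale = sigma)

## After scaling, training columns have mean 0 and standard deviation 1
colMeans(sXtrain)
apply(sXtrain, 2, sd)
```

Because the function standardizes its inputs itself, a step like this is not needed before calling rirls.spls; the sketch only shows the kind of transformation involved.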
Value
A list with the following components:
Coefficients 
the (p+1) vector containing the coefficients of the design matrix and intercept in the logistic model explaining the response Y. 
hatY 
the (ntrain) vector containing the estimated response value on the train set of predictors Xtrain. 
hatYtest 
the (ntest) vector containing the predicted labels for the observations from

DeletedCol 
the vector containing the column number of 
A 
the active set of predictors selected by the procedure. 
converged 
a {0,1} value indicating whether the IRLS algorithm converged
in less than 
X.score 
a (n x ncomp) matrix of the observation coordinates (scores) in the
new component basis produced by the compression step (sparse PLS). Each column t.k of

X.weight 
a (p x ncomp) matrix of the predictor coefficients in each component
produced by sparse PLS. Each column w.k of 
Xtrain 
the design matrix. 
sXtrain 
the scaled design matrix. 
Ytrain 
the response observations. 
sPseudoVar 
the pseudo-response produced by the RIRLS algorithm, after scaling. 
lambda.ridge 
the ridge hyperparameter used to fit the model. 
lambda.l1 
the sparse hyperparameter used to fit the model. 
ncomp 
the number of components used to fit the model. 
V 
the (ntrain x ntrain) matrix used to weight the metric in the sparse PLS step.

proba.test 
the (ntest) vector of estimated probabilities for the observations in

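To make the link between a logistic coefficient vector, estimated probabilities and predicted labels concrete, here is a minimal base R sketch. All values are hypothetical. Placing the intercept first in the coefficient vector, working directly on unscaled predictors, and using a 0.5 cutoff are assumptions made for illustration; this page does not specify how rirls.spls orders Coefficients or derives labels from probabilities.

```r
## Hypothetical coefficient vector for p = 3 predictors; putting the
## intercept first is an assumption made for this illustration
Coefficients <- c(-0.5, 1.2, 0, -0.8)

## Two hypothetical new observations (rows) on the 3 predictors
Xnew <- rbind(c( 1.0, 0.3, -0.5),
              c(-1.0, 2.0,  1.5))

## Linear predictor, then logistic link to obtain P(Y = 1)
eta   <- as.vector(cbind(1, Xnew) %*% Coefficients)
proba <- plogis(eta)

## Assign label 1 when the probability exceeds 0.5 (assumed cutoff)
labels <- as.numeric(proba > 0.5)
labels
```

The zero in the hypothetical coefficient vector mimics a predictor left out of the active set: it contributes nothing to the linear predictor.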
Author(s)
Ghislain Durif (http://lbbe.univlyon1.fr/DurifGhislain.html).
Adapted in part from rpls code by S. Lambert-Lacroix (function available in this package).
References
G. Durif, F. Picard, S. Lambert-Lacroix (2015). Adaptive sparse PLS for logistic regression (in preparation), available at http://arxiv.org/abs/1502.05933.
See Also
rirls.spls.tune.
Examples
### load plsgenomics library
library(plsgenomics)
### generating data
n <- 50
p <- 100
sample1 <- sample.bin(n=n, p=p, kstar=20, lstar=2, beta.min=0.25, beta.max=0.75,
                      mean.H=0.2, sigma.H=10, sigma.F=5)
X <- sample1$X
Y <- sample1$Y
### splitting between learning and testing set
index.train <- sort(sample(1:n, size=round(0.7*n)))
index.test <- (1:n)[-index.train]
Xtrain <- X[index.train,]
Ytrain <- Y[index.train,]
Xtest <- X[index.test,]
Ytest <- Y[index.test,]
### fitting the model, and predicting new observations
model1 <- rirls.spls(Xtrain=Xtrain, Ytrain=Ytrain, lambda.ridge=2, lambda.l1=0.5, ncomp=2,
                     Xtest=Xtest, adapt=TRUE, maxIter=100, svd.decompose=TRUE)
str(model1)
### prediction error rate
sum(model1$hatYtest!=Ytest) / length(index.test)

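The error rate computed at the end of the example can be complemented by a confusion table, which shows which class the errors fall in. The sketch below uses base R only, with hypothetical label vectors standing in for Ytest and model1$hatYtest so that it runs without the package; the values are invented for illustration.

```r
## Hypothetical true and predicted labels for eight test observations
Ytest    <- c(0, 1, 1, 0, 1, 0, 0, 1)
hatYtest <- c(0, 1, 0, 0, 1, 1, 0, 1)

## Cross-tabulate truth against prediction; fixing the factor levels
## keeps the table 2 x 2 even if one class is never predicted
conf.mat <- table(truth     = factor(Ytest,    levels = 0:1),
                  predicted = factor(hatYtest, levels = 0:1))
conf.mat

## Error rate: share of off-diagonal entries
err <- 1 - sum(diag(conf.mat)) / sum(conf.mat)
err
```

With these toy vectors, two of the eight predictions are wrong (one in each class), so the error rate is 0.25.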