cv.plsR | R Documentation |
This function implements k-fold cross-validation on complete or incomplete datasets for partial least squares regression models
cv.plsR(object, ...)
## Default S3 method:
cv.plsRmodel(object,dataX,nt=2,limQ2set=.0975,modele="pls",
K=5, NK=1, grouplist=NULL, random=TRUE, scaleX=TRUE,
scaleY=NULL, keepcoeffs=FALSE, keepfolds=FALSE, keepdataY=TRUE,
keepMclassed=FALSE, tol_Xi=10^(-12), weights, verbose=TRUE,...)
## S3 method for class 'formula'
cv.plsRmodel(object,data=NULL,nt=2,limQ2set=.0975,modele="pls",
K=5, NK=1, grouplist=NULL, random=TRUE, scaleX=TRUE,
scaleY=NULL, keepcoeffs=FALSE, keepfolds=FALSE, keepdataY=TRUE,
keepMclassed=FALSE, tol_Xi=10^(-12), weights,subset,contrasts=NULL, verbose=TRUE,...)
PLS_lm_kfoldcv(dataY, dataX, nt = 2, limQ2set = 0.0975, modele = "pls",
K = 5, NK = 1, grouplist = NULL, random = TRUE, scaleX = TRUE,
scaleY = NULL, keepcoeffs = FALSE, keepfolds = FALSE, keepdataY = TRUE,
keepMclassed=FALSE, tol_Xi = 10^(-12), weights, verbose=TRUE)
PLS_lm_kfoldcv_formula(formula,data=NULL,nt=2,limQ2set=.0975,modele="pls",
K=5, NK=1, grouplist=NULL, random=TRUE, scaleX=TRUE,
scaleY=NULL, keepcoeffs=FALSE, keepfolds=FALSE, keepdataY=TRUE,
keepMclassed=FALSE, tol_Xi=10^(-12), weights,subset,contrasts=NULL,verbose=TRUE)
object |
response (training) dataset or an object of class " |
dataY |
response (training) dataset |
dataX |
predictor(s) (training) dataset |
formula |
an object of class " |
data |
an optional data frame, list or environment (or object coercible by |
nt |
number of components to be extracted |
limQ2set |
limit value for the Q2 |
modele |
name of the PLS model to be fitted, only ( |
K |
number of groups. Defaults to 5. |
NK |
number of times the group division is made |
grouplist |
to specify the members of the |
random |
should the |
scaleX |
scale the predictor(s) : must be set to TRUE for |
scaleY |
scale the response : Yes/No. Ignored since non always possible for glm responses. |
keepcoeffs |
shall the coefficients for each model be returned |
keepfolds |
shall the groups' composition be returned |
keepdataY |
shall the observed value of the response for each one of the predicted value be returned |
keepMclassed |
shall the number of miss classed be returned |
tol_Xi |
minimal value for Norm2(Xi) and |
weights |
an optional vector of 'prior weights' to be used in the fitting process. Should be |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
contrasts |
an optional list. See the |
verbose |
should info messages be displayed ? |
... |
arguments to pass to |
Predicts 1 group with the K-1
other groups. Leave one out cross validation is thus obtained for K==nrow(dataX)
.
A typical predictor has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with any duplicates removed.
A specification of the form first:second indicates the the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.
The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on: to avoid this pass a terms object as the formula.
Non-NULL weights can be used to indicate that different observations have different dispersions (with the values in weights being inversely proportional to the dispersions); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations.
An object of class "cv.plsRmodel"
.
results_kfolds |
list of
|
folds |
list of
|
dataY_kfolds |
list of
|
call |
the call of the function |
Work for complete and incomplete datasets.
Frederic Bertrand
frederic.bertrand@utt.fr
https://fbertran.github.io/homepage/
Nicolas Meyer, Myriam Maumy-Bertrand et Frederic Bertrand (2010). Comparing the linear and the logistic PLS regression with qualitative predictors: application to allelotyping data. Journal de la Societe Francaise de Statistique, 151(2), pages 1-18. http://publications-sfds.math.cnrs.fr/index.php/J-SFdS/article/view/47
Summary method summary.cv.plsRmodel
. kfolds2coeff
, kfolds2Pressind
, kfolds2Press
, kfolds2Mclassedind
, kfolds2Mclassed
and kfolds2CVinfos_lm
to extract and transform results from k-fold cross-validation.
data(Cornell)
XCornell<-Cornell[,1:7]
yCornell<-Cornell[,8]
#Leave one out CV (K=nrow(Cornell)) one time (NK=1)
bbb <- cv.plsR(object=yCornell,dataX=XCornell,nt=6,K=nrow(Cornell),NK=1)
bbb2 <- cv.plsR(Y~.,data=Cornell,nt=6,K=12,NK=1,verbose=FALSE)
(sum1<-summary(bbb2))
#6-fold CV (K=6) two times (NK=2)
#use random=TRUE to randomly create folds for repeated CV
bbb3 <- cv.plsR(object=yCornell,dataX=XCornell,nt=6,K=6,NK=2)
bbb4 <- cv.plsR(Y~.,data=Cornell,nt=6,K=6,NK=2,verbose=FALSE)
(sum3<-summary(bbb4))
cvtable(sum1)
cvtable(sum3)
rm(list=c("XCornell","yCornell","bbb","bbb2","bbb3","bbb4"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.