Cross validation method for PRM regression models.

Description

k-fold cross validation for the selection of the number of components for partial robust M regression.
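As a rough illustration of the k-fold idea, the sketch below randomly assigns observations to folds using base R only. It is not taken from the prmsCV source; the names n, folds, test_idx and train_idx are hypothetical, and the actual splitting scheme used by the function may differ.

```r
# Sketch only: random assignment of n observations to nfold folds.
set.seed(1)
n <- 50       # hypothetical number of observations
nfold <- 10   # same meaning as the nfold argument

# Repeat the fold labels 1..nfold up to length n, then shuffle them.
folds <- sample(rep(seq_len(nfold), length.out = n))

# For fold 1: rows held out for prediction and rows used for fitting.
test_idx <- which(folds == 1)
train_idx <- which(folds != 1)
```

Each observation is held out exactly once, so every observation receives one out-of-sample prediction per candidate number of components.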

Usage

prmsCV(formula, data, as, nfold = 10, fun = "Hampel", probp1 = 0.95, hampelp2 = 0.975,
hampelp3 = 0.999, center = "median", scale = "qn", usesvd = FALSE, plot = TRUE, 
numit = 100, prec = 0.01, alpha = 0.15)

Arguments

formula

an object of class formula.

data

a data frame or list which contains the variables given in formula.

as

a vector of positive integers giving the numbers of PRM components of the candidate models to be compared.

nfold

the number of folds used for cross validation; the default nfold=10 gives 10-fold CV.

fun

an internal weighting function for case weights. Choices are "Hampel" (preferred), "Huber" or "Fair".

probp1

the 1-alpha value at which to set the first outlier cutoff for the weighting function.

hampelp2

the 1-alpha value for the second cutoff. Only applies to fun="Hampel".

hampelp3

the 1-alpha value for the third cutoff. Only applies to fun="Hampel".

center

type of centering of the data, given as a string that matches an R function, e.g. "mean" or "median".

scale

type of scaling of the data, given as a string that matches an R function, e.g. "sd" or "qn", or alternatively "no" for no scaling.

usesvd

logical, default is FALSE. If TRUE singular value decomposition is performed.

plot

logical, default is TRUE. If TRUE a plot is generated with a measure of the prediction accuracy for each model (see Details).

numit

the maximum number of iterations for the convergence of the coefficient estimates.

prec

a value for the precision of estimation of the coefficients.

alpha

value used for alpha trimmed mean squared error, which is the cross validation criterion (see Details).
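To make the roles of probp1, hampelp2 and hampelp3 more concrete, here is a minimal sketch of a Hampel-type weight function in base R. It assumes that the three cutoffs are standard normal quantiles applied to standardized residuals; that assumption and the helper name hampel_weights are illustrative and not confirmed by this page.

```r
# Sketch only: Hampel-type case weights with three cutoffs.
# Assumption: cutoffs are standard normal quantiles at the given probabilities.
hampel_weights <- function(r, probp1 = 0.95, hampelp2 = 0.975, hampelp3 = 0.999) {
  c1 <- qnorm(probp1)    # first cutoff: full weight below this
  c2 <- qnorm(hampelp2)  # second cutoff
  c3 <- qnorm(hampelp3)  # third cutoff: zero weight above this
  ar <- abs(r)
  w <- numeric(length(r))            # initialised to 0, kept for ar > c3
  w[ar <= c1] <- 1
  mid <- ar > c1 & ar <= c2
  w[mid] <- c1 / ar[mid]             # Huber-like downweighting
  far <- ar > c2 & ar <= c3
  w[far] <- c1 * (c3 - ar[far]) / ((c3 - c2) * ar[far])  # linear descent to 0
  w
}

# Weights for a few hypothetical standardized residuals.
w <- hampel_weights(c(0, 1, 1.8, 2.5, 4))
```

Small residuals keep weight 1, moderate ones are downweighted, and residuals beyond the third cutoff are removed from the fit entirely.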

Details

The alpha-trimmed mean squared error of the predicted response over all observations is used as a robust decision criterion to choose the optimal model. For plot=TRUE, a graphic visualizes the alpha-trimmed mean squared error for each model.
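One plausible reading of this criterion is sketched below in base R: discard the largest alpha fraction of the squared prediction errors and average the rest. The helper trimmed_mse and the rounding rule for the number of discarded values are assumptions made for illustration; the package may trim differently.

```r
# Sketch only: alpha-trimmed mean of squared prediction errors.
# Assumption: the floor(alpha * n) largest squared errors are discarded.
trimmed_mse <- function(sq_err, alpha = 0.15) {
  n <- length(sq_err)
  keep <- n - floor(alpha * n)        # number of errors retained
  mean(sort(sq_err)[seq_len(keep)])   # average of the smallest 'keep' errors
}

# Hypothetical squared errors with one gross outlier.
sq_err <- c(1, 2, 3, 4, 100)
plain <- mean(sq_err)                     # dominated by the outlier
robust <- trimmed_mse(sq_err, alpha = 0.2)  # the outlier is trimmed away
```

Because the largest errors are dropped, a few badly predicted outliers do not dominate the comparison between models.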

Value

opt.mod

object of class prm (see prms).

spe

matrix with squared prediction error for each observation and each number of components.
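The spe matrix can be used to recompute a selection criterion by hand. The sketch below assumes that rows correspond to observations and columns to the candidate numbers of components, and it uses a made-up matrix in place of a real result; n_comp, trim and best are hypothetical names.

```r
# Sketch only: pick the number of components from a spe-like matrix.
n_comp <- 2:4                              # candidate numbers of components
base <- seq(0.1, 5, length.out = 50)       # made-up squared errors
spe <- cbind(3 * base, 1 * base, 2 * base) # one column per candidate model
spe[46:50, ] <- spe[46:50, ] + 400         # five grossly mispredicted observations
colnames(spe) <- n_comp

# Trimmed mean: drop the floor(alpha * n) largest values (an assumed rule).
trim <- function(x, alpha) {
  keep <- length(x) - floor(alpha * length(x))
  mean(sort(x)[seq_len(keep)])
}

crit <- apply(spe, 2, trim, alpha = 0.15)  # one criterion value per column
best <- n_comp[which.min(crit)]            # candidate with the smallest criterion
```

With a real fit, res$spe would take the place of the made-up matrix, and the column order would need to be checked against the as argument.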

Author(s)

Irene Hoffmann

References

Hoffmann, I., Serneels, S., Filzmoser, P., Croux, C. (2015). Sparse partial robust M regression. Chemometrics and Intelligent Laboratory Systems, 149, 50-59.

Serneels, S., Croux, C., Filzmoser, P., Van Espen, P.J. (2005). Partial Robust M-Regression. Chemometrics and Intelligent Laboratory Systems, 79, 55-64.

See Also

prms, plot.prm, predict.prm, sprmsCV

Examples

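library(sprm)

# Simulate 50 observations of 6 predictors sharing a two-group location shift
set.seed(5023)
U <- c(rep(2,20), rep(5,30))
X <- replicate(6, U+rnorm(50))
beta <- c(rep(1, 3), rep(-1,3))
# The last 5 error terms are centered at -20 to create outliers in the response
e <- c(rnorm(45,0,1.5),rnorm(5,-20,1))
y <- X%*%beta + e
d <- as.data.frame(X)
d$y <- y
# Cross-validate models with 2, 3 and 4 components and summarize the selected one
res <- prmsCV(y~., data=d, as=2:4, plot=TRUE, prec=0.05)
summary(res$opt.mod)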