sprms: Sparse partial robust M regression
In sprm: Sparse and Non-Sparse Partial Robust M Regression and Classification


Sparse partial robust M regression for models with a univariate response. This method for dimension reduction and regression analysis yields estimates that can be interpreted like those of partial least squares, and that are both sparse and robust to vertical outliers as well as leverage points. The sparsity is tuned with an L1 penalty.

sprms(formula, data, a, eta, fun = "Hampel", probp1 = 0.95, hampelp2 = 0.975,
hampelp3 = 0.999, center = "median", scale = "qn", print = FALSE, 
numit = 100, prec = 0.01)

`formula`	an object of class formula.
`data`	a data frame which contains the variables given in formula or a list of two elements, where the first element is the response vector and the second element is a matrix of the explanatory variables.
`a`	the number of SPRMS components to be estimated in the model.
`eta`	a tuning parameter for the sparsity with 0 <= eta < 1.
`fun`	an internal weighting function for case weights. Choices are `"Hampel"` (preferred), `"Huber"` or `"Fair"`.
`probp1`	the 1-alpha value at which to set the first outlier cutoff for the weighting function.
`hampelp2`	the 1-alpha value for the second cutoff. Only applies to `fun="Hampel"`.
`hampelp3`	the 1-alpha value for the third cutoff. Only applies to `fun="Hampel"`.
`center`	type of centering of the data in form of a string that matches an R function, e.g. `"mean"` or `"median"`.
`scale`	type of scaling for the data in form of a string that matches an R function, e.g. `"sd"` or `"qn"` or alternatively `"no"` for no scaling.
`print`	logical, default is `FALSE`. If `TRUE` the variables included in each component are reported.
`numit`	the maximum number of iterations for the convergence of the coefficient estimates.
`prec`	a value for the precision of estimation of the coefficients.
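
To make the role of `probp1`, `hampelp2` and `hampelp3` more concrete, here is a minimal base R sketch of a Hampel-type weight function with three cutoffs. The helper `hampel_weight` is hypothetical and not part of sprm, and taking the cutoffs as standard normal quantiles is an assumption made for illustration only; the package may derive its cutoffs differently.

```r
# Hypothetical helper, not part of sprm: Hampel-type case weights.
# q1 < q2 < q3 are cutoffs on the absolute standardized value.
hampel_weight <- function(x, q1, q2, q3) {
  ax <- abs(x)
  w <- numeric(length(ax))            # weights start at 0 (values beyond q3)
  w[ax <= q1] <- 1                    # regular observations keep full weight
  mid <- ax > q1 & ax <= q2
  w[mid] <- q1 / ax[mid]              # moderate outliers are downweighted
  far <- ax > q2 & ax <= q3
  w[far] <- q1 * (q3 - ax[far]) / ((q3 - q2) * ax[far])  # descends to 0 at q3
  w
}

# Assumption: cutoffs are standard normal quantiles at the default levels.
cutoffs <- qnorm(c(0.95, 0.975, 0.999))
z <- c(0, 1, 1.8, 2.5, 5)             # illustrative standardized residuals
w <- hampel_weight(z, cutoffs[1], cutoffs[2], cutoffs[3])
print(round(w, 3))
```

Under these assumptions, values inside the first cutoff keep weight 1, values beyond the third cutoff get weight 0, and the weights decrease in between.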

The NIPALS algorithm with an L1 sparsity constraint, combined with weighted regression, is used for the model estimation.
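
One common way to impose an L1 sparsity constraint on a direction vector is soft thresholding, with the threshold set to `eta` times the largest absolute entry. The sketch below illustrates that idea in base R. The helper `soft_threshold` is hypothetical, and whether sprm applies exactly this form internally is an assumption, not something stated on this page.

```r
# Hypothetical helper, not part of sprm: soft-threshold a direction vector.
# eta in [0, 1) sets the threshold relative to the largest absolute entry.
soft_threshold <- function(w, eta) {
  stopifnot(eta >= 0, eta < 1)
  thr <- eta * max(abs(w))
  w_sparse <- sign(w) * pmax(abs(w) - thr, 0)  # entries at or below thr become 0
  w_sparse / sqrt(sum(w_sparse^2))             # rescale to unit length
}

w <- c(0.9, -0.6, 0.2, 0.05, -0.01)   # illustrative, non-sparse direction
w0 <- soft_threshold(w, eta = 0)      # no sparsity: the vector is only rescaled
w5 <- soft_threshold(w, eta = 0.5)    # threshold at 0.45
print(which(w5 != 0))                 # indices of the variables that are kept
```

Larger values of `eta` remove more variables from the direction vector, while `eta = 0` leaves every variable in the model.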

a is the number of components in the model. Note that it is not possible to simply reduce the number of weighting vectors to obtain a model with a smaller number of components. Each model has to be estimated separately due to its dependence on robust case weights.

sprms returns an object of class sprm.

Functions summary, predict and plot are available. Also the generic functions coefficients, fitted.values and residuals can be used to extract the corresponding elements from the sprm object.

`coefficients`	vector of coefficients of the weighted regression model.
`intercept`	intercept of weighted regression model.
`wy`	the case weights in the y space.
`wt`	the case weights in the score space.
`w`	the overall case weights used for weighted regression (depending on the weight function). `w=wy*wt`.
`scores`	the matrix of scores.
`R`	direction vectors (or weighting vectors or rotation matrix) to obtain the scores. `scores=Xs%*%R`.
`loadings`	the matrix of loadings.
`fitted.values`	the vector of estimated response values.
`residuals`	vector of residuals, true response minus estimated response.
`coefficients.scaled`	vector of coefficients of the weighted regression model with scaled data.
`intercept.scaled`	intercept of weighted regression model with scaled data.
`YMeans`	value used internally to center response.
`XMean`	vector used internally to center data.
`Xscales`	vector used internally to scale data.
`Yscales`	value used internally to scale response.
`Yvar`	percentage of the variance of the response explained by each component.
`Xvar`	percentage of the variance of the explanatory variables explained by each component.
`inputs`	list of inputs: parameters, data and scaled data.
`used.vars`	Indices of variables included in the model.
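
The relation between the scaled and unscaled coefficients can be sketched with the standard back-transformation for centered and scaled data. The example below uses base R only; `mad()` stands in for the Qn scale and an ordinary `lm()` fit stands in for the weighted regression. It is an illustration under those assumptions, not the package's internal code, and all variable names are hypothetical.

```r
set.seed(1)
X <- matrix(rnorm(40 * 3), 40, 3)
y <- as.vector(X %*% c(2, -1, 0.5) + 3 + rnorm(40, sd = 0.1))

# Robust centering and scaling; mad() is a stand-in for the Qn scale
x_center <- apply(X, 2, median)
x_scale <- apply(X, 2, mad)
y_center <- median(y)
y_scale <- mad(y)
Xs <- scale(X, center = x_center, scale = x_scale)
ys <- (y - y_center) / y_scale

# Stand-in for the regression on scaled data (unit case weights)
cf <- coef(lm(ys ~ Xs))
b0_scaled <- cf[1]
b_scaled <- cf[-1]

# Back-transform coefficients and intercept to the original scale
b <- b_scaled * y_scale / x_scale
b0 <- y_center + b0_scaled * y_scale - sum(x_center * b)

# Both parameterizations give the same fitted values
fit_scaled <- as.vector(b0_scaled + Xs %*% b_scaled) * y_scale + y_center
fit_orig <- as.vector(X %*% b) + b0
print(max(abs(fit_scaled - fit_orig)))
```

Read this way, `coefficients` and `intercept` would apply to data on the original scale, and `coefficients.scaled` and `intercept.scaled` to the centered and scaled data.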

Sven Serneels, BASF Corp and Irene Hoffmann

Hoffmann, I., Serneels, S., Filzmoser, P., Croux, C. (2015). Sparse partial robust M regression. Chemometrics and Intelligent Laboratory Systems, 149, 50-59.

Serneels, S., Croux, C., Filzmoser, P., Van Espen, P.J. (2005). Partial Robust M-Regression. Chemometrics and Intelligent Laboratory Systems, 79, 55-64.

sprmsCV, plot.sprm, biplot.sprm, predict.sprm, prms

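library(sprm)

# Simulate 50 observations: 5 informative and 20 uninformative predictors
set.seed(50235)
U1 <- c(rep(3, 20), rep(4, 30))
U2 <- rep(3.5, 50)
X1 <- replicate(5, U1 + rnorm(50))
X2 <- replicate(20, U2 + rnorm(50))
X <- cbind(X1, X2)
beta <- c(rep(1, 5), rep(0, 20))
# The last 5 errors are shifted by -20, which creates vertical outliers
e <- c(rnorm(45, 0, 1.5), rnorm(5, -20, 1))
y <- as.vector(X %*% beta + e)
d <- as.data.frame(X)
d$y <- y

# Fit a one-component sparse model and extract the fitted values
mod <- sprms(y ~ ., data = d, a = 1, eta = 0.5, fun = "Hampel")
sprmfit <- predict(mod)

# Fitted against observed response, with the identity line for reference
plot(y, sprmfit, main = "SPRM")
abline(0, 1)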