rr: Linear Ridge Regression

View source: R/rr.R


Linear Ridge Regression

Description

Function rr fits linear ridge regression models (Hoerl & Kennard 1970; Hastie & Tibshirani 2004; Hastie et al. 2009; Cule & De Iorio 2012) by eigen decomposition. When n < p, the "kernel cross-product trick" (as in pca_eigenk; Wu et al. 1997) is used: the n x n matrix XX' is eigen-decomposed instead of the p x p matrix X'X.

Data are internally centered (column-wise) before the analyses, but they are not scaled (the function has no scale argument). If scaling is needed, the user must scale the data before calling the function.

Row observations can optionally be weighted with a priori weights (argument weights).
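For illustration, here is a minimal single-response sketch of ridge regression by eigen decomposition with the kernel cross-product trick. The function ridge_eigen and its internals are hypothetical simplifications, not the package code (which also handles multivariate responses, observation weights, and vectors of lambda values).

ridge_eigen <- function(Xr, yr, Xu, lambda = 0) {
  Xr <- as.matrix(Xr) ; Xu <- as.matrix(Xu)
  xmeans <- colMeans(Xr) ; ymean <- mean(yr)
  X <- scale(Xr, center = xmeans, scale = FALSE)   ## column-wise centering
  y <- yr - ymean
  n <- nrow(X) ; p <- ncol(X)
  if (n < p) {
    ## Kernel cross-product trick (Wu et al. 1997): eigen-decompose
    ## the n x n matrix XX' instead of the p x p matrix X'X
    eig <- eigen(tcrossprod(X), symmetric = TRUE)
    keep <- eig$values > sqrt(.Machine$double.eps)
    d2 <- eig$values[keep]                         ## squared singular values of X
    ## Recover the right singular vectors: V = X'U diag(1 / sqrt(d2))
    V <- sweep(crossprod(X, eig$vectors[, keep, drop = FALSE]), 2, sqrt(d2), "/")
  } else {
    eig <- eigen(crossprod(X), symmetric = TRUE)
    keep <- eig$values > sqrt(.Machine$double.eps)
    d2 <- eig$values[keep]
    V <- eig$vectors[, keep, drop = FALSE]
  }
  ## Ridge coefficients: b = V diag(1 / (d2 + lambda)) V' X'y
  b <- V %*% (crossprod(V, crossprod(X, y)) / (d2 + lambda))
  int <- ymean - drop(crossprod(xmeans, b))        ## intercept
  list(b = b, int = int, fit = drop(int + Xu %*% b))
}

With lambda = 0, this reduces to (minimum-norm) ordinary least squares.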

Usage


rr(Xr, Yr, Xu, Yu = NULL, lambda = 0, unit = 1, weights = NULL)

Arguments

Xr

An n x p matrix or data frame of reference (= training) observations.

Yr

An n x q matrix or data frame, or a vector of length n, of reference (= training) responses.

Xu

An m x p matrix or data frame of new (= test) observations to predict.

Yu

An m x q matrix or data frame, or a vector of length m, of the true responses for Xu. Defaults to NULL.

lambda

A value, or vector of values, of the regularization parameter lambda.

unit

A scalar, the unit used for lambda (defaults to unit = 1). For instance, lambda = 12 with unit = 1e-6 means lambda = 12e-6.

weights

A vector of length n defining a priori weights to apply to the observations. Internally, the weights are normalized to sum to 1. Defaults to NULL (all weights set to 1 / n).

Value

A list of outputs (see examples), such as:

y

Responses for the test data.

fit

Predictions for the test data.

r

Residuals for the test data.

b

The b-coefficients (including the intercept).

tr

The trace of the hat matrix (estimated df).
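For a single lambda, this trace follows the classical ridge formula tr(H(lambda)) = sum_j d_j^2 / (d_j^2 + lambda), where the d_j are the singular values of the centered Xr (Hastie et al. 2009). A small illustrative sketch on toy data (not the package code):

set.seed(1)
X <- scale(matrix(rnorm(8 * 5), nrow = 8), scale = FALSE)  ## centered toy matrix
d2 <- svd(X)$d^2               ## squared singular values
lambda <- .1
sum(d2 / (d2 + lambda))        ## tr(H): tends to rank(X) as lambda -> 0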

References

Cule, E., De Iorio, M., 2012. A semi-automatic method to guide the choice of ridge parameter in ridge regression. arXiv:1205.0686.

Hastie, T., Tibshirani, R., 2004. Efficient quadratic regularization for expression arrays. Biostatistics 5, 329-340. https://doi.org/10.1093/biostatistics/kxh010

Hastie, T., Tibshirani, R., Friedman, J., 2009. The elements of statistical learning: data mining, inference, and prediction, 2nd ed. Springer, New York.

Hoerl, A.E., Kennard, R.W., 1970. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 12, 55-67. https://doi.org/10.1080/00401706.1970.10488634

Wu, W., Massart, D.L., de Jong, S., 1997. The kernel PCA algorithms for wide data. Part I: Theory and algorithms. Chemometrics and Intelligent Laboratory Systems 36, 165-172. https://doi.org/10.1016/S0169-7439(97)00010-5

Examples


n <- 10
p <- 6
set.seed(1)
X <- matrix(rnorm(n * p, mean = 10), ncol = p, byrow = TRUE)
y1 <- 100 * rnorm(n)
y2 <- 100 * rnorm(n)
Y <- cbind(y1, y2)
set.seed(NULL)

Xr <- X[1:8, ] ; Yr <- Y[1:8, ] 
Xu <- X[9:10, ] ; Yu <- Y[9:10, ] 

fm <- rr(Xr, Yr, Xu, Yu, lambda = c(.1, .2))
## Same as:
## fm <- rr(Xr, Yr, Xu, Yu, lambda = c(1, 2), unit = .1)

fm$y
fm$fit
fm$r
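## Other documented outputs (see section Value)
fm$b     ## b-coefficients (including the intercept)
fm$tr    ## trace of the hat matrix (estimated df)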

mse(fm, ~ lambda)
mse(fm, ~ lambda, nam = "y2")

