ridge — R Documentation
The function ridge fits linear models by ridge regression, returning an object of class ridge designed to be used with the plotting methods in this package. It is also designed to facilitate an alternative representation of the effects of shrinkage in the space of uncorrelated (PCA/SVD) components of the predictors.
The standard formulation of ridge regression regularizes the coefficient estimates by adding a small positive constant \lambda to the diagonal elements of \mathbf{X}^\top\mathbf{X} in the least squares solution, to achieve a more favorable tradeoff between bias and variance (inverse of precision) of the coefficients:

\widehat{\boldsymbol{\beta}}^{\text{RR}}_k = (\mathbf{X}^\top \mathbf{X} + \lambda_k \mathbf{I})^{-1} \mathbf{X}^\top \mathbf{y}

giving one solution for each shrinkage constant \lambda_k.
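As a minimal sketch of this closed form (an illustration only, not the package's implementation, which also handles scaling and the unpenalized intercept), the ridge solution can be computed directly:

```r
# Sketch: closed-form ridge coefficients on the built-in longley data.
# Predictors and response are centered so the intercept drops out of the
# penalized fit; this illustrates the formula, not the genridge internals.
X <- scale(data.matrix(longley[, c(2:6, 1)]), center = TRUE, scale = FALSE)
y <- longley[, "Employed"] - mean(longley[, "Employed"])
lambda <- 0.01
p <- ncol(X)
beta_rr  <- solve(crossprod(X) + lambda * diag(p), crossprod(X, y))
beta_ols <- solve(crossprod(X), crossprod(X, y))  # lambda = 0 recovers OLS
```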
Ridge regression shrinkage can be parameterized in several ways. If a vector of lambda values is supplied, these are used directly in the ridge regression computations. Otherwise, a vector df of equivalent effective degrees of freedom can be supplied, going down from the number of predictors in the model. In either case, both lambda and df are returned in the ridge object, but the rownames of the coefficients are given in terms of lambda.
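The correspondence between lambda and effective degrees of freedom can be sketched via the SVD of the centered predictor matrix, using the standard relation df(lambda) = sum_i d_i^2 / (d_i^2 + lambda) (the names below are illustrative, not the package's internals):

```r
# df(lambda) = sum_i d_i^2 / (d_i^2 + lambda), where d_i are the singular
# values of the centered X; lambda = 0 gives df = p (no shrinkage).
X <- scale(data.matrix(longley[, c(2:6, 1)]), center = TRUE, scale = FALSE)
d <- svd(X)$d
df_fun <- function(lambda) sum(d^2 / (d^2 + lambda))
df_fun(0)  # equals p = 6, the number of predictors
# Invert numerically to find the lambda equivalent to, say, df = 5:
lam <- uniroot(function(l) df_fun(l) - 5, interval = c(0, 1e12),
               tol = 1e-10)$root
```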
coef extracts the estimated coefficients for each value of the shrinkage factor.

vcov extracts the estimated p \times p covariance matrices of the coefficients for each value of the shrinkage factor.

best extracts the optimal shrinkage values according to several criteria: HKB: Hoerl et al. (1975); LW: Lawless & Wang (1976); GCV: Golub et al. (1979).
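For instance, the HKB criterion has the simple closed form k_HKB = p * s^2 / ||beta_OLS||^2, computed from the ordinary least squares fit. A hedged sketch (genridge works with scaled predictors, so its reported value may differ):

```r
# HKB shrinkage estimate (Hoerl, Kennard & Baldwin, 1975):
#   k_HKB = p * s^2 / sum(beta_ols^2), from the unpenalized OLS fit.
fit  <- lm(Employed ~ GNP + Unemployed + Armed.Forces + Population +
             Year + GNP.deflator, data = longley)
b    <- coef(fit)[-1]                       # drop the unpenalized intercept
s2   <- sum(residuals(fit)^2) / fit$df.residual
kHKB <- length(b) * s2 / sum(b^2)
```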
ridge(y, ...)
## S3 method for class 'formula'
ridge(formula, data, lambda = 0, df, svd = TRUE, contrasts = NULL, ...)
## Default S3 method:
ridge(y, X, lambda = 0, df, svd = TRUE, ...)
## S3 method for class 'ridge'
coef(object, ...)
## S3 method for class 'ridge'
print(x, digits = max(5, getOption("digits") - 5), ...)
## S3 method for class 'ridge'
vcov(object, ...)
best(object, ...)
## S3 method for class 'ridge'
best(object, ...)
y | A numeric vector containing the response variable. NAs not allowed.
... | Other arguments, passed down to methods.
formula | For the formula method, a model formula specifying the response and predictors.
data | For the formula method, a data frame in which to evaluate the formula.
lambda | A scalar or vector of ridge constants. A value of 0 corresponds to ordinary least squares.
df | A scalar or vector of effective degrees of freedom corresponding to lambda.
svd | If TRUE (the default), the SVD of the centered predictor matrix is computed and included in the result.
contrasts | a list of contrasts to be used for some or all of the factor terms in the formula. See the contrasts.arg argument of model.matrix.default.
X | A matrix of predictor variables. NAs not allowed. Should not include a column of 1s for the intercept.
x, object | An object of class ridge.
digits | For the print method, the number of digits to print.
If an intercept is present in the model, its coefficient is not penalized. (If you want to penalize an intercept, put in your own constant term and remove the intercept.)
The predictors are centered, but not (yet) scaled in this implementation.
A number of the methods in the package assume that lambda is a vector of shrinkage constants increasing from lambda[1] = 0, or equivalently, a vector of df decreasing from p.
A list with the following components:
lambda | The vector of ridge constants
df | The vector of effective degrees of freedom corresponding to lambda
coef | The matrix of estimated ridge regression coefficients
scales | scalings used on the X matrix
kHKB | HKB estimate of the ridge constant
kLW | L-W estimate of the ridge constant
GCV | vector of GCV values
kGCV | value of lambda with the minimum GCV
criteria | collects the optimal shrinkage values by the above criteria

If svd==TRUE (the default), the following are also included:

svd.D | Singular values of the SVD of the centered predictor matrix X
svd.U | Left singular vectors of the SVD of the centered predictor matrix X
svd.V | Right singular vectors of the SVD of the centered predictor matrix X
For best, a data.frame with one row for each of the HKB, LW, and GCV criteria.
Michael Friendly
Hoerl, A. E., Kennard, R. W., and Baldwin, K. F. (1975), "Ridge Regression: Some Simulations," Communications in Statistics, 4, 105-123.
Lawless, J.F., and Wang, P. (1976), "A Simulation Study of Ridge and Other Regression Estimators," Communications in Statistics, 5, 307-323.
Golub, G. H., Heath, M., and Wahba, G. (1979), "Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter," Technometrics, 21, 215-223. doi:10.2307/1268518
lm.ridge for other implementations of ridge regression.

traceplot, plot.ridge, pairs.ridge, plot3d.ridge for 1D, 2D, and 3D plotting methods.

pca.ridge, biplot.ridge, biplot.pcaridge for views in PCA/SVD space.

precision.ridge for measures of shrinkage and precision.
# Longley data, using number Employed as response
longley.y <- longley[, "Employed"]
longley.X <- data.matrix(longley[, c(2:6,1)])
lambda <- c(0, 0.005, 0.01, 0.02, 0.04, 0.08)
lridge <- ridge(longley.y, longley.X, lambda=lambda)
# same, using formula interface
lridge <- ridge(Employed ~ GNP + Unemployed + Armed.Forces + Population + Year + GNP.deflator,
data=longley, lambda=lambda)
coef(lridge)
# standard trace plot
traceplot(lridge)
# plot vs. equivalent df
traceplot(lridge, X="df")
pairs(lridge, radius=0.5)
data(prostate)
py <- prostate[, "lpsa"]
pX <- data.matrix(prostate[, 1:8])
pridge <- ridge(py, pX, df=8:1)
pridge
plot(pridge)
pairs(pridge)
traceplot(pridge)
traceplot(pridge, X="df")
# Hospital manpower data from Table 3.8 of Myers (1990)
data(Manpower)
str(Manpower)
mmod <- lm(Hours ~ ., data=Manpower)
vif(mmod)
# ridge regression models, specified in terms of equivalent df
mridge <- ridge(Hours ~ ., data=Manpower, df=seq(5, 3.75, -.25))
vif(mridge)
# univariate ridge trace plots
traceplot(mridge)
traceplot(mridge, X="df")
# bivariate ridge trace plots
plot(mridge, radius=0.25, labels=mridge$df)
pairs(mridge, radius=0.25)
# 3D views
# ellipsoids for Load, Xray & BedDays are nearly 2D
plot3d(mridge, radius=0.2, labels=mridge$df)
# variables in model selected by AIC & BIC
plot3d(mridge, variables=c(2,3,5), radius=0.2, labels=mridge$df)
# plots in PCA/SVD space
mpridge <- pca(mridge)
traceplot(mpridge, X="df")
biplot(mpridge, radius=0.25)