Description
Computes the entire Elastic-Net solution path for the regression coefficients, simultaneously for all values of the penalization parameter, using as inputs a variance matrix among predictors and a covariance vector between the response and predictors, via the Coordinate Descent (CD) algorithm (Friedman, 2007).
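For instance, the inputs can be built from a predictor matrix X and a response vector y as var(X) and cov(y, X). A minimal sketch with simulated (hypothetical) data, relying on the argument defaults documented below:

library(SFSI)
set.seed(123)
X <- scale(matrix(rnorm(100*10), ncol=10))  # 100 observations, 10 standardized predictors
y <- drop(X[,1] - X[,2] + rnorm(100))       # simulated response
fm <- solveEN(var(X), cov(y, X))            # whole Elastic-Net path from (co)variances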
Usage

solveEN(P, cov, lambda = NULL, nLambda = 100, alpha = 1,
        scale = TRUE, tol = 1E-5, maxIter = 1000,
        lowerTri = FALSE, verbose = FALSE)
Arguments

P
Variance-covariance matrix among predictors. It can be a lower triangular matrix, provided lowerTri=TRUE.
cov
Covariance vector between the response variable and the predictors.
lambda
Penalization parameter sequence vector. Default is lambda=NULL, in which case a decreasing grid of nLambda lambdas is generated automatically; when alpha=0 the grid is generated starting from a maximum equal to 5.
nLambda
Number of lambdas generated when lambda=NULL.
alpha
Numeric value between 0 and 1 indicating the weights given to the L1- and L2-penalties.
scale
If TRUE, the matrix P is scaled so that predictors have unit variance, and cov is scaled accordingly using the predictors' standard deviations taken from the diagonal of P.
tol
Maximum error between two consecutive solutions of the iterative algorithm to declare convergence.
maxIter
Maximum number of iterations to run at each lambda step before convergence is reached.
lowerTri
If TRUE, the matrix P is taken to be lower triangular, with only its lower triangular part being used.
verbose
If TRUE, progress of the algorithm is printed at each lambda step.
Details

Finds solutions for the regression coefficients in the linear model

yi = xi'β + ei

where yi is the response for the ith observation, xi = (xi1,...,xip)' is a vector of p predictors assumed to have unit variance, β = (β1,...,βp)' is a vector of regression coefficients, and ei is a residual.
The regression coefficients β are estimated as a function of the variance matrix among predictors (P) and the covariance vector between the response and predictors (cov) by minimizing the penalized mean squared error function

−cov'β + (1/2)β'Pβ + λJ(β)

where λ is the penalization parameter and J(β) is a penalty function given by

J(β) = α||β||₁ + (1/2)(1−α)||β||₂²

where 0 ≤ α ≤ 1, and ||β||₁ = ∑|βj| and ||β||₂² = ∑βj² are the L1 and (squared) L2-norms, respectively.
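As a concrete reading of this objective, it can be evaluated directly; the sketch below is a hypothetical helper (not part of the package), assuming P is the variance matrix, cv the covariance vector, and lambda and alpha are given:

penObj <- function(b, P, cv, lambda, alpha) {
  # Penalty J(beta) = alpha*||b||_1 + (1/2)*(1-alpha)*||b||_2^2
  J <- alpha*sum(abs(b)) + 0.5*(1 - alpha)*sum(b^2)
  # Penalized mean squared error, up to a constant that does not depend on b
  -sum(cv*b) + 0.5*drop(crossprod(b, P %*% b)) + lambda*J
}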
The "partial residual" excluding the contribution of the predictor xij is
then the ordinary least-squares (OLS) coefficient of xij on this residual is (up-to a constant)
where covj is the jth element of cov and Pj is the jth column of the matrix P.
Coefficients are updated for each j=1,...,p from their current value βj to a new value βj(α,λ), given α and λ, by "soft-thresholding" their OLS estimate until convergence as fully described in Friedman (2007).
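The sketch below assumes unit-variance predictors (diagonal of P equal to 1) and a single (α, λ) pair; it illustrates the update rule only and is not the package's own implementation:

# Soft-thresholding operator S(z, t) = sign(z)*max(|z| - t, 0)
soft <- function(z, t) sign(z)*pmax(abs(z) - t, 0)

cdEN <- function(P, cv, alpha, lambda, tol = 1e-5, maxIter = 1000) {
  b <- rep(0, length(cv))                  # start at beta = 0
  for(iter in 1:maxIter) {
    b0 <- b
    for(j in seq_along(b)) {
      bOLS <- cv[j] - sum(P[,j]*b) + b[j]  # OLS coefficient on the partial residual
      b[j] <- soft(bOLS, alpha*lambda) / (1 + lambda*(1 - alpha))
    }
    if(max(abs(b - b0)) < tol) break       # declare convergence
  }
  b
}
# e.g., b <- cdEN(var(X), drop(cov(y, X)), alpha = 0.5, lambda = 0.1)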
Value

List object containing the elements:

beta: vector of regression coefficients.
lambda: sequence of values of lambda used.
df: degrees of freedom, i.e., the number of non-zero predictors at each solution.
sdx: vector of standard deviations of the predictors.

The returned object is of class 'LASSO', for which a fitted method exists. Function plotPath can also be used to visualize the coefficients' path.
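Continuing the simulated sketch from the Description, these elements can be inspected directly (names as listed above):

fm <- solveEN(var(X), cov(y, X), alpha = 0.5)
head(fm$lambda)  # penalization grid used
fm$df            # number of non-zero coefficients at each solution
fm$sdx           # standard deviations of the predictors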
Author(s)

Marco Lopez-Cruz (lopezcru@msu.edu) and Gustavo de los Campos
References

Friedman J, Hastie T, Höfling H, Tibshirani R (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1(2), 302–332.
Hoerl AE, Kennard RW (1970). Ridge Regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67.
Tibshirani R (1996). Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society B, 58(1), 267–288.
Zou H, Hastie T (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society B, 67(2), 301–320.
Examples

require(SFSI)
data(wheatHTP)
y = scale(Y[,"YLD"]) # Response variable
X = scale(WL) # Reflectance data
# Training and testing sets
tst = sample(seq_along(y),ceiling(0.3*length(y)))
trn = seq_along(y)[-tst]
# Calculate covariances in training set
XtX = var(X[trn,])
Xty = cov(y[trn],X[trn,])
# Run an Elastic-Net regression
fm = solveEN(XtX,Xty,alpha=0.5)
# Predicted values
yHat1 = fitted(fm, X=X[trn,]) # training data
yHat2 = fitted(fm, X=X[tst,]) # testing data
# Penalization vs correlation
plot(-log(fm$lambda),cor(y[trn],yHat1)[1,], main="training")
plot(-log(fm$lambda),cor(y[tst],yHat2)[1,], main="testing")
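The plotPath function mentioned in the Value section can be used to visualize the coefficients' path; assuming it accepts the fitted object directly:

# Coefficients' path across the lambda grid
plotPath(fm)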