Description
Computes the entire Elastic-Net solution path for the regression coefficients, simultaneously for all values of the penalization parameter, using as inputs a variance matrix among predictors and a covariance vector between the response and predictors, via the Coordinate Descent (CD) algorithm (Friedman, 2007).
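For instance, the inputs can be built from a predictor matrix X and a response vector y as var(X) and cov(y, X). A minimal sketch with simulated (hypothetical) data, relying on the argument defaults documented below:

library(SFSI)
set.seed(123)
X <- scale(matrix(rnorm(100*10), ncol=10))  # 100 observations, 10 standardized predictors
y <- drop(X[,1] - X[,2] + rnorm(100))       # simulated response
fm <- solveEN(var(X), cov(y, X))            # whole Elastic-Net path from (co)variances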
Usage

solveEN(P, cov, lambda = NULL, nLambda = 100, alpha = 1,
        scale = TRUE, tol = 1E-5, maxIter = 1000,
        lowerTri = FALSE, verbose = FALSE)
Arguments

P
Variance-covariance matrix among predictors. It can be a lower triangular matrix, provided lowerTri=TRUE.
cov
Covariance vector between the response variable and the predictors.
lambda
Penalization parameter sequence vector. Default is lambda=NULL, in which case a decreasing grid of nLambda lambdas is generated automatically; when alpha=0 the grid is generated starting from a maximum equal to 5.
nLambda
Number of lambdas generated when lambda=NULL.
alpha
Numeric value between 0 and 1 indicating the weights given to the L1- and L2-penalties.
scale
If TRUE, the matrix P is scaled so that predictors have unit variance, and cov is scaled accordingly using the predictors' standard deviations taken from the diagonal of P.
tol
Maximum error between two consecutive solutions of the iterative algorithm to declare convergence.
maxIter
Maximum number of iterations to run at each lambda step before convergence is reached.
lowerTri
If TRUE, the matrix P is taken to be lower triangular, with only its lower triangular part being used.
verbose
If TRUE, progress of the algorithm is printed at each lambda step.
Details

Finds solutions for the regression coefficients in the linear model

yi = xi'β + ei

where yi is the response for the ith observation, xi = (xi1,...,xip)' is a vector of p predictors assumed to have unit variance, β = (β1,...,βp)' is a vector of regression coefficients, and ei is a residual.
The regression coefficients β are estimated as a function of the variance matrix among predictors (P) and the covariance vector between the response and predictors (cov) by minimizing the penalized mean squared error function

−cov'β + (1/2)β'Pβ + λJ(β)

where λ is the penalization parameter and J(β) is a penalty function given by

J(β) = α||β||₁ + (1/2)(1−α)||β||₂²

where 0 ≤ α ≤ 1, and ||β||₁ = ∑|βj| and ||β||₂² = ∑βj² are the L1 and (squared) L2-norms, respectively.
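As a concrete reading of this objective, it can be evaluated directly; the sketch below is a hypothetical helper (not part of the package), assuming P is the variance matrix, cv the covariance vector, and lambda and alpha are given:

penObj <- function(b, P, cv, lambda, alpha) {
  # Penalty J(beta) = alpha*||b||_1 + (1/2)*(1-alpha)*||b||_2^2
  J <- alpha*sum(abs(b)) + 0.5*(1 - alpha)*sum(b^2)
  # Penalized mean squared error, up to a constant that does not depend on b
  -sum(cv*b) + 0.5*drop(crossprod(b, P %*% b)) + lambda*J
}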
The "partial residual" excluding the contribution of the predictor xij is
then the ordinary least-squares (OLS) coefficient of xij on this residual is (up-to a constant)
where covj is the jth element of cov and Pj is the jth column of the matrix P.
Coefficients are updated for each j=1,...,p from their current value βj to a new value βj(α,λ), given α and λ, by "soft-thresholding" their OLS estimate until convergence as fully described in Friedman (2007).
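The sketch below assumes unit-variance predictors (diagonal of P equal to 1) and a single (α, λ) pair; it illustrates the update rule only and is not the package's own implementation:

# Soft-thresholding operator S(z, t) = sign(z)*max(|z| - t, 0)
soft <- function(z, t) sign(z)*pmax(abs(z) - t, 0)

cdEN <- function(P, cv, alpha, lambda, tol = 1e-5, maxIter = 1000) {
  b <- rep(0, length(cv))                  # start at beta = 0
  for(iter in 1:maxIter) {
    b0 <- b
    for(j in seq_along(b)) {
      bOLS <- cv[j] - sum(P[,j]*b) + b[j]  # OLS coefficient on the partial residual
      b[j] <- soft(bOLS, alpha*lambda) / (1 + lambda*(1 - alpha))
    }
    if(max(abs(b - b0)) < tol) break       # declare convergence
  }
  b
}
# e.g., b <- cdEN(var(X), drop(cov(y, X)), alpha = 0.5, lambda = 0.1)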
Value

List object containing the elements:

beta: vector of regression coefficients.
lambda: sequence of values of lambda used.
df: degrees of freedom, i.e., the number of non-zero predictors at each solution.
sdx: vector of standard deviations of the predictors.

The returned object is of class 'LASSO', for which a fitted method exists. Function plotPath can also be used to visualize the coefficients' path.
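Continuing the simulated sketch from the Description, these elements can be inspected directly (names as listed above):

fm <- solveEN(var(X), cov(y, X), alpha = 0.5)
head(fm$lambda)  # penalization grid used
fm$df            # number of non-zero coefficients at each solution
fm$sdx           # standard deviations of the predictors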
Author(s)

Marco Lopez-Cruz (lopezcru@msu.edu) and Gustavo de los Campos
References

Friedman J, Hastie T, Höfling H, Tibshirani R (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1(2), 302–332.
Hoerl AE, Kennard RW (1970). Ridge Regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67.
Tibshirani R (1996). Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society B, 58(1), 267–288.
Zou H, Hastie T (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society B, 67(2), 301–320.
Examples

require(SFSI)
data(wheatHTP)
y = scale(Y[,"YLD"]) # Response variable
X = scale(WL) # Reflectance data
# Training and testing sets
tst = sample(seq_along(y),ceiling(0.3*length(y)))
trn = seq_along(y)[-tst]
# Calculate covariances in training set
XtX = var(X[trn,])
Xty = cov(y[trn],X[trn,])
# Run an Elastic-Net regression
fm = solveEN(XtX,Xty,alpha=0.5)
# Predicted values
yHat1 = fitted(fm, X=X[trn,]) # training data
yHat2 = fitted(fm, X=X[tst,]) # testing data
# Penalization vs correlation
plot(-log(fm$lambda),cor(y[trn],yHat1)[1,], main="training")
plot(-log(fm$lambda),cor(y[tst],yHat2)[1,], main="testing")
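The plotPath function mentioned in the Value section can be used to visualize the coefficients' path; assuming it accepts the fitted object directly:

# Coefficients' path across the lambda grid
plotPath(fm)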