knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Solving for the ridge regression coefficients depends on the value of the tuning parameter $\lambda$. Since we cannot directly assess how good of a model is fitted for a given value of $\lambda$, we can instead look at how the mean squared error (MSE) of models trained and tested on separate sets vary.
The data sets used will be ridge_train
for training the models and ridge_test
for testing them. Each table consists of one response variable and four covariates, with 200 rows each.
library(bis557) data(ridge_train) data(ridge_test)
I chose to fit a model for integer values of $\lambda$ from 15 to 100, inclusive. This was accomplished by running the ridge_reg()
function in parallel using the foreach
command.
lambda <- seq(15, 150, by = 1) n <- nrow(ridge_test) form <- y ~ . - 1 y <- ridge_test[,1] X <- ridge_test[,-1] library(foreach) library(doParallel) cl <- parallel::makeForkCluster(2) doParallel::registerDoParallel(cl) mse <- foreach (l = lambda, .combine = "c") %dopar% { beta_hat <- ridge_reg(form, l, ridge_train)$coef yhat <- as.matrix(X) %*% beta_hat err <- sum((y - yhat)^2) / n return(err) } parallel::stopCluster(cl)
The mean squared error of the fitted model varies with $\lambda$ as follows:
plot(lambda, mse, main = "Selecting Lambda via Cross-Validation", log = "x", xlab = "Value of lambda", ylab = "MSE")
In the plot above, we see that there exists some value $\lambda^*$ that appears to minimize the out-of-sample MSE of the fitted model. We can approximate this value with $\hat{\lambda}$:
lambda_hat <- lambda[which(mse == min(mse))] lambda_hat
If we zoom in, this trend holds and we can also obtain a more precise estimate for $\lambda^*$
lambda <- seq(28, 35, by = 0.05) cl <- parallel::makeForkCluster(2) doParallel::registerDoParallel(cl) mse <- foreach (l = lambda, .combine = "c") %dopar% { beta_hat <- ridge_reg(form, l, ridge_train)$coef yhat <- as.matrix(X) %*% beta_hat err <- sum((y - yhat)^2) / n return(err) } parallel::stopCluster(cl) plot(lambda, mse, main = "Selecting Lambda via Cross-Validation", log = "x", xlab = "Value of lambda", ylab = "MSE") lambda_hat <- lambda[which(mse == min(mse))] lambda_hat
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.