In this vignette, we show how to use bayeslm
to sample coefficients for a
Gaussian linear regression with a number of different prior distributions.
knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(bayeslm)
set.seed(200) p = 20 n = 100 kappa = 1.25 beta_true = c(c(1,2,3),rnorm(p-3,0,0.01)) sig_true = kappa*sqrt(sum(beta_true^2)) x = matrix(rnorm(p*n),n,p) y = x %*% beta_true + sig_true * rnorm(n) x = as.matrix(x) y = as.matrix(y) data = data.frame(x = x, y = y)
First, we run OLS and inspect the estimated coefficients
fitOLS = lm(y~x-1) coef(fitOLS)
bayeslm
The bayeslm sampler can group coefficients into blocks and sample several coefficients at once. Below, we simply place every coefficient into its own block.
block_vec = rep(1, p)
Now, we run bayeslm
on six different priors and store their estimated coefficients
# Horseshoe prior fit1 = bayeslm(y, x, prior = 'horseshoe', icept = FALSE, block_vec = block_vec, N = 10000, burnin=2000) beta_est1 = colMeans(fit1$beta) # Laplace prior fit2 = bayeslm(y, x, prior = 'laplace', icept = FALSE, block_vec = block_vec, N = 10000, burnin=2000) beta_est2 = colMeans(fit2$beta) # Ridge prior fit3 = bayeslm(y, x, prior = 'ridge', icept = FALSE, block_vec = block_vec, N = 10000, burnin=2000) beta_est3 = colMeans(fit3$beta) # "Sharkfin" prior fit4 = bayeslm(y, x, prior = 'sharkfin', icept = FALSE, block_vec = block_vec, N = 10000, burnin=2000) beta_est4 = colMeans(fit4$beta) # "Non-local" prior fit5 = bayeslm(y, x, prior = 'nonlocal', icept = FALSE, block_vec = block_vec, N = 10000, burnin=2000) beta_est5 = colMeans(fit5$beta) # Inverse laplace prior fit6 = bayeslm(y, x, prior = 'inverselaplace', lambda = 0.01, icept = FALSE, block_vec = block_vec, N = 10000, burnin=2000) beta_est6 = colMeans(fit6$beta)
And we plot the posterior distribution of the regression coefficients, along with the OLS estimates, against the true simulated coefficients.
plot(NULL,xlim=range(beta_true),ylim=range(beta_true), xlab = "beta true", ylab = "estimation", ) points(beta_true,beta_est1,pch=20) points(beta_true,fitOLS$coef,col='red') points(beta_true,beta_est2,pch=20,col='cyan') points(beta_true,beta_est3,pch=20,col='orange') points(beta_true,beta_est4,pch=20,col='pink') points(beta_true,beta_est5,pch=20,col='lightgreen') points(beta_true,beta_est6,pch=20,col='grey') legend("topleft", c("OLS", "horseshoe", "laplace", "ridge", "sharkfin", "nonlocal", "inverselaplace"), col = c("red", "black", "cyan", "orange", "pink", "lightgreen", "grey"), pch = rep(1, 7)) abline(0,1,col='red')
We can also compare the root mean squared error (RMSE) for each prior
rmseOLS = sqrt(sum((fitOLS$coef-beta_true)^2)) rmse1 = sqrt(sum((beta_est1-beta_true)^2)) rmse2 = sqrt(sum((beta_est2-beta_true)^2)) rmse3 = sqrt(sum((beta_est3-beta_true)^2)) rmse4 = sqrt(sum((beta_est4-beta_true)^2)) rmse5 = sqrt(sum((beta_est5-beta_true)^2)) rmse6 = sqrt(sum((beta_est6-beta_true)^2)) print(cbind(ols = rmseOLS, hs = rmse1,laplace = rmse2, ridge = rmse3, sharkfin = rmse4,nonlocal = rmse5, inverselaplace = rmse6))
Here, we demonstrate:
y ~ x
) in the bayeslm
library# Put the first two coefficients in one elliptical sampling block block_vec2 = c(2, rep(1, p-2)) fitb = bayeslm(y ~ x - 1, data = data, prior = 'horseshoe', block_vec = block_vec2, N = 10000, burnin = 2000) summary(fitb)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.