bayeslm    R Documentation
Description

This package implements an efficient sampler for Gaussian Bayesian linear regression. It uses an elliptical slice sampler instead of a regular Gibbs sampler. The function has several built-in shrinkage priors, and the user can also provide their own prior function (written as an R function).
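A minimal call looks like the sketch below. The simulated data and settings are illustrative only; see Usage and Examples for the full interface.

library(bayeslm)

set.seed(1)
x <- matrix(rnorm(100 * 5), 100, 5)        # illustrative simulated design
y <- x %*% c(2, 1, 0, 0, 0) + rnorm(100)   # two real effects, three null

fit <- bayeslm(y ~ x, prior = "horseshoe", N = 10000, burnin = 2000)
summary(fit)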
Usage

## Default S3 method:
bayeslm(Y, X = FALSE, prior = "horseshoe", penalize = NULL,
  block_vec = NULL, sigma = NULL, s2 = 1, kap2 = 1, N = 20000L,
  burnin = 0L, thinning = 1L, vglobal = 1, sampling_vglobal = TRUE,
  verb = FALSE, icept = TRUE, standardize = TRUE, singular = FALSE,
  scale_sigma_prior = TRUE, prior_mean = NULL, prob_vec = NULL,
  cc = NULL, lambda = NULL, ...)

## S3 method for class 'formula'
bayeslm(formula, data = list(), Y = FALSE, X = FALSE,
  prior = "horseshoe", penalize = NULL, block_vec = NULL, sigma = NULL,
  s2 = 1, kap2 = 1, N = 20000L, burnin = 0L, thinning = 1L,
  vglobal = 1, sampling_vglobal = TRUE, verb = FALSE,
  standardize = TRUE, singular = FALSE, scale_sigma_prior = TRUE,
  prior_mean = NULL, prob_vec = NULL, cc = NULL, lambda = NULL, ...)
Arguments

formula
    an object of class "formula": a symbolic description of the model to be fitted.

data
    an optional data frame containing the variables in the model. By default the variables are taken from the environment from which bayeslm is called.

Y
    data.frame, matrix, or vector of the response variable.

X
    data.frame, matrix, or vector of regressors.

prior
    Character string indicating the shrinkage prior to use: "horseshoe" (the default), "laplace", "ridge", "sharkfin", or "nonlocal".

block_vec
    A vector giving the number of regressors in each elliptical sampling block. Its entries must sum to the number of regressors. The default puts every regressor in its own block, rep(1, p). (See the sketch after this list for how block_vec and penalize are used together.)

penalize
    A vector indicating whether to shrink each regressor; its length should equal the number of regressors. The default shrinks all regressors.

sigma
    Initial value of the residual standard error. The default value is half of the standard error of Y.

s2, kap2
    Parameters of the prior over sigma, an inverse gamma prior with rate s2 and shape kap2.

N
    Number of posterior samples (after burn-in).

burnin
    Number of burn-in samples. If burnin > 0, the function draws N + burnin samples and returns only the last N.

thinning
    Thinning interval; thinning = 1 means no thinning.

vglobal
    Initial value of the global shrinkage parameter. The default value is 1.

sampling_vglobal
    Bool; if TRUE, the global shrinkage parameter is sampled, otherwise it is fixed at its initial value.

verb
    Bool; if TRUE, print sampling progress to the screen.

icept
    Bool; if the inputs are a matrix X and Y, indicates whether to fit a model with an intercept. (With the formula interface, the intercept is controlled by the formula itself.)

standardize
    Bool; if TRUE, standardize X and Y before sampling.

singular
    Bool; set to TRUE when the design matrix is rank deficient, for example when p > n.

scale_sigma_prior
    Bool; if TRUE, the prior on the regression coefficients is scaled by the residual standard error sigma.

prior_mean
    Vector of prior means of the regression coefficients; defaults to zero.

prob_vec
    Vector of probabilities for the "sharkfin" prior only.

cc
    Only works when prior = "nonlocal"; a vector of prior parameters of the nonlocal prior, one per regressor.

lambda
    The shrinkage parameter, for the Laplace prior only.

...
    optional parameters to be passed to the low level function bayeslm.default.
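As a hedged sketch of how block_vec and penalize fit together, continuing the simulated x and y above (values are arbitrary; penalize is coded 1 = shrink, 0 = do not shrink, as described above):

p <- ncol(x)  # number of regressors

# every coefficient in its own block (slice-within-Gibbs), shrink all
fit_blocked <- bayeslm(y ~ x, prior = "horseshoe",
                       block_vec = rep(1, p),   # p blocks of size 1
                       penalize  = rep(1, p),   # shrink every regressor
                       N = 5000, burnin = 1000, thinning = 1)

# one joint elliptical block, leaving the first coefficient unshrunk
fit_joint <- bayeslm(y ~ x, prior = "horseshoe",
                     block_vec = c(p),               # a single block of size p
                     penalize  = c(0, rep(1, p - 1)),
                     N = 5000, burnin = 1000)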
Details

For details of the approach, see Hahn, He and Lopes (2017).
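For intuition, here is a hedged sketch of a single-block update in the spirit of that paper: the Gaussian likelihood, centered at the OLS estimate beta_hat with covariance sigma^2 * solve(t(X) %*% X), defines the ellipse, and the (arbitrary) prior density defines the slice. Function and variable names are illustrative, not the package's internals.

# One elliptical slice update for beta given sigma (illustrative only).
# beta_hat: OLS estimate; R: upper Cholesky factor of the likelihood
# covariance; log_prior: function returning the log prior density of beta.
ess_step <- function(beta, beta_hat, R, log_prior) {
  nu    <- drop(crossprod(R, rnorm(length(beta))))  # N(0, Sigma) draw
  log_u <- log_prior(beta) + log(runif(1))          # slice threshold
  theta <- runif(1, 0, 2 * pi)
  lo <- theta - 2 * pi
  hi <- theta
  repeat {
    prop <- beta_hat + (beta - beta_hat) * cos(theta) + nu * sin(theta)
    if (log_prior(prop) > log_u) return(prop)       # proposal is on the slice
    if (theta < 0) lo <- theta else hi <- theta     # shrink the angle bracket
    theta <- runif(1, lo, hi)                       # propose a new angle
  }
}

Because the current beta is recovered at theta = 0, the bracket-shrinking loop always terminates, which is why each draw takes a random (reported) number of loops.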
Value

loops
    A vector giving the number of elliptical slice sampler loops used for each posterior sample.

sigma
    A vector of posterior samples of the residual standard error.

vglobal
    A vector of posterior samples of the global shrinkage parameter.

beta
    A matrix of posterior samples of the regression coefficients.

fitted.values
    Fitted values of the regression model, computed from the posterior mean of the coefficients after discarding the first 20% of samples as burn-in.

residuals
    Residuals of the regression model, equal to y - fitted.values.
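A short, hedged illustration of working with the returned components listed above (assuming fit is the result of a bayeslm call such as the sketch in the Description):

beta_hat  <- colMeans(fit$beta)   # posterior mean coefficients
sigma_hat <- mean(fit$sigma)      # posterior mean residual standard error

plot(fit$sigma, type = "l", ylab = "sigma")    # trace plot of the sigma draws
hist(fit$vglobal, main = "global shrinkage")   # posterior of vglobal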
Note

horseshoe is essentially a call to bayeslm with prior = "horseshoe". The same holds for sharkfin, ridge, blasso, and nonlocal.
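So, assuming the wrappers mirror the argument names of bayeslm's default method, the two calls below should be equivalent:

fit_wrap <- horseshoe(Y = y, X = x)                     # convenience wrapper
fit_full <- bayeslm(Y = y, X = x, prior = "horseshoe")  # equivalent call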
Author(s)

Jingyu He jingyu.he@chicagobooth.edu
References

Hahn, P. Richard, Jingyu He, and Hedibert Lopes (2017). "Efficient sampling for Gaussian linear regression with arbitrary priors."
Examples

p = 20
n = 100
kappa = 1.25
beta_true = c(c(1, 2, 3), rnorm(p - 3, 0, 0.01))
sig_true = kappa * sqrt(sum(beta_true^2))
x = matrix(rnorm(p * n), n, p)
y = x %*% beta_true + sig_true * rnorm(n)
x = as.matrix(x)
y = as.matrix(y)
data = data.frame(x = x, y = y)

# slice-within-Gibbs sampler: put every coefficient in its own block
block_vec = rep(1, p)

fitOLS = lm(y ~ x - 1)

# call the function using formulas
fita = bayeslm(y ~ x, prior = 'horseshoe', block_vec = block_vec,
               N = 10000, burnin = 2000)
# summarize the results
summary(fita)
summary(fita$beta)

# put the first two coefficients in one elliptical sampling block
block_vec2 = c(2, rep(1, p - 2))
fitb = bayeslm(y ~ x, data = data, prior = 'horseshoe',
               block_vec = block_vec2, N = 10000, burnin = 2000)

# compare several different priors
fit1 = bayeslm(y, x, prior = 'horseshoe', icept = FALSE,
               block_vec = block_vec, N = 10000, burnin = 2000)
beta_est1 = colMeans(fit1$beta)
fit2 = bayeslm(y, x, prior = 'laplace', icept = FALSE,
               block_vec = block_vec, N = 10000, burnin = 2000)
beta_est2 = colMeans(fit2$beta)
fit3 = bayeslm(y, x, prior = 'ridge', icept = FALSE,
               block_vec = block_vec, N = 10000, burnin = 2000)
beta_est3 = colMeans(fit3$beta)
fit4 = bayeslm(y, x, prior = 'sharkfin', icept = FALSE,
               block_vec = block_vec, N = 10000, burnin = 2000)
beta_est4 = colMeans(fit4$beta)
fit5 = bayeslm(y, x, prior = 'nonlocal', icept = FALSE,
               block_vec = block_vec, N = 10000, burnin = 2000)
beta_est5 = colMeans(fit5$beta)

# plot posterior mean estimates against the true coefficients
plot(NULL, xlim = range(beta_true), ylim = range(beta_true),
     xlab = "beta true", ylab = "estimation")
points(beta_true, beta_est1, pch = 20)
points(beta_true, fitOLS$coef, col = 'red')
points(beta_true, beta_est2, pch = 20, col = 'cyan')
points(beta_true, beta_est3, pch = 20, col = 'orange')
points(beta_true, beta_est4, pch = 20, col = 'pink')
points(beta_true, beta_est5, pch = 20, col = 'lightgreen')
legend("topleft",
       c("OLS", "horseshoe", "laplace", "ridge", "sharkfin", "nonlocal"),
       col = c("red", "black", "cyan", "orange", "pink", "lightgreen"),
       pch = c(1, rep(20, 5)))  # match the plotting symbols used above
abline(0, 1, col = 'red')

# root of the summed squared estimation errors for each method
rmseOLS = sqrt(sum((fitOLS$coef - beta_true)^2))
rmse1 = sqrt(sum((beta_est1 - beta_true)^2))
rmse2 = sqrt(sum((beta_est2 - beta_true)^2))
rmse3 = sqrt(sum((beta_est3 - beta_true)^2))
rmse4 = sqrt(sum((beta_est4 - beta_true)^2))
rmse5 = sqrt(sum((beta_est5 - beta_true)^2))
print(cbind(ols = rmseOLS, hs = rmse1, laplace = rmse2,
            ridge = rmse3, sharkfin = rmse4, nonlocal = rmse5))