blasso: Bayesian Lasso/NG, Horseshoe and Ridge Regression

Inference for ordinary least squares, lasso/NG, horseshoe and ridge regression models by (Gibbs) sampling from the Bayesian posterior distribution, augmented with Reversible Jump for model selection.
bhs(X, y, T = 1000, thin = NULL, RJ = TRUE, M = NULL, beta = NULL,
    lambda2 = 1, s2 = var(y-mean(y)), mprior = 0, ab = NULL,
    theta = 0, rao.s2 = TRUE, icept = TRUE, normalize = TRUE, verb = 1)

bridge(X, y, T = 1000, thin = NULL, RJ = TRUE, M = NULL,
       beta = NULL, lambda2 = 1, s2 = var(y-mean(y)), mprior = 0,
       rd = NULL, ab = NULL, theta = 0, rao.s2 = TRUE, icept = TRUE,
       normalize = TRUE, verb = 1)

blasso(X, y, T = 1000, thin = NULL, RJ = TRUE, M = NULL,
       beta = NULL, lambda2 = 1, s2 = var(y-mean(y)),
       case = c("default", "ridge", "hs", "ng"), mprior = 0, rd = NULL,
       ab = NULL, theta = 0, rao.s2 = TRUE, icept = TRUE,
       normalize = TRUE, verb = 1)
X: data.frame, matrix, or vector of inputs X.

y: vector of output responses y of length equal to the number of rows of X.

T: total number of MCMC samples to be collected.

thin: number of MCMC samples to skip before a sample is collected (via thinning). If NULL (default), the thinning level is determined automatically from the regression model implied by RJ, lambda2, and ncol(X), and from the errors model implied by theta and nrow(X).

RJ: if TRUE, then model selection on the columns of the design matrix (and thus the entries of beta) is performed by Reversible Jump (RJ) MCMC. The initial model is specified by the beta input, described below, and the maximal number of covariates in the model is specified by M.

M: the maximal number of allowed covariates (columns of X) in the model. If lambda2 > 0 then any M <= ncol(X) is allowed; otherwise it must be that M <= min(ncol(X), length(y)-1), which is the default when M = NULL.

beta: initial setting of the regression coefficients. Any zero-components imply that the corresponding covariate (column of X) is not in the initial model. When RJ = FALSE, only the non-zero entries of beta are sampled.

lambda2: square of the initial lasso penalty parameter. If zero, then least squares regressions are used.

s2: initial variance parameter.

case: specifies whether ridge regression ("ridge"), the Normal-Gamma ("ng"), or the horseshoe ("hs") prior should be used instead of the lasso; only meaningful when lambda2 > 0.

mprior: prior on the number of non-zero regression coefficients (and therefore covariates) m in the model. The default mprior = 0 encodes a uniform prior on m; a scalar 0 < mprior < 1 implies a Binomial prior Bin(m | M, mprior); a 2-vector mprior = c(g, h) of positive values gives a Bin(m | M, p) prior with p ~ Beta(g, h).

rd: = c(r, delta), the shape and rate parameters of the Gamma prior G(r, delta) on lambda2 under the lasso model, or the parameters of the inverse-gamma prior IG(r/2, delta/2) on lambda2 under the ridge model. The default NULL generates appropriate non-informative values depending on the nature of the regression.

ab: = c(a, b), the shape and scale parameters of the inverse-gamma prior IG(a, b) on the variance parameter s2. The default NULL generates appropriate non-informative values depending on the nature of the regression.

theta: the rate parameter (> 0) of the exponential prior on the degrees-of-freedom parameter nu under a Student-t errors model, implemented via scale mixtures. The default theta = 0 turns the Student-t model off, giving normal errors.

rao.s2: indicates whether Rao-Blackwellized samples of \sigma^2 should be used (default TRUE); see the details below.

icept: if TRUE, an implicit intercept term is fit in the model; otherwise the model is forced through the origin.

normalize: if TRUE, each variable is standardized to have unit L2-norm; otherwise it is left alone (default TRUE).

verb: verbosity level; currently only verb = 0 and verb = 1 are supported.
The Bayesian lasso model and Gibbs Sampling algorithm are described in detail in Park & Casella (2008). The algorithm implemented by this function is identical to the one described therein, with the exception of an added "option" to use a Rao-Blackwellized sample of \sigma^2 (with \beta integrated out) for improved mixing, and the model selection by RJ described below.
When the input argument lambda2 = 0 is supplied, the model is a simple hierarchical linear model where (\beta, \sigma^2) is given a Jeffreys prior.
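For illustration, a minimal sketch of this usage (the simulated X and y below are hypothetical, not part of the package's examples):

## hypothetical simulated data for illustration
set.seed(1)
X <- matrix(rnorm(100*5), ncol=5)
y <- drop(X %*% c(1, -1, 0, 0, 2)) + rnorm(100)
## lambda2 = 0 with RJ = FALSE samples from the simple
## hierarchical linear model under the Jeffreys prior
fit.ols <- blasso(X, y, lambda2=0, RJ=FALSE, T=1000, verb=0)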
Specifying RJ = TRUE causes Bayesian model selection and averaging to commence for choosing which of the columns of the design matrix X (and thus which entries of beta) should be included in the model. The zero-components of the beta input specify which columns are excluded from the initial model, and M specifies the maximal number of columns.
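A minimal sketch of RJ model selection, continuing with the hypothetical X and y above: an all-zero beta starts from an empty initial model, and M caps the model at 4 covariates.

## RJ model selection: empty initial model, at most 4 covariates
fit.rj <- blasso(X, y, RJ=TRUE, beta=rep(0, ncol(X)), M=4, T=1000, verb=0)
## posterior distribution over the number of non-zero coefficients
table(fit.rj$m)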
The RJ mechanism implemented here for Bayesian lasso model selection differs from the one described by Hans (2009), which is based on an idea from Geweke (1996). Those methods require departing from the Park & Casella (2008) latent-variable model and sampling from each conditional \beta_i | \beta_{(-i)}, ... for all i, since a mixture prior with a point-mass at zero is placed on each \beta_i. Our implementation requires no such special prior and retains joint sampling of the full vector of non-zero \beta entries, which we believe yields better mixing in the Markov chain. RJ proposals to increase or decrease the number of non-zero entries do proceed component-wise, but acceptance rates are high due to marginalized between-model moves (Troughton & Godsill, 1997).
When the lasso prior or RJ is used, the automatic thinning level (unless thin != NULL) is determined by the number of columns of X, since this many latent variables are introduced.
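The automatic level can be bypassed by supplying thin directly; for example, in a sketch on the hypothetical data above, thin=1 keeps every sample (at the risk of poorer mixing):

## override the automatic thinning level; thin=1 keeps every sample
fit.unthinned <- blasso(X, y, thin=1, T=1000, verb=0)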
Bayesian ridge regression is implemented as a special case via the bridge function, which essentially calls blasso with case = "ridge". A default setting of rd = c(0,0) is implied by rd = NULL, giving the Jeffreys prior for the penalty parameter \lambda^2, unless ncol(X) >= length(y), in which case the proper specification rd = c(5,10) is used instead.
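A sketch of the ridge special case, again on the hypothetical data above:

## ridge regression via bridge(); equivalent to blasso(X, y, case="ridge")
fit.ridge <- bridge(X, y, T=1000, verb=0)
plot(fit.ridge, burnin=200, which="lambda2")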
The Normal-Gamma prior (Griffin & Brown, 2009) is implemented as an extension to the Bayesian lasso with case = "ng". Many thanks to James Scott for providing the code needed to extend the method(s) to use the horseshoe prior (Carvalho, Polson, and Scott, 2010).
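A sketch of both extensions on the hypothetical data above:

## Normal-Gamma prior via the case argument
fit.ng <- blasso(X, y, case="ng", T=1000, verb=0)
## horseshoe prior via the bhs() convenience function,
## equivalent to blasso(X, y, case="hs")
fit.hs <- bhs(X, y, T=1000, verb=0)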
When theta > 0, Student-t errors are applied as an extension to the Bayesian lasso/ridge model, via the scale mixtures (and thereby extra latent variables omega2) of Geweke (1993). If Student-t errors are used, the automatic thinning level is augmented (unless thin != NULL) by the number of rows in X, since this many latent variables are introduced.
blasso returns an object of class "blasso", which is a list containing a copy of all of the input arguments as well as the components listed below.
call: a copy of the function call as used.

mu: a vector of T samples of the (un-penalized) intercept parameter.

beta: a T x ncol(X) matrix of samples of the (penalized) regression coefficients.

m: the number of non-zero entries in each of the T sampled beta vectors.

s2: a vector of T samples of the variance parameter.

lambda2: a vector of T samples of the penalty parameter.

gamma: a vector of T samples of the gamma parameter; present when case = "ng".

tau2i: a T x ncol(X) matrix of samples of the (latent) inverse diagonal of the prior covariance matrix for beta.

omega2: a T x nrow(X) matrix of samples of the (latent) diagonal of the covariance matrix of the response, implementing the scale mixture for Student-t errors; present only when theta > 0.

nu: a vector of T samples of the degrees-of-freedom parameter of the Student-t errors model; present only when theta > 0.

pi: a vector of T samples of the Binomial proportion p, which is given a Beta prior under the 2-vector form of the mprior argument.

lpost: the log posterior probability of each (saved) sample of the joint parameters.

llik: the log likelihood of each (saved) sample of the parameters.

llik.norm: the log likelihood of each (saved) sample of the parameters under the normal errors model when sampling under the Student-t model; not present unless theta > 0.
Whenever ncol(X) >= nrow(X), it must be that either RJ = TRUE with M <= nrow(X)-1 (the default) or that the lasso is turned on with lambda2 > 0; otherwise the regression problem is ill-posed.
Since the starting values are considered to be the first sample (of T), the total number of (new) samples obtained by Gibbs Sampling will be T-1.
Robert B. Gramacy rbg@vt.edu
Park, T. and Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association, 103(482), pp. 681-686. doi:10.1198/016214508000000337

Griffin, J.E. and Brown, P.J. (2009). Inference with Normal-Gamma prior distributions in regression problems. Bayesian Analysis, 5, pp. 171-188. doi:10.1214/10-BA507

Hans, C. (2009). Bayesian Lasso regression. Biometrika, 96, pp. 835-845. doi:10.1093/biomet/asp047

Carvalho, C.M., Polson, N.G., and Scott, J.G. (2010). The horseshoe estimator for sparse signals. Biometrika, 97(2), pp. 465-480. https://faculty.chicagobooth.edu/nicholas.polson/research/papers/Horse.pdf

Geweke, J. (1996). Variable selection and model comparison in regression. In Bayesian Statistics 5, eds. J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith, pp. 609-620. Oxford University Press.

Troughton, P.T. and Godsill, S.J. (1997). A reversible jump sampler for autoregressive time series, employing full conditionals to achieve efficient model space moves. Technical Report CUED/F-INFENG/TR.304, Cambridge University Engineering Department.

Geweke, J. (1993). Bayesian treatment of the independent Student-t linear model. Journal of Applied Econometrics, 8, S19-S40.
https://bobby.gramacy.com/r_packages/monomvn/
lm, lars in the lars package, regress, lm.ridge in the MASS package
## following the lars diabetes example
library(monomvn)
data(diabetes, package="lars")
attach(diabetes)
## Ordinary Least Squares regression
reg.ols <- regress(x, y)
## Lasso regression
reg.las <- regress(x, y, method="lasso")
## Bayesian Lasso regression
reg.blas <- blasso(x, y)
## summarize the beta (regression coefficients) estimates
plot(reg.blas, burnin=200)
points(drop(reg.las$b), col=2, pch=20)
points(drop(reg.ols$b), col=3, pch=18)
legend("topleft", c("blasso-map", "lasso", "lsr"),
col=c(2,2,3), pch=c(21,20,18))
## plot the size of different models visited
plot(reg.blas, burnin=200, which="m")
## get the summary
s <- summary(reg.blas, burnin=200)
## calculate the probability that each beta coef != zero
s$bn0
## summarize s2
plot(reg.blas, burnin=200, which="s2")
s$s2
## summarize lambda2
plot(reg.blas, burnin=200, which="lambda2")
s$lambda2
## Not run:
## fit with Student-t errors
## (~400-times slower due to automatic thinning level)
regt.blas <- blasso(x, y, theta=0.1)
## plotting some information about nu, and quantiles
plot(regt.blas, "nu", burnin=200)
quantile(regt.blas$nu[-(1:200)], c(0.05, 0.95))
## Bayes Factor shows strong evidence for Student-t model
mean(exp(regt.blas$llik[-(1:200)] - regt.blas$llik.norm[-(1:200)]))
## End(Not run)
## clean up
detach(diabetes)