R-log-Safe-Bayesian Ridge Regression

Description

The function SBRidgeRlog (R-log-Safe-Bayesian Ridge Regression) provides a Gibbs sampler together with the R-log-Safe-Bayesian algorithm for Ridge regression models with varying variance.

Usage

SBRidgeRlog(y, X = NULL, etaseq = 1, prior = NULL, nIter = 1100, burnIn = 100, 
	thin = 10, minAbsBeta = 1e-09, pIter = TRUE)

Arguments

y

Vector of outcome variables, numeric, NA allowed, length n.

X

Design matrix, numeric, dimension n x p, n >= 2.

etaseq

Vector of learning rates eta, numeric, 0 <= eta <= 1. Default 1.

prior

List containing the following elements

  • prior$varE: prior for the variance parameter σ^2, an inverse-chi-square distribution with degrees of freedom $df and scale $S. Default (0,0).

  • prior$varBR: prior for the variance of the Gaussian prior on the coefficients beta, an inverse-chi-square distribution with degrees of freedom $df and scale $S. Default (0,0).

nIter

Number of iterations, integer. Default 1100.

burnIn

Number of iterations for burn-in, integer. Default 100.

thin

Number of iterations for thinning, integer. Default 10.

minAbsBeta

Minimum absolute value of sampled coefficients beta to avoid numerical problems, numeric. Default 10^-9.

pIter

Print iterations, logical. Default TRUE.
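
As a rough illustration of the arguments above, the following sketch builds a prior list with the structure described for prior$varE and prior$varBR, and counts the draws that would be retained under the default nIter, burnIn and thin. The hyperparameter values are arbitrary placeholders, and the retention rule (keep every thin-th iteration after burn-in) is an assumption about the sampler rather than something stated on this page.

```r
# Hypothetical hyperparameter values, chosen only for illustration
prior <- list(
  varE  = list(df = 3, S = 0.25),  # prior on the noise variance sigma^2
  varBR = list(df = 3, S = 0.25)   # prior on the variance of the Gaussian prior on beta
)

# Default MCMC settings from the Usage section
nIter  <- 1100
burnIn <- 100
thin   <- 10

# Assumed retention rule: every thin-th iteration after burn-in
kept   <- seq(from = burnIn + thin, to = nIter, by = thin)
n_kept <- length(kept)

# The list could then be passed on, e.g.
# sbobj <- SBRidgeRlog(y, X, etaseq = c(1, 0.5, 0.25), prior = prior)
```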

Details

Details on generalized Bayesian regression can be found in de Heide (2016). The implementation of the Gibbs sampler is based on the BLR package of de los Campos et al. (2009).

The Safe-Bayesian algorithm was proposed by Grunwald (2012) as a method to learn the learning rate for the generalized posterior to deal with model misspecification.
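
The idea of learning eta can be sketched on a toy problem. The code below is not the Gibbs sampler used by SBRidgeRlog: it assumes a normal mean model with known variance and a N(0, tau2) prior, where the eta-generalized posterior (likelihood raised to the power eta, times the prior) is available in closed form. For each eta it accumulates the posterior-expected log-loss of each observation under the generalized posterior based on the preceding observations, then picks the eta with the smallest total. All names (cum_rlog_loss, sigma2, tau2) are hypothetical.

```r
# Toy sketch: normal mean with known variance, NOT the package's implementation
set.seed(1)
y      <- rnorm(50, mean = 0, sd = 1/4)
sigma2 <- (1/4)^2            # assumed known noise variance
tau2   <- 1                  # assumed prior variance, theta ~ N(0, tau2)
etaseq <- c(1, 0.5, 0.25)    # candidate learning rates

# Cumulative posterior-expected log-loss for one learning rate eta
cum_rlog_loss <- function(eta, y, sigma2, tau2) {
  loss <- 0
  for (i in seq_along(y)) {
    prev <- y[seq_len(i - 1)]
    # eta-generalized posterior given the first i - 1 points: N(m, v)
    v <- 1 / (1 / tau2 + eta * length(prev) / sigma2)
    m <- v * eta * sum(prev) / sigma2
    # E_theta[-log N(y_i | theta, sigma2)] in closed form
    loss <- loss + 0.5 * log(2 * pi * sigma2) +
      ((y[i] - m)^2 + v) / (2 * sigma2)
  }
  loss
}

losses  <- sapply(etaseq, cum_rlog_loss, y = y, sigma2 = sigma2, tau2 = tau2)
eta_min <- etaseq[which.min(losses)]
```

Here losses plays the role of the per-eta cumulative loss and eta_min the role of the selected learning rate. In the ridge regression setting the posterior expectations are not available in closed form, which is why a sampler is needed.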

Value

$y

Vector of original outcome variables.

$mu

Posterior mean of the intercept.

$varE

Posterior mean of the variance.

$yHat

Posterior mean of mu + X*beta + epsilon.

$SD.yHat

Corresponding standard deviation.

$whichNa

Vector with indices of missing values of y.

$fit$pD

Estimated number of effective parameters.

$fit$DIC

Deviance Information Criterion.

$bR

Posterior mean of beta.

$SD.bR

Corresponding standard deviation.

$prior

List containing the priors used.

$nIter

Number of iterations.

$burnIn

Number of iterations for burn-in.

$thin

Number of iterations for thinning.

$CMRlogEallen

List of cumulative posterior-expected posterior-randomized log-loss per eta.

$eta.min

Learning rate eta minimizing the cumulative posterior-expected posterior-randomized log-loss.

Author(s)

R. de Heide

References

de Heide, R. 2016. The Safe-Bayesian Lasso. Master's thesis, Leiden University.

de los Campos G., H. Naya, D. Gianola, J. Crossa, A. Legarra, E. Manfredi, K. Weigel and J. Cotes. 2009. Predicting Quantitative Traits with Regression Models for Dense Molecular Markers and Pedigree. Genetics 182: 375-385.

Grunwald, P.D. 2012. The Safe Bayesian. In: Algorithmic Learning Theory: 23rd International Conference, ALT 2012, Lyon, France, October 29-31, 2012. Proceedings, pp. 169-183. Springer Berlin Heidelberg.

Examples

# Simulate data
n <- 10
x <- runif(n, -1, 1) # 10 random uniform x's between -1 and 1

# for each x, a y that is 0 + Gaussian noise
y <- rnorm(n, mean = 0, sd = 1/4)

plot(x,y)

## Not run: 
# Let R-log-SafeBayes learn the learning rate 
sbobj <- SBRidgeRlog(y, x, etaseq=c(1, 0.5, 0.25))

# eta 
sbobj$eta.min
## End(Not run)