rindependent_norm_gamma_reg: The Bayesian Indendepent Normal-Gamma Regression Distribution
In knygren/glmbayes: Bayesian Generalized Linear Models (iid Samples)

Description Usage Arguments Details Value See Also Examples

View source: R/rIndependent_Normal_Gamma_reg.R

Generates iid samples from the posterior density for the independent normal-gamma regression

rindependent_norm_gamma_reg(
  n,
  y,
  x,
  prior_list,
  offset = NULL,
  weights = 1,
  family = gaussian(),
  max_disp_perc = 0.99
)

rindependent_norm_gamma_reg_v3(
  n,
  y,
  x,
  prior_list,
  offset = NULL,
  weights = 1,
  family = gaussian(),
  max_disp_perc = 0.99
)

rindependent_norm_gamma_reg_v2(
  n,
  y,
  x,
  prior_list,
  offset = NULL,
  weights = 1,
  family = gaussian(),
  max_disp_perc = 0.99
)

rindependent_norm_gamma_reg_v4(
  n,
  y,
  x,
  prior_list,
  offset = NULL,
  weights = 1,
  family = gaussian(),
  max_disp_perc = 0.99
)

`n`	number of draws to generate. If `length(n) > 1`, the length is taken to be the number required.
`y`	For `glm`: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For `glm.fit`: `x` is a design matrix of dimension `n * p`, and `y` is a vector of observations of length `n`.
`x`	For `glm`: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For `glm.fit`: `x` is a design matrix of dimension `n * p`, and `y` is a vector of observations of length `n`.
`prior_list`	a list with the prior parameters (mu, Sigma, shape and rate) for the prior distribution.
`offset`	an offset parameter
`weights`	a weighting variable
`family`	a description of the error distribution and link function to be used in the model. For `glm` this can be a character string naming a family function, a family function or the result of a call to a family function. For `glm.fit` only the third option is supported. (See `family` for details of family functions.)
`max_disp_perc`	parameter currently used to control upper bound in accept-reject procedure

The rindependent_norm_gamma_reg function produces iid samples for the Bayesian the Bayesian Independent Normal-Gamma Regression Distribution associated with the gaussian() family.

As this is a non-conjugate prior specification with a random dispersion parameter, we leverage methods that build on those developed for log-concave likelihood functions with a fixed dispersion parameter. A number of steps are first used in order to position an initial enveloping function for the main regression coefficients near the center of the posterior density and simultaneoulsy finding a modified Gamma distribution from which to generate candidate samples for the inverse dispersion.

During the simulation procedure, candidates are generated independently from the modified gamma distribution and a specific likelihood subgradient density (corresponding to the center of the density). A series of bounding functions are then used during the simulation process to determine if specific candidates should be accepted as follows

1) Similar to the procedure in the fixed dispersion case, the (modified) log-likelihood function is bounded using the value of the log-likelihood function at a specific point (which now depends on the dispersion and is picked to ensure the gradient vector is constant across simulated dispersion candidates) and the gradient at that value (which is kept fixed for each component in the grid regardless of dispersion.

2) A portion of the bounding function above is in turn bounded by using a term that has been shifted to rate parameter of the modified gamma candidate generating distribution.

3) Because candidates

matrix and a prior specification. The function returns the simulated Bayesian coefficients and some associated outputs. The iid samples from the posterior density is genererated using standard simulation procedures for multivariate normal and gamma distributions.

rindependent_norm_gamma_reg returns a object of class "rglmb". The function summary (i.e., summary.rglmb) can be used to obtain or print a summary of the results. The generic accessor functions coefficients, fitted.values, residuals, and extractAIC can be used to extract various useful features of the value returned by rNormal_Gamma_reg. An object of class "rglmb" is a list containing at least the following components:

`coefficients`	a `n` by `length(mu)` matrix with one sample in each row
`PostMode`	a vector of `length(mu)` with the estimated posterior mode coefficients
`Prior`	A list with two components. The first being the prior mean vector and the second the prior precision matrix
`iters`	an `n` by `1` matrix giving the number of candidates generated before acceptance for each sample.
`famfunc`	an object of class `"famfunc"`
`Envelope`	an object of class `"envelope"`
`dispersion`	an `n` by `1` matrix with simulated values for the dispersion
`loglike`	a `n` by `1` matrix containing the negative loglikelihood for each sample.

Other family simulation functions: rGamma_reg(), rNormal_Gamma_reg(), rNormal_reg()

## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group,y=TRUE,x=TRUE)

p_setup=Prior_Setup(lm.D9)
dispersion=sigma(lm.D9)^2

## Initialize dispersion
p=1/dispersion

mu=p_setup$mu
Sigma_prior=p_setup$Sigma
y=lm.D9$y
x=lm.D9$x

Prior_Check(weight ~ group,gaussian(),dNormal(mu=mu,Sigma=Sigma_prior))

## This simple "standardization" for the prior appears to 
## correct the issue with the prior

## "Standardize" the prior for the intercept by setting the
## mean equal to the mean of the dependent variable
## and setting the variance equal to the variance of the dependent variable

mu[1,1]=mean(y)
Sigma_prior[1,1]=var(y)

## For factors, set the "standad prior to mean=0 and variance driven by the range
## of the dependent variable 

mu[2,1]=0
Sigma_prior[2,2]=((max(y)-min(y))/1.96)^2

#Prior_Check(lm.D9,mu,Sigma_prior)
Prior_Check(weight ~ group,gaussian(),dNormal(mu=mu,Sigma=Sigma_prior))


#prior=list(mu=mu,Sigma=Sigma_prior,dispersion=dispersion)
glmb.D9=glmb(weight~group, family=gaussian(),dNormal(mu=mu,Sigma=Sigma_prior,dispersion=dispersion))
post_mode=glmb.D9$coef.mode

sum_out1=summary(glmb.D9)

## Try mean one standard deviation away to see how it works
#mu[1,1]=mu[1,1]-2*sum_out1$coefficients[1,3]
#mu[2,1]=mu[2,1]+2*sum_out1$coefficients[1,3]

#prior=list(mu=mu,Sigma=Sigma_prior,dispersion=dispersion)

# Temporarily lower the prior variance
Sigma_prior=1*Sigma_prior

glmb.D9_v2=glmb(n=1000,weight~group, family=gaussian(),
dNormal(mu=mu,Sigma=Sigma_prior,dispersion=dispersion))

n_prior=2
shape=n_prior/2
rate= dispersion*shape


summary(glmb.D9)
summary(glmb.D9_v2)


#lm_out=lm(y ~ x-1) # returns same model
RSS=sum(residuals(lm.D9)^2)

n_prior=4
n_data=length(y)
shape=(n_prior/2)
rate=n_prior*RSS/n_data



prior_list=list(mu=mu,Sigma=Sigma_prior,dispersion=dispersion,
                 shape=shape,rate=rate,Precision=solve(Sigma_prior))


set.seed(360)

 ptm <- proc.time()
 sim2=rindependent_norm_gamma_reg(n=1000,y,x,prior_list=prior_list,
offset=NULL,weights=1,max_disp_perc=0.99)
 proc.time()-ptm