glmb: Fitting Bayesian Generalized Linear Models
In knygren/glmbayes: Bayesian Generalized Linear Models (iid Samples)

Description Usage Arguments Details Value Author(s) References See Also Examples

glmb is used to fit Bayesian generalized linear models, specified by giving a symbolic descriptions of the linear predictor, the error distribution, and the prior distribution.

glmb(
  formula,
  family = binomial,
  pfamily = dNormal(mu, Sigma, dispersion = 1),
  n = 1000,
  data,
  weights,
  subset,
  offset,
  na.action,
  Gridtype = 1,
  start = NULL,
  etastart,
  mustart,
  control = list(...),
  model = TRUE,
  method = "glm.fit",
  x = FALSE,
  y = TRUE,
  contrasts = NULL,
  ...
)

## S3 method for class 'glmb'
print(x, digits = max(3, getOption("digits") - 3), ...)

`formula`	an object of class `"formula"` (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.
`family`	a description of the error distribution and link function to be used in the model. For `glm` this can be a character string naming a family function, a family function or the result of a call to a family function. For `glm.fit` only the third option is supported. (See `family` for details of family functions.)
`pfamily`	a description of the prior distribution and associated constants to be used in the model. This should be a pfamily function (see `pfamily` for details of pfamily functions).
`n`	number of draws to generate. If `length(n) > 1`, the length is taken to be the number required.
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `glm` is called.
`weights`	an optional vector of ‘prior weights’ to be used in the fitting process. Should be `NULL` or a numeric vector.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length equal to the number of cases. One or more `offset` terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See documentation for `model.offset` at `model.extract`.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The ‘factory-fresh’ default is `stats{na.omit}`. Another possible value is `NULL`, no action. Value `stats{na.exclude}` can be useful.
`Gridtype`	an optional argument specifying the method used to determine the number of tangent points used to construct the enveloping function.
`start`	starting values for the parameters in the linear predictor.
`etastart`	starting values for the linear predictor.
`mustart`	starting values for the vector of means.
`control`	a list of parameters for controlling the fitting process. For `glm.fit` this is passed to `glm.control`.
`model`	a logical value indicating whether model frame should be included as a component of the returned value.
`method`	the method to be used in fitting the model. The default method `"glm.fit"` uses iteratively reweighted least squares (IWLS): the alternative `"model.frame"` returns the model frame and does no fitting. User-supplied fitting functions can be supplied either as a function or a character string naming a function, with a function which takes the same arguments as `glm.fit`. If specified as a character string it is looked up from within the stats namespace.
`x`	For `glm`: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For `glm.fit`: `x` is a design matrix of dimension `n * p`, and `y` is a vector of observations of length `n`.
`y`	For `glm`: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For `glm.fit`: `x` is a design matrix of dimension `n * p`, and `y` is a vector of observations of length `n`.
`contrasts`	an optional list. See the `contrasts.arg` of `model.matrix.default`.
`...`	For `glm`: arguments to be used to form the default `control` argument if it is not supplied directly. For `weights`: further arguments passed to or from other methods.
`digits`	the number of significant digits to use when printing.

The function glmb is a Bayesian version of the classical glm function. Setup of the models mirrors those for the glm function but add additional required arguments mu and Sigma, representing a multivariate normal prior. In addition, the dispersion parameter must currently be provided for the gaussian, Gamma, quasipoisson, and quasibinomial families (future implementations may incoporate priors for these into the glmb function). The function generates random iid samples from the posterior density associated with the model (assuming a fixed dispersion parameter).

Current implemented models include the gaussian family (identity link function), the poisson and quasipoisson families (log link function), the gamma family (log link function), as well as the binomial and quasibinomial families (logit, probit, and cloglog link functions).

The function returns the output from a call to the function glm as well as the simulated Bayesian coefficients and associated outputs. In addition, the function returns model diagnostic information related to the DIC, a Bayesian Information Criterion similar to the AIC for classical models.

For the gaussian family, iid samples from the posterior density are genererated using standard simulation procedures for multivariate normal densities. For all other families, the samples are generated using accept-reject procedures by leveraging the likelihood subgradient approach of Nygren and Nygren (2006). This approach relies on tight enveloping functions that bound the posterior density from above. The Gridtype parameter is used to select the method used for determining the size of a Grid used to build the enveloping function. See EnvelopeBuild for details. Depending on the selection, the time to build the envelope and the acceptance rate during the simulation process may vary. The returned value iters contains the number of candidates generated before acceptance for each draw.

glmb returns an object of class "glmb". The function summary (i.e., summary.glmb) can be used to obtain or print a summary of the results. The generic accessor functions coefficients, fitted.values, residuals, and extractAIC can be used to extract various useful features of the value returned by codeglmb.

An object of class "glmb" is a list containing at least the following components:

`glm`	an object of class `"glm"` containing the output from a call to the function `glm`
`coefficients`	a matrix of dimension `n` by `length(mu)` with one sample in each row
`coef.means`	a vector of `length(mu)` with the estimated posterior mean coefficients
`coef.mode`	a vector of `length(mu)` with the estimated posterior mode coefficients
`dispersion`	Either a constant provided as part of the call, or a vector of length `n` with one sample in each row.
`Prior`	A list with the priors specified for the model in question. Items in list may vary based on the type of prior
`fitted.values`	a matrix of dimension `n` by `length(y)` with one sample for the mean fitted values in each row
`family`	the `family` object used.
`linear.predictors`	an `n` by `length(y)` matrix with one sample for the linear fit on the link scale in each row
`deviance`	an `n` by `1` matrix with one sample for the deviance in each row
`pD`	An Estimate for the effective number of parameters
`Dbar`	Expected value for minus twice the log-likelihood function
`Dthetabar`	Value of minus twice the log-likelihood function evaluated at the mean value for the coefficients
`DIC`	Estimated Deviance Information criterion
`prior.weights`	a vector of weights specified or implied by the model
`y`	a vector with the dependent variable
`x`	a matrix with the implied design matrix for the model
`model`	if requested (the default),the model frame
`call`	the matched call
`formula`	the formula supplie
`terms`	the `terms` object used
`data`	the `data argument`
`famfunc`	Family functions used during estimation process
`iters`	an `n` by `1` matrix giving the number of candidates generated before acceptance for each sample.
`contrasts`	(where relevant) the contrasts used.
`xlevels`	(where relevant) a record of the levels of the factors used in fitting
`digits`	the number of significant digits to use when printing.

In addition, non-empty fits will have (yet to be implemented) components qr, R and effects relating to the final weighted linear fit for the posterior mode. Objects of class "glmb" are normall of class c("glmb","glm","lm"), that is inherit from classes glm and lm and well-designed methods from those classed will be applied when appropriate.

If a binomial glmb model was specified by giving a two-column response, the weights returned by prior.weights are the total number of cases (factored by the supplied case weights) and the component of y of the result is the proportion of successes.

The R implementation of glmb has been written by Kjell Nygren and was built to be a Bayesian version of the glm function and hence tries to mirror the features of the glm function to the greatest extent possible. For details on the author(s) for the glm function see the documentation for glm.

Dobson, A. J. (1990) An Introduction to Generalized Linear Models. London: Chapman and Hall.

Hastie, T. J. and Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole. McCullagh P. and Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.

Nygren, K.N. and Nygren, L.M (2006) Likelihood Subgradient Densities. Journal of the American Statistical Association. vol.101, no.475, pp 1144-1156. doi: 10.1198/016214506000000357.

Raiffa, Howard and Schlaifer, R (1961) Applied Statistical Decision Theory. Boston: Clinton Press, Inc.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer.

lm and glm for classical modeling functions.

pfamily for documentation of pfamily functions used to specify priors.

Prior_Setup, Prior_Check for functions used to initialize and to check priors,

summary.glmb, predict.glmb, residuals.glmb, simulate.glmb, extractAIC.glmb, dummy.coef.glmb and methods(class="glmb") for glmb and the methods and generic functions for classes glm and lm from which class glmb inherits.

Other glmbayes modeling functions: lmb(), rglmb(), rlmb()

set.seed(333)
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
mysd<-1
mu<-matrix(0,5)
mu[1]=log(mean(counts))
V0<-((mysd)^2)*diag(5)
glmb.D93<-glmb(counts ~ outcome + treatment, family = poisson(),pfamily=dNormal(mu=mu,Sigma=V0))
summary(glmb.D93)

# Menarche Binomial Data Example 
data(menarche2)
Age2=menarche2$Age-13

#Priors Are Derived in Vignette
nvars=2
## Logit Prior and Model
mu1=as.matrix(c(0,1.098612),ncol=1)
V1<-1*diag(nvars)
V1[1,1]=0.18687882
V1[2,2]=0.10576217
V1[1,2]=-0.03389182
V1[2,1]=-0.03389182
glmb.out1<-glmb(cbind(Menarche, Total-Menarche) ~ Age2,
family=binomial(logit),pfamily=dNormal(mu=mu1,Sigma=V1), data=menarche2)
summary(glmb.out1)

## Probit Prior and Model
mu2=as.matrix(c(0,0.6407758),ncol=1)
V2<-1*diag(nvars)
V2[1,1]=0.07158369
V2[2,2]=0.03453205
V2[1,2]=-0.01512075
V2[2,1]=-0.01512075
glmb.out2<-glmb(cbind(Menarche, Total-Menarche) ~ Age2,
family=binomial(probit),pfamily=dNormal(mu=mu2,Sigma=V2), data=menarche2)
summary(glmb.out2)

## clog-log Prior and Model
mu2=as.matrix(c(-0.3665129,0.6002727),ncol=1)
V2<-1*diag(nvars)
V2[1,1]=0.11491322
V2[2,2]=0.03365986
V2[1,2]=-0.03502783
V2[2,1]=-0.03502783
glmb.out3<-glmb(cbind(Menarche, Total-Menarche) ~ Age2,
                family=binomial(cloglog),pfamily=dNormal(mu=mu2,Sigma=V2), data=menarche2)
summary(glmb.out3)

DIC_Out=rbind(extractAIC(glmb.out1),extractAIC(glmb.out2),extractAIC(glmb.out3))
rownames(DIC_Out)=c("logit","probit","clog-log")
colnames(DIC_Out)=c("pD","DIC")
DIC_Out