quap: Compute quadratic approximate posterior distribution
In rmcelreath/rethinking: Statistical Rethinking book package

View source: R/map-quap.r

quap	R Documentation

Compute quadratic approximate posterior distribution

Description

Find mode of posterior distribution for arbitrary fixed effect models and then produce an approximation of the full posterior using the quadratic curvature at the mode.

Usage

quap( flist , data , start , method="BFGS" , hessian=TRUE , debug=FALSE , verbose=FALSE , ... )
map( flist , data , start , method="BFGS" , hessian=TRUE , debug=FALSE , verbose=FALSE , ... )

Arguments

`flist`	A formula or `alist` of formulas that define the likelihood and priors. See details.
`data`	A data frame or list containing the data
`start`	Optional named list specifying parameters and their initial values
`method`	Search method for `optim`. Defaults to `BFGS`.
`hessian`	If `TRUE`, attempts to compute the Hessian
`debug`	If `TRUE`, prints various internal steps to help with debugging
`...`	Additional arguments to pass to `optim`

Details

This command provides a convenient interface for finding quadratic approximations of posterior distributions for models defined by a formula or by a list of formulas. This procedure is equivalent to penalized maximum likelihood estimation and the use of a Hessian for profiling, and therefore can be used to define many common regularization procedures. The point estimates returned correspond to a maximum a posterior, or MAP, estimate. However the intention is that users will use extract.samples and other functions to work with the full posterior.

flist should be a either a single formula that defines the likelihood function or rather a list of formulas that define the likelihood and priors for parameters. See examples below.

Likelihood formulas take the form y ~ dfoo(bar), where y is the outcome variable, dfoo is a density function such as dnorm, and bar is a parameter of the density.

Prior formulas take the same form, but the outcome should be a parameter name. Identical priors can be defined for multiple parameters by using c(par1,par2,...) on the left hand side of the formula. See example below.

Linear models can be specified as formulas of the form mu <- a + b*x for a direct link. To use a link function, use the form link(mu) <- a + b*x or mu <- invlink(a + b*x). The names "link" and "invlink" must be recognized by map. It currently recognizes log/exp and logit/logistic. Any other link function can be coded directly into the likelihood formula. For example y ~ dfoo(invlink(mu)).

The start list is optional. For each parameter with a defined prior, an initial value will be sampled from the prior. Explicit named initial values can also be provided in this list.

Methods are defined for coef, summary, logLik, vcov, nobs, deviance, link, DIC, and show.

Value

Returns an object of class map with the following slots.

`call`	The function call
`coef`	The estimated posterior modes
`vcov`	Variance-covariance matrix
`optim`	List returned by `optim`
`data`	The data
`formula`	Formula list
`fminuslogl`	Minus log-likelihood function that accepts a vector of parameter values

Author(s)

Richard McElreath

Examples

data(cars)
flist <- alist(
    dist ~ dnorm( mu , sigma ) ,
    mu <- a+b*speed ,
    c(a,b) ~ dnorm(0,1) , 
    sigma ~ dexp(1)
)
fit <- quap( flist , start=list(a=40,b=0.1,sigma=20) , data=cars )

# regularized logistic regression example
y <- c( rep(0,10) , rep(1,10) )
x <- c( rep(-1,9) , rep(1,11) )

flist0 <- alist(
    y ~ dbinom( 1 , p ) ,
    logit(p) <- a + b*x
)

flist1 <- alist(
    y ~ dbinom( 1 , p ),
    logit(p) <- a + b*x ,
    c(a,b) ~ dnorm(0,10)
)

# without priors, same problem as:
# glm3a <- glm( y ~ x , family=binomial , data=list(y=y,x=x) )
fit3a <- quap( flist0 , data=list(y=y,x=x) , start=list(a=0,b=0) )
precis(fit3a)

# now with regularization
fit3b <- quap( flist1 , data=list(y=y,x=x) , start=list(a=0,b=0) )
precis(fit3b)

# vector parameters
data(UCBadmit)
fit4 <- quap(
    alist(
        admit ~ dbinom( apps , p ),
        logit(p) <- a[dept_id] + b*male,
        a[dept_id] ~ dnorm(0,4),
        b ~ dnorm(0,1)
    ),
    data=list(
        admit = UCBadmit$admit,
        apps = UCBadmit$applications,
        male = ifelse( UCBadmit$applicant.gender=="male" , 1 , 0 ),
        dept_id = coerce_index(UCBadmit$dept)
    )
)

rmcelreath/rethinking documentation built on Aug. 26, 2024, 5:54 p.m.