walker_glm: Bayesian generalized linear model with time-varying...

View source: R/walker.R

walker_glmR Documentation

Bayesian generalized linear model with time-varying coefficients

Description

Function walker_glm is a generalization of walker for non-Gaussian models. Compared to walker, the returned samples are based on Gaussian approximation, which can then be used for exact-approximate analysis by weighting the sample properly. These weights are also returned as a part of the stanfit (they are generated in the generated quantities block of Stan model). Note that plotting functions pp_check, plot_coefs, and plot_predict resample the posterior based on weights before plotting, leading to "exact" analysis.

Usage

walker_glm(
  formula,
  data,
  beta,
  init,
  chains,
  return_x_reg = FALSE,
  distribution,
  initial_mode = "kfas",
  u,
  mc_sim = 50,
  return_data = TRUE,
  ...
)

Arguments

formula

An object of class {formula} with additional terms rw1 and/or rw2 e.g. y ~ x1 + rw1(~ -1 + x2). See details.

data

An optional data.frame or object coercible to such, as in {lm}.

beta

A length vector of length two which defines the prior mean and standard deviation of the Gaussian prior for time-invariant coefficients

init

Initial value specification, see rstan::sampling(). Note that compared to default in rstan, here the default is a to sample from the priors.

chains

Number of Markov chains. Default is 4.

return_x_reg

If TRUE, does not perform sampling, but instead returns the matrix of predictors after processing the formula.

distribution

Either "poisson" or "binomial".

initial_mode

The initial guess of the fitted values on log-scale. Defines the Gaussian approximation used in the MCMC. Either "obs" (corresponds to log(y+0.1) in Poisson case), "glm" (mode is obtained from time-invariant GLM), "mle" (default; mode is obtained from maximum likelihood estimate of the model), or numeric vector (custom guess).

u

For Poisson model, a vector of exposures i.e. E(y) = u*exp(x*beta). For binomial, a vector containing the number of trials. Defaults 1.

mc_sim

Number of samples used in importance sampling. Default is 50.

return_data

if TRUE, returns data input to rstan::sampling(). This is needed for lfo.

...

Further arguments to rstan::sampling().

Details

The underlying idea of walker_glm is based on Vihola, Helske, Franks (2020).

walker_glm uses the global approximation (i.e. start of the MCMC) instead of more accurate but slower local approximation (where model is approximated at each iteration). However for these restricted models global approximation should be sufficient, assuming the the initial estimate of the conditional mode of p(xbeta | y, sigma) not too far away from the true posterior. Therefore by default walker_glm first finds the maximum likelihood estimates of the standard deviation parameters (using KFAS::KFAS()) package, and constructs the approximation at that point, before running the Bayesian analysis.

Value

A list containing the stanfit object, observations y, covariates xreg_fixed, and xreg_rw.

References

Vihola, M, Helske, J, Franks, J. (2020). Importance sampling type estimators based on approximate marginal Markov chain Monte Carlo. Scandinavian Journal of Statistics. 47: 1339–1376. \Sexpr[results=rd]{tools:::Rd_expr_doi("doi:10.1111/sjos.12492")}

See Also

Package diagis in CRAN, which provides functions for computing weighted summary statistics.

Examples


set.seed(1)
n <- 25
x <- rnorm(n, 1, 1)
beta <- cumsum(c(1, rnorm(n - 1, sd = 0.1)))

level <- -1
u <- sample(1:10, size = n, replace = TRUE)
y <- rpois(n, u * exp(level + beta * x))
ts.plot(y)

# note very small number of iterations for the CRAN checks!
out <- walker_glm(y ~ -1 + rw1(~ x, beta = c(0, 10), 
  sigma = c(2, 10)), distribution = "poisson", 
  iter = 200, chains = 1, refresh = 0)
print(out$stanfit, pars = "sigma_rw1") ## approximate results
if (require("diagis")) {
  weighted_mean(extract(out$stanfit, pars = "sigma_rw1")$sigma_rw1, 
    extract(out$stanfit, pars = "weights")$weights)
}
plot_coefs(out)
pp_check(out)
             
## Not run: 
data("discoveries", package = "datasets")
out <- walker_glm(discoveries ~ -1 + 
  rw2(~ 1, beta = c(0, 10), sigma = c(2, 10), nu = c(0, 2)), 
  distribution = "poisson", iter = 2000, chains = 1, refresh = 0)

plot_fit(out)

# Dummy covariate example

fit <- walker_glm(VanKilled ~ -1 + 
  rw1(~ law, beta = c(0, 1), sigma = c(2, 10)), dist = "poisson", 
   data = as.data.frame(Seatbelts), chains = 1, refresh = 0)

# compute effect * law
d <- coef(fit, transform = function(x) {
  x[, 2, 1:170] <- 0
  x
})

require("ggplot2")
d %>% ggplot(aes(time, mean)) +
  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`), fill = "grey90") +
  geom_line() + facet_wrap(~ beta, scales = "free") + theme_bw()

## End(Not run)


walker documentation built on Sept. 11, 2024, 8:33 p.m.