ed.stan: Empirical determinacy for Stan
In hunansona/ed4bhm: Empirical Determinacy for Bayesian Hierarchical Models

ed.stan

R Documentation

Empirical determinacy for Stan

Description

Generic function. Provides empirical determinacy estimates TED, EDL, EDS, pEDL, and pEDS from a Stan model. This function takes the Stan fit object from the base model as input and returns the identification values, descritive.matrix and delta.

Usage

ed.stan(stan.object.base,
                    prec.name = NULL, 
                    delta = 0.01,
                    distance = "H2ALL")

Arguments

`stan.object.base`	Stan fit object from the base model with a column called "ll", which contains the log-likelihood evaluated at MCMC sample of parameters.
`prec.name`	NULL or character, specifies the names of parameters (in JAGS and INLA precisions but in Stan standard deviations) to be log-transformed
`delta`	numeric, the numerical differentiation step and w = 1 \pm δ is the weight, default value is 0.01.
`distance`	character string, specifies the type of the empirical determinacy measure. It can have values "H2" for TED, "BCL" for EDL and "BCS" for EDS and "H2ALL" for all the five measures (TED, EDL, EDS, pEDL and pEDS). The default value is "H2ALL".

Value

`H2ALL measures`	a matrix of identification values
`descriptive matrix`	a list of descriptive matrix and δ

Author(s)

Malgorzata Roos, Georgios Kazantzidis, Sona Hunanyan

References

Hunanyan, S. and Kazantzidis, G. and Rue, H. and Roos, M (2021). Empirical determinacy in the Bayesian logistic regression in the presence of separation in the data. (in preparation).

Examples

############################################################
##########---Bayesian logisic regression in Stan---#########
data("bacterial_resistance")
df_ipm <- data.frame(cbind(bacterial_resistance$scIZD, 
                            bacterial_resistance$nipmesbl, 
                            bacterial_resistance$nipmwt))
colnames(df_ipm) <- c("scIZD", "nipmesbl", "nipmwt")

###################---Stan with binomial data---------########
params <- c("alpha","beta", "ll_gq")

stan_data <- list(N=nrow(df_ipm), y =  df_ipm$nipmesbl, 
                             total = df_ipm$nipmesbl+df_ipm$nipmwt,
                             x1 = as.numeric(df_ipm$scIZD))


stan_inits <- list(list(alpha = -1.498195, beta = -0.3731706),
                   list(alpha = -1.398195, beta = -0.2731706),
                   list(alpha = -1.460195, beta = -0.3351706),
                   list(alpha = -1.436195, beta = -0.3111706))
library(rstan)

## Content of the Stan model
modelString = " // open quote for modelString
data {
int<lower=0> N;             // number of observations
int<lower=0> y[N];  // setting the dependent variable y as binomial
vector[N] x1;                // independent variable x1
int<lower=0> total[N]; // total observation for each x1
}

parameters {
real alpha; // logistic regression intercept
real beta; // logistic regression slope
}

model {
y ~ binomial_logit(total, alpha + beta*x1); 
// model for the logistic regression with a an aggregated binomial outcome
alpha ~ normal(0, 10);
beta ~ normal(0, 2.5);
}
generated quantities {
real ll_gq;
vector[N] log_lik;

for (n in 1:N) log_lik[n] = binomial_logit_lpmf(y[n] | total[n], alpha + 
                              x1[n] * beta );
ll_gq = sum(log_lik);
}
" # close quote for modelString

# Translate model to C++ and compile to DSO (dynamic shared object):
model_stanDSO <- stan_model(model_code = modelString)


stan_ipm <- sampling(
  object = model_stanDSO,
  data = stan_data,
  pars = params,
  chains = 4,
  iter = 6*(10^3),
  warmup = 2*(10^3),
  thin = 100,
  seed = 1234,
  init = stan_inits)

fit.rstan_ipm_m <- as.matrix(stan_ipm)

# we put the MCMC sample and the log-likelihood computed 
# by hand together in a matrix
fit.rstan_ipm_m_ll <- as.matrix(cbind(fit.rstan_ipm_m[,c("alpha", 
                                        "beta", "ll_gq")])) 

# we change the name of the calumn containing the log-likelihood 
# computed by generated quantities in Stan by ll
colnames(fit.rstan_ipm_m_ll) <- c("alpha", "beta", "ll") 

#the median and the 95% CrIs of the posterior parameters
descriptives95_stan_ipm <- extract_median_95CrI_descriptives_stan(
                              fit.rstan_ipm_m_ll, 
                              w = 1, prec.name = NULL)

#empirical determinacy estimates
ed_stan_ipm <- ed.stan(stan.object.base = fit.rstan_ipm_m_ll, 
                        distance = "H2ALL", delta = 0.01, prec.name=NULL)

#Monte Carlo standard error for empirical determinacy estimates
mce_ed_stan_ipm <- mce.ed(object = fit.rstan_ipm_m_ll, 
                                  B = 200, distance = "H2ALL", 
                                  delta = 0.01, prec.name = NULL)

hunansona/ed4bhm documentation built on June 15, 2022, 6:42 p.m.