STAR_gprior_gibbs_da | R Documentation |
Compute MCMC samples from the posterior and predictive
distributions of a STAR linear regression model with a g-prior.
The Monte Carlo sampler STAR_gprior is preferred unless n is large (> 500).
STAR_gprior_gibbs_da(
  y,
  X,
  X_test = X,
  transformation = "np",
  y_max = Inf,
  psi = length(y),
  approx_Fz = FALSE,
  approx_Fy = FALSE,
  nsave = 1000,
  nburn = 1000,
  nskip = 0,
  verbose = TRUE
)
y |
|
X |
|
X_test |
|
transformation |
transformation to use for the latent data; must be one of
|
y_max |
a fixed and known upper bound for all observations; default is Inf |
psi |
prior variance (g-prior) |
approx_Fz |
logical; in BNP transformation, apply a (fast and stable) normal approximation for the marginal CDF of the latent data |
approx_Fy |
logical; in BNP transformation, approximate
the marginal CDF of |
nsave |
number of MCMC iterations to save |
nburn |
number of MCMC iterations to discard |
nskip |
number of MCMC iterations to skip between saving iterations, i.e., save every (nskip + 1)th draw |
verbose |
logical; if TRUE, print time remaining |
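The interplay of nsave, nburn, and nskip can be illustrated with a little index arithmetic. The sketch below assumes the sampler runs nburn + (nskip + 1) * nsave iterations in total and keeps the last draw of each block of (nskip + 1) after burn-in; the exact indexing inside the function is not documented here, so treat this as one consistent convention rather than the package's implementation.

```r
# Hypothetical settings: keep every 5th draw after burn-in
nsave <- 1000
nburn <- 1000
nskip <- 4

# Assumed total number of MCMC iterations
n_total <- nburn + (nskip + 1) * nsave

# Assumed indices of the iterations that are saved
saved <- seq(from = nburn + nskip + 1, to = n_total, by = nskip + 1)

n_total        # total iterations run under this convention
length(saved)  # number of saved draws; equals nsave
```

Under this convention, increasing nskip lengthens the run without changing the number of stored draws.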
STAR defines a count-valued probability model by (1) specifying a Gaussian model for continuous *latent* data and (2) connecting the latent data to the observed data via a *transformation and rounding* operation. Here, the continuous latent data model is a linear regression.
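The two-step construction can be sketched in base R. This is a minimal illustration, not the package's simulator: the coefficients, the latent standard deviation, the choice of a log transformation, and the use of floor() as the rounding operator are all assumptions made for the example.

```r
set.seed(1)
n <- 200
p <- 3

# Design matrix with an intercept column (hypothetical predictors)
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))
beta <- c(1, 0.5, -0.25)  # hypothetical regression coefficients
sigma <- 0.5              # hypothetical latent standard deviation

# (1) Gaussian linear regression for the continuous latent data
z <- as.numeric(X %*% beta) + sigma * rnorm(n)

# (2) Transformation and rounding: with g = log, the inverse is exp,
#     and floor() maps the back-transformed value to a count
y <- floor(exp(z))

table(y)
```

Every latent value below log(1) = 0 maps to the count 0, so the rounding step is what produces zeros and discreteness in y.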
There are several options for the transformation. First, the transformation
can belong to the *Box-Cox* family, which includes the known transformations
'identity', 'log', and 'sqrt'. Second, the transformation
can be estimated (before model fitting) using the empirical distribution of the
data y. Options in this case include the empirical cumulative
distribution function (CDF), which is fully nonparametric ('np'), or the parametric
alternatives based on Poisson ('pois') or Negative-Binomial ('neg-bin')
distributions. For the parametric distributions, the parameters of the distribution
are estimated using moments (means and variances) of y. The distribution-based
transformations approximately preserve the mean and variance of the count data y
on the latent data scale, which lends interpretability to the model parameters.
Lastly, the transformation can be modeled using the Bayesian bootstrap ('bnp'),
which is a Bayesian nonparametric model and incorporates the uncertainty
about the transformation into posterior and predictive inference.
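To see why 'identity', 'log', and 'sqrt' sit in one family, the standard Box-Cox form can be written out directly. The helper box_cox below is hypothetical and uses the textbook parameterization; the package's internal parameterization is not stated here and may differ by shifts or scaling.

```r
# Textbook Box-Cox transformation (hypothetical helper, not a package function)
box_cox <- function(t, lambda) {
  stopifnot(lambda >= 0)
  if (lambda == 0) log(t) else (t^lambda - 1) / lambda
}

y <- c(1, 2, 5, 10)

box_cox(y, 1)    # identity up to a shift: y - 1
box_cox(y, 0.5)  # square root up to shift and scale: 2 * (sqrt(y) - 1)
box_cox(y, 0)    # log, the limit as lambda goes to 0
```

The shifts and scalings do not change the shape of the transformation, which is why the three named options can be treated as members of the same family.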
a list with the following elements:
coefficients: the posterior mean of the regression coefficients
post_beta: nsave x p samples from the posterior distribution of the regression coefficients
post_ytilde: nsave x n0 samples from the posterior predictive distribution at test points X_test
post_g: nsave posterior samples of the transformation evaluated at the unique y values (only applies for 'bnp' transformations)
# Simulate some data:
sim_dat = simulate_nb_lm(n = 500, p = 10)
y = sim_dat$y; X = sim_dat$X
# Fit a linear model:
fit = STAR_gprior_gibbs_da(y, X,
                           transformation = 'np',
                           nsave = 1000, nburn = 1000)
names(fit)
# Check the efficiency of the MCMC samples:
getEffSize(fit$post_beta)
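Once the draws are available, posterior summaries are plain matrix operations on the nsave x p array. The sketch below uses a simulated stand-in for fit$post_beta so that it runs without the package; the stand-in values are made up, and with a real fit you would substitute post_beta = fit$post_beta.

```r
set.seed(1)
nsave <- 1000
p <- 3

# Stand-in for fit$post_beta: nsave x p matrix of draws (simulated, not real output)
post_beta <- matrix(rnorm(nsave * p, mean = rep(c(1, 0, -1), each = nsave)),
                    nrow = nsave, ncol = p)

# Posterior means, one per coefficient
post_mean <- colMeans(post_beta)

# Equal-tailed 95% credible intervals, one row per coefficient
ci <- t(apply(post_beta, 2, quantile, probs = c(0.025, 0.975)))

cbind(mean = post_mean, ci)
```

Since coefficients is described as the posterior mean of the regression coefficients, colMeans(fit$post_beta) would be expected to agree with fit$coefficients.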