pva: Run a Collapsed Variational Approximation to find the K best linear models

View source: R/RcppExports.R

pva    R Documentation

Run a Collapsed Variational Approximation to find the K best linear models

Description

Run a Collapsed Variational Approximation to find the K best linear models

Usage

pva(
  vy_in,
  mX_in,
  mGamma_in,
  prior,
  modelprior,
  modelpriorvec_in = NULL,
  bUnique = TRUE,
  lambda = 1,
  cores = 1L
)

Arguments

vy_in

Vector of responses

mX_in

The matrix of covariates which may or may not be included in each model

mGamma_in

Matrix of initial models, a K by p logical matrix

prior

The choice of mixture g-prior used to perform Bayesian model averaging; a short usage sketch follows the list below. The available choices are:

  • "BIC" – the Bayesian information criterion, obtained by using the cake prior of Ormerod et al. (2017).

  • "ZE" – a special case of the prior structure described by Maruyama and George (2011).

  • "liang_g1" – the mixture g-prior of Liang et al. (2008) with prior hyperparameter a = 3, evaluated directly using Equation (10) of Greenaway and Ormerod (2018), where the Gaussian hypergeometric function is evaluated using the gsl library. Note: this option can lead to numerical problems and is only meant to be used for comparative purposes.

  • "liang_g2" – the mixture g-prior of Liang et al. (2008) with prior hyperparameter a = 3, evaluated directly using Equation (11) of Greenaway and Ormerod (2018).

  • "liang_g_n_appell" – the mixture g/n-prior of Liang et al. (2008) with prior hyperparameter a = 3, evaluated using the appell R package.

  • "liang_g_n_approx" – the mixture g/n-prior of Liang et al. (2008) with prior hyperparameter a = 3, using the approximation in Equation (15) of Greenaway and Ormerod (2018) for models with more than two covariates, and numerical quadrature (see below) for models with one or two covariates.

  • "liang_g_n_quad" – the mixture g/n-prior of Liang et al. (2008) with prior hyperparameter a = 3, evaluated using a composite trapezoid rule.

  • "robust_bayarri1" – the robust prior of Bayarri et al. (2012) with default prior hyperparameter choices, evaluated directly using Equation (18) of Greenaway and Ormerod (2018) with the gsl library.

  • "robust_bayarri2" – the robust prior of Bayarri et al. (2012) with default prior hyperparameter choices, evaluated directly using Equation (19) of Greenaway and Ormerod (2018).
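
A minimal usage sketch, not part of the package documentation, showing how a prior string is passed to pva(); the synthetic data below (n, p, K, vy, mX, mGamma) are assumptions used only for illustration.

library(blma)

set.seed(1)
n <- 50; p <- 5; K <- 20
mX <- matrix(rnorm(n * p), n, p)                       # synthetic covariates
vy <- as.vector(mX %*% c(1, -1, 0, 0, 0) + rnorm(n))   # synthetic response
mGamma <- matrix(rbinom(K * p, 1, 0.5), K, p)          # random initial population of models

# For example, the robust prior of Bayarri et al. (2012):
fit <- pva(vy, mX, mGamma, prior = "robust_bayarri2", modelprior = "uniform")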

modelprior

The model prior to use: one of "uniform", "beta-binomial" or "bernoulli". The choice of model prior dictates the meaning of the modelpriorvec_in parameter.

modelpriorvec_in

If modelprior is "uniform", then modelpriorvec_in is ignored and can be NULL.

If modelprior is "beta-binomial", then modelpriorvec_in should be of length 2, with the first element containing the alpha hyperparameter of the beta prior and the second element containing the beta hyperparameter of the beta prior.

If modelprior is "bernoulli", then modelpriorvec_in must have the same length as the number of columns of mX_in. Element i of modelpriorvec_in contains the prior probability that the ith covariate is included in the model.
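
A brief sketch, again not from the package documentation, of the three model prior settings, reusing vy, mX and mGamma from the sketch under the prior argument above; the hyperparameter values are illustrative assumptions only.

# Uniform prior over models: modelpriorvec_in is ignored.
fit_unif <- pva(vy, mX, mGamma, prior = "BIC", modelprior = "uniform")

# Beta-binomial prior with (alpha, beta) = (1, ncol(mX)), which typically favours sparser models.
fit_bb <- pva(vy, mX, mGamma, prior = "BIC", modelprior = "beta-binomial",
              modelpriorvec_in = c(1, ncol(mX)))

# Bernoulli prior: one prior inclusion probability per covariate.
fit_bern <- pva(vy, mX, mGamma, prior = "BIC", modelprior = "bernoulli",
                modelpriorvec_in = rep(0.2, ncol(mX)))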

bUnique

Whether or not to ensure uniqueness in the population of particles. Defaults to TRUE.

lambda

The weighting factor for the entropy in f_lambda. Defaults to 1.

cores

The number of cores to use. Defaults to 1.

Value

The object returned is a list containing:

  • "mGamma" – A K by p binary matrix containing the final population of models.

  • "vgamma.hat" – The most probable model found by pva.

  • "vlogBF" – The null-based log Bayes factor for each model in the population.

  • "posterior_model_probabilities" – The estimated posterior model probability for each model in the population.

  • "posterior_inclusion_probabilities" – The estimated posterior inclusion probability for each covariate.

  • "vR2" – The fitted R-squared value for each model in the population.

  • "vp" – The model size (number of included covariates) for each model in the population.
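
A short sketch, not from the package documentation, of how these components might be inspected; it assumes fit is a list returned by pva() (as in the sketches under the Arguments section) and that vgamma.hat is a length-p inclusion indicator.

fit$vgamma.hat                                    # most probable model, as an inclusion indicator
fit$posterior_inclusion_probabilities             # one estimate per covariate
head(fit$posterior_model_probabilities)           # posterior weight of each model in the population
order(fit$posterior_model_probabilities, decreasing = TRUE)[1:5]  # indices of the five highest-weight models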

References

Bayarri, M. J., Berger, J. O., Forte, A., Garcia-Donato, G., 2012. Criteria for Bayesian model choice with application to variable selection. Annals of Statistics 40 (3), 1550-1577.

Greenaway, M. J., Ormerod, J. T., 2018. Numerical aspects of Bayesian linear model averaging using mixture g-priors.

Liang, F., Paulo, R., Molina, G., Clyde, M. A., Berger, J. O., 2008. Mixtures of g priors for Bayesian variable selection. Journal of the American Statistical Association 103 (481), 410-423.

Maruyama, Y., George, E. I., 2011. Fully Bayes factors with a generalized g-prior. Annals of Statistics 39 (5), 2740-2765.

Ormerod, J. T., Stewart, M., Yu, W., Romanes, S. E., 2017. Bayesian hypothesis tests with diffuse priors: Can we have our cake and eat it too?

Examples

# Prepare the UScrime data: log-transform all columns except column 2
# (a binary indicator) and the response, then standardise each column.
mD <- MASS::UScrime
notlog <- c(2, ncol(MASS::UScrime))
mD[, -notlog] <- log(mD[, -notlog])

for (j in 1:ncol(mD)) {
  mD[, j] <- (mD[, j] - mean(mD[, j])) / sd(mD[, j])
}

varnames <- c(
  "log(AGE)",
  "S",
  "log(ED)",
  "log(Ex0)",
  "log(Ex1)",
  "log(LF)",
  "log(M)",
  "log(N)",
  "log(NW)",
  "log(U1)",
  "log(U2)",
  "log(W)",
  "log(X)",
  "log(prison)",
  "log(time)")

# Response vector and covariate matrix.
y.t <- mD$y
X.f <- data.matrix(mD[1:15])
colnames(X.f) <- varnames

# Initial population of K models, drawn uniformly at random.
K <- 100
p <- ncol(X.f)
initial_gamma <- matrix(rbinom(K * p, 1, 0.5), K, p)

pva_result <- pva(y.t, X.f, initial_gamma, prior = "BIC", modelprior = "uniform",
                  modelpriorvec_in = NULL)
