mzipBvs: The function to perform variable selection for conditional...
In mBvs: Bayesian Variable Selection Methods for Multivariate Data

View source: R/mzipBvs.R

mzipBvs

R Documentation

The function to perform variable selection for conditional multivariate zero-inflated Poisson models

Description

The function can be used to perform variable selection for conditional multivariate zero-inflated Poisson models.

Usage

mzipBvs(Y, lin.pred, data, model = "generalized", offset = NULL, hyperParams, startValues,
 mcmcParams)

Arguments

`Y`	a data.frame containing `q` count outcomes from `n` subjects. It is of dimension `n\times q`.
`lin.pred`	a list containing three formula objects: the first formula specifies the `p_z` covariates for which variable selection is to be performed in the binary component of the model; the second formula specifies the `p_x` covariates for which variable selection is to be performed in the count part of the model; the third formula specifies the `p_0` confounders to be adjusted for (but on which variable selection is not to be performed) in the regression analysis.
`data`	a data.frame containing the variables named in the formulas in `lin.pred`.
`model`	a character that specifies the type of model: A generalized multivariate Bayesian variable selection method of Lee et al.(2018) can be implemented by setting `model="generalized"`. A simpler model that assumes one common variable selection indicator (`\gamma_{j,k}=\delta_{j,k}`) and the same covariance pattern (`R=R_V`) for two model parts can be used by setting `model="restricted1"`. iii) Another simpler model that assumes the same covariance pattern (`R=R_V`) but separate variable selection indicators for the binary and count parts of the model can be implemented by setting `model="restricted2"`.
`offset`	an optional numeric vector with an a priori known component to be included as the linear predictor in the count part of model.
`hyperParams`	a list containing lists or vectors for hyperparameter values in hierarchical models. Components include, `rho0` (degrees of freedom for inverse-Wishart prior for `\Sigma_V`), `Psi0` (a scale matrix for inverse-Wishart prior for `\Sigma_V`), `mu_alpha0` (hyperparameter `\mu_{\alpha_0}` in the prior of `\alpha_0`), `mu_alpha` (a numeric vector of length `q` for hyperparameter `\mu_{\alpha}` in the prior of `\alpha`), `mu_beta0` (hyperparameter `\mu_{\beta_0}` in the prior of `\beta_0`), `mu_beta` (a numeric vector of length `q` for hyperparameter `\mu_{\beta}` in the prior of `\beta`), `a_alpha0` (hyperparameter `a_{\alpha_0}` in the prior of `\sigma^2_{\alpha_0}`), `b_alpha0` (hyperparameter `b_{\alpha_0}` in the prior of `\sigma^2_{\alpha_0}`), `a_alpha` (hyperparameter `a_{\alpha}` in the prior of `\sigma^2_{\alpha}`), `b_alpha` (hyperparameter `b_{\alpha}` in the prior of `\sigma^2_{\alpha}`), `a_beta0` (hyperparameter `a_{\beta_0}` in the prior of `\sigma^2_{\beta_0}`), `b_beta0` (hyperparameter `b_{\beta_0}` in the prior of `\sigma^2_{\beta_0}`), `a_beta` (hyperparameter `a_{\beta}` in the prior of `\sigma^2_{\beta}`), `b_beta` (hyperparameter `b_{\beta}` in the prior of `\sigma^2_{\beta}`), `v_beta` (a numeric vector of length `q` for the standard deviation hyperparameter `v_{\beta}` of the regression parameter `\beta` prior), `omega_beta` (a numeric vector of length `p_x-p_0` for the hyperparameter `\omega_{\beta}` in the prior of the variable selection indicator), `v_alpha` (a numeric vector of length `q` for the standard deviation hyperparameter `v_{\alpha}` of the regression parameter `\alpha` prior), `omega_alpha` (a numeric vector of length `p_z-p_0` for the hyperparameter `\omega_{\alpha}` in the prior of the variable selection indicator), See Examples below.
`startValues`	a numeric vector containing starting values for model parameters. See Examples below.
`mcmcParams`	a list containing variables required for MCMC sampling. Components include, `run` (a list containing numeric values for setting the overall run: `numReps`, total number of scans; `thin`, extent of thinning; `burninPerc`, the proportion of burn-in). `tuning` (a list containing numeric values relevant to tuning parameters for specific updates in Metropolis-Hastings algorithm: `beta0.prop.var`, variance of the proposal density for `\beta_0`;`beta.prop.var`, variance of the proposal density for `B`;`alpha.prop.var`, variance of the proposal density for `A`;`V.prop.var`, variance of the proposal density for `V`.) See Examples below.

Value

mzipBvs returns an object of class mzipBvs.

Author(s)

Kyu Ha Lee, Brent A. Coull, Jacqueline R. Starr
Maintainer: Kyu Ha Lee <klee15239@gmail.com>

References

Lee, K. H., Coull, B. A., Moscicki, A.-B., Paster, B. J., Starr, J. R. (2020), Bayesian variable selection for multivariate zero-inflated models: application to microbiome count data, Biostatistics, Volume 21, Issue 3, Pages 499-517.

Examples


## Not run: 
# loading a data set
data(simData_mzip)
Y <- simData_mzip$Y
data <- simData_mzip$X

n = dim(Y)[1]
q = dim(Y)[2]

form.bin     <- as.formula(~cov.1)
form.count    <- as.formula(~cov.1)
form.adj    <- as.formula(~1)
lin.pred <- list(form.bin, form.count, form.adj)

Xmat0 <- model.frame(lin.pred[[1]], data=data)
Xmat1 <- model.frame(lin.pred[[2]], data=data)
Xmat_adj <- model.frame(lin.pred[[3]], data=data)

p_adj = ncol(Xmat_adj)
p0 <- ncol(Xmat0) + p_adj
p1 <- ncol(Xmat1) + p_adj

nonz <- rep(NA, q)
for(j in 1:q) nonz[j] <- sum(Y[,j] != 0)

#####################
## Hyperparameters ##

## Generalized model
##
rho0     <- q + 3 + 1
Psi0    <- diag(3, q)

mu_alpha0     <- 0
mu_alpha    <- rep(0, q)

mu_beta0    <- 0
mu_beta        <- rep(0, q)

a_alpha0    <- 0.7
b_alpha0     <- 0.7

a_alpha        <- rep(0.7, p0)
b_alpha     <- rep(0.7, p0)

a_beta0        <- 0.7
b_beta0     <- 0.7

a_beta        <- rep(0.7, p1)
b_beta         <- rep(0.7, p1)

v_beta = rep(1, q)
omega_beta = rep(0.1, p1-p_adj)
v_alpha = rep(1, q)
omega_alpha = rep(0.1, p0-p_adj)

##
hyperParams.gen <- list(rho0=rho0, Psi0=Psi0, mu_alpha0=mu_alpha0, mu_alpha=mu_alpha,
mu_beta0=mu_beta0, mu_beta=mu_beta, a_alpha0=a_alpha0, b_alpha0=b_alpha0,
a_alpha=a_alpha, b_alpha=b_alpha, a_beta0=a_beta0, b_beta0=b_beta0,
a_beta=a_beta, b_beta=b_beta, v_beta=v_beta, omega_beta=omega_beta,
v_alpha=v_alpha, omega_alpha=omega_alpha)

###################
## MCMC SETTINGS ##

## Setting for the overall run
##
numReps    <- 100
thin       <- 1
burninPerc <- 0.5

## Settings for storage
##
storeV      <-    TRUE
storeW      <-    TRUE

## Tuning parameters for specific updates
##
##  - Generalized model
beta0.prop.var    <- 0.5
alpha.prop.var    <- 0.5
beta.prop.var    <- 0.5
V.prop.var    <- 0.05

##
mcmc.gen <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
storage=list(storeV=storeV, storeW=storeW),
tuning=list(beta0.prop.var=beta0.prop.var, alpha.prop.var=alpha.prop.var,
beta.prop.var=beta.prop.var, V.prop.var=V.prop.var))

#####################
## Starting Values ##

## Generalized model
##
B <- matrix(0.1, p1, q, byrow = T)
A <- matrix(0.1, p0, q, byrow = T)

V <- matrix(rnorm(n*q, 0, 0.1), n, q)
W <- matrix(rnorm(n*q, 0, 0.1), n, q)

beta0 <- log(as.vector(apply(Y, 2, mean)))
alpha0 <- log(nonz/n / ((n-nonz)/n))

Sigma_V    <- matrix(0, q, q)
diag(Sigma_V) <- 1

R        <- matrix(0, q, q)
diag(R) <- 1

sigSq_alpha0 <- 1
sigSq_alpha <- rep(1, p0)
sigSq_beta0 <- 1
sigSq_beta <- rep(1, p1)

startValues.gen <- list(B=B, A=A, V=V, W=W, beta0=beta0, alpha0=alpha0, R=R,
sigSq_alpha0=sigSq_alpha0,
sigSq_alpha=sigSq_alpha, sigSq_beta0=sigSq_beta0, sigSq_beta=sigSq_beta, Sigma_V=Sigma_V)


###################################
## Fitting the generalized model ##
###################################
fit.gen <- mzipBvs(Y, lin.pred, data, model="generalized", offset=NULL, hyperParams.gen,
startValues.gen, mcmc.gen)

print(fit.gen)
summ.fit.gen <- summary(fit.gen); names(summ.fit.gen)
summ.fit.gen


## End(Not run)

mBvs documentation built on May 29, 2024, 3:31 a.m.