mzipBvs: The function to perform variable selection for multivariate...

View source: R/mzipBvs.R

mzipBvsR Documentation

The function to perform variable selection for multivariate zero-inflated count responses

Description

The function can be used to perform variable selection for multivariate zero-inflated count responses.

Usage

mzipBvs(Y, lin.pred, data, model = "generalized", offset = NULL, hyperParams, startValues,
 mcmcParams)

Arguments

Y

a data.frame containing q count outcomes from n subjects. It is of dimension n\times q.

lin.pred

a list containing three formula objects: the first formula specifies the p_z covariates for which variable selection is to be performed in the binary component of the model; the second formula specifies the p_x covariates for which variable selection is to be performed in the count part of the model; the third formula specifies the p_0 confounders to be adjusted for (but on which variable selection is not to be performed) in the regression analysis.

data

a data.frame containing the variables named in the formulas in lin.pred.

model

a character that specifies the type of model: A generalized multivariate Bayesian variable selection method of Lee et al.(2018) can be implemented by setting model="generalized". A simpler model that assumes one common variable selection indicator (γ_{j,k}=δ_{j,k}) and the same covariance pattern (R=R_V) for two model parts can be used by setting model="restricted1". iii) Another simpler model that assumes the same covariance pattern (R=R_V) but separate variable selection indicators for the binary and count parts of the model can be implemented by setting model="restricted2".

offset

an optional numeric vector with an a priori known component to be included as the linear predictor in the count part of model.

hyperParams

a list containing lists or vectors for hyperparameter values in hierarchical models. Components include, rho0 (degrees of freedom for inverse-Wishart prior for Σ_V), Psi0 (a scale matrix for inverse-Wishart prior for Σ_V), mu_alpha0 (hyperparameter μ_{α_0} in the prior of α_0), mu_alpha (a numeric vector of length q for hyperparameter μ_{α} in the prior of α), mu_beta0 (hyperparameter μ_{β_0} in the prior of β_0), mu_beta (a numeric vector of length q for hyperparameter μ_{β} in the prior of β), a_alpha0 (hyperparameter a_{α_0} in the prior of σ^2_{α_0}), b_alpha0 (hyperparameter b_{α_0} in the prior of σ^2_{α_0}), a_alpha (hyperparameter a_{α} in the prior of σ^2_{α}), b_alpha (hyperparameter b_{α} in the prior of σ^2_{α}), a_beta0 (hyperparameter a_{β_0} in the prior of σ^2_{β_0}), b_beta0 (hyperparameter b_{β_0} in the prior of σ^2_{β_0}), a_beta (hyperparameter a_{β} in the prior of σ^2_{β}), b_beta (hyperparameter b_{β} in the prior of σ^2_{β}), v_beta (a numeric vector of length q for the standard deviation hyperparameter v_{β} of the regression parameter β prior), omega_beta (a numeric vector of length p_x-p_0 for the hyperparameter ω_{β} in the prior of the variable selection indicator), v_alpha (a numeric vector of length q for the standard deviation hyperparameter v_{α} of the regression parameter α prior), omega_alpha (a numeric vector of length p_z-p_0 for the hyperparameter ω_{α} in the prior of the variable selection indicator), See Examples below.

startValues

a numeric vector containing starting values for model parameters. See Examples below.

mcmcParams

a list containing variables required for MCMC sampling. Components include, run (a list containing numeric values for setting the overall run: numReps, total number of scans; thin, extent of thinning; burninPerc, the proportion of burn-in). tuning (a list containing numeric values relevant to tuning parameters for specific updates in Metropolis-Hastings algorithm: beta0.prop.var, variance of the proposal density for β_0;beta.prop.var, variance of the proposal density for B;alpha.prop.var, variance of the proposal density for A;V.prop.var, variance of the proposal density for V.) See Examples below.

Value

mzipBvs returns an object of class mzipBvs.

Author(s)

Kyu Ha Lee, Brent A. Coull, Jacqueline R. Starr
Maintainer: Kyu Ha Lee <klee15239@gmail.com>

References

Lee, K. H., Coull, B. A., Moscicki, A.-B., Paster, B. J., Starr, J. R. (2020), Bayesian variable selection for multivariate zero-inflated models: application to microbiome count data, Biostatistics, Volume 21, Issue 3, Pages 499-517.

Examples


## Not run: 
# loading a data set
data(simData_mzip)
Y <- simData_mzip$Y
data <- simData_mzip$X

n = dim(Y)[1]
q = dim(Y)[2]

form.bin     <- as.formula(~cov.1)
form.count    <- as.formula(~cov.1)
form.adj    <- as.formula(~1)
lin.pred <- list(form.bin, form.count, form.adj)

Xmat0 <- model.frame(lin.pred[[1]], data=data)
Xmat1 <- model.frame(lin.pred[[2]], data=data)
Xmat_adj <- model.frame(lin.pred[[3]], data=data)

p_adj = ncol(Xmat_adj)
p0 <- ncol(Xmat0) + p_adj
p1 <- ncol(Xmat1) + p_adj

nonz <- rep(NA, q)
for(j in 1:q) nonz[j] <- sum(Y[,j] != 0)

#####################
## Hyperparameters ##

## Generalized model
##
rho0     <- q + 3 + 1
Psi0    <- diag(3, q)

mu_alpha0     <- 0
mu_alpha    <- rep(0, q)

mu_beta0    <- 0
mu_beta        <- rep(0, q)

a_alpha0    <- 0.7
b_alpha0     <- 0.7

a_alpha        <- rep(0.7, p0)
b_alpha     <- rep(0.7, p0)

a_beta0        <- 0.7
b_beta0     <- 0.7

a_beta        <- rep(0.7, p1)
b_beta         <- rep(0.7, p1)

v_beta = rep(1, q)
omega_beta = rep(0.1, p1-p_adj)
v_alpha = rep(1, q)
omega_alpha = rep(0.1, p0-p_adj)

##
hyperParams.gen <- list(rho0=rho0, Psi0=Psi0, mu_alpha0=mu_alpha0, mu_alpha=mu_alpha,
mu_beta0=mu_beta0, mu_beta=mu_beta, a_alpha0=a_alpha0, b_alpha0=b_alpha0,
a_alpha=a_alpha, b_alpha=b_alpha, a_beta0=a_beta0, b_beta0=b_beta0,
a_beta=a_beta, b_beta=b_beta, v_beta=v_beta, omega_beta=omega_beta,
v_alpha=v_alpha, omega_alpha=omega_alpha)

###################
## MCMC SETTINGS ##

## Setting for the overall run
##
numReps    <- 100
thin       <- 1
burninPerc <- 0.5

## Settings for storage
##
storeV      <-    TRUE
storeW      <-    TRUE

## Tuning parameters for specific updates
##
##  - Generalized model
beta0.prop.var    <- 0.5
alpha.prop.var    <- 0.5
beta.prop.var    <- 0.5
V.prop.var    <- 0.05

##
mcmc.gen <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
storage=list(storeV=storeV, storeW=storeW),
tuning=list(beta0.prop.var=beta0.prop.var, alpha.prop.var=alpha.prop.var,
beta.prop.var=beta.prop.var, V.prop.var=V.prop.var))

#####################
## Starting Values ##

## Generalized model
##
B <- matrix(0.1, p1, q, byrow = T)
A <- matrix(0.1, p0, q, byrow = T)

V <- matrix(rnorm(n*q, 0, 0.1), n, q)
W <- matrix(rnorm(n*q, 0, 0.1), n, q)

beta0 <- log(as.vector(apply(Y, 2, mean)))
alpha0 <- log(nonz/n / ((n-nonz)/n))

Sigma_V    <- matrix(0, q, q)
diag(Sigma_V) <- 1

R        <- matrix(0, q, q)
diag(R) <- 1

sigSq_alpha0 <- 1
sigSq_alpha <- rep(1, p0)
sigSq_beta0 <- 1
sigSq_beta <- rep(1, p1)

startValues.gen <- list(B=B, A=A, V=V, W=W, beta0=beta0, alpha0=alpha0, R=R,
sigSq_alpha0=sigSq_alpha0,
sigSq_alpha=sigSq_alpha, sigSq_beta0=sigSq_beta0, sigSq_beta=sigSq_beta, Sigma_V=Sigma_V)


###################################
## Fitting the generalized model ##
###################################
fit.gen <- mzipBvs(Y, lin.pred, data, model="generalized", offset=NULL, hyperParams.gen,
startValues.gen, mcmc.gen)

print(fit.gen)
summ.fit.gen <- summary(fit.gen); names(summ.fit.gen)
summ.fit.gen


## End(Not run)

mBvs documentation built on Feb. 16, 2023, 7:11 p.m.