The function to perform variable selection for multivariate normal responses

Description

The function can be used to perform variable selection for multivariate normal responses incorporating not only information on the mean model, but also information on the variance-covariance structure of the outcomes. A multivariate prior is specified on the latent binary selection indicators to incorporate the dependence between outcomes into the variable selection procedure.

Usage

1
mvnBvs(Y, lin.pred, data, model = "unstructured", hyperParams, startValues, mcmcParams)

Arguments

Y

a data.frame containing q continuous multivariate outcomes from n subjects. It is of dimension n\times q.

lin.pred

a list containing two formula objects: the first formula specifies the p covariates for which variable selection is to be performed; the second formula specifies the confounders to be adjusted for (but on which variable selection is not to be performed) in the regression analysis.

data

a data.frame containing the variables named in the formulas in lin.pred.

model

a character that specifies the covariance structure of the model: either "unstructured" or "factor-analytic".

hyperParams

a list containing lists or vectors for hyperparameter values in hierarchical models. Components include, eta (a numeric value for the hyperparameter η that regulates the extent to which the correlation between response variables influences the prior of the variable selection indicator), v (a numeric vector of length q for the standard deviation hyperparameter v of the regression parameter β prior), omega (a numeric vector of length p for the hyperparameter ω in the prior of the variable selection indicator), beta0 (a numeric vector of length q+1 for hyperparameter μ_0 and h_0 in the prior of the intercept β_0), US (a list containing numeric vectors for hyperparameters in the unstructured model: US.Sigma), FA (a list containing numeric vectors for hyperparameters in the factor-analytic model: lambda and sigmaSq). See Examples below.

startValues

a numeric vector containing starting values for model parameters: c(beta0, B, gamma, Sigma) for the unstructured model; c(beta0, B, gamma, sigmaSq, lambda) for the factor-analytic model. See Examples below.

mcmcParams

a list containing variables required for MCMC sampling. Components include, run (a list containing numeric values for setting the overall run: numReps, total number of scans; thin, extent of thinning; burninPerc, the proportion of burn-in). tuning (a list containing numeric values relevant to tuning parameters for specific updates in Metropolis-Hastings algorithm: mhProp_beta_var, variance of the proposal density for B; mhrho_prop, degrees of freedom of the inverse-Wishart proposal density for Σ in the unstructured model; mhPsi_prop, scale matrix of inverse-Wishart proposal density for Σ in the unstructured model; mhProp_lambda_var, variance of the proposal density for λ in the factor-analytic model. See Examples below.

Value

mvnBvs returns an object of class mvnBvs.

Author(s)

Kyu Ha Lee, Mahlet G. Tadesse, Brent A. Coull
Maintainer: Kyu Ha Lee <klee@hsph.harvard.edu>

References

Lee, K. H., Tadesse, M. G., Baccarelli, A. A., Schwartz J., and Coull, B. A. (2015), Multivariate Bayesian Variable Selection Exploiting Dependence Structure Among Outcomes: Application to Air Pollution Effects on DNA Methylation, submitted.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
# loading a data set	
data(simData)
Y <- simData$Y
data <- simData$X
form1 <- as.formula( ~ cov.1+cov.2)
form2 <- as.formula( ~ 1)
lin.pred <- list(form1, form2)

p 		<- dim(data)[2]
p_adj 	<- 0
q 		<- dim(Y)[2]

#####################
## Hyperparameters ##

## Common hyperparameters
##
eta = 0.1
v = rep(10, q)
omega = rep(log(0.5/(1-0.5)), p-p_adj)
common.beta0 <- c(rep(0, q), 10^6)

## Unstructured model
##
rho0 	<- q + 4
Psi0	<- diag(3, q)
US.Sigma <- c(rho0, Psi0)

## Factor-analytic model
##
FA.lam		<- c(rep(0, q), 10^6)
FA.sigSq	<- c(2, 1)

##
hyperParams <- list(eta=eta, v=v, omega=omega, beta0=common.beta0,
					US=list(US.Sigma=US.Sigma),
					FA=list(lambda=FA.lam, sigmaSq=FA.sigSq))
                    
###################
## MCMC SETTINGS ##

## Setting for the overall run
##
numReps    <- 100
thin       <- 1
burninPerc <- 0.5

## Tuning parameters for specific updates
##
##  - those common to all models
mhProp_beta_var  <- matrix(0.5, p+p_adj, q)
##
## - those specific to the unstructured model
mhrho_prop <- 1000
mhPsi_prop <- diag(1, q)
##
## - those specific to the factor-analytic model
mhProp_lambda_var  <- 0.5
      
##
mcmc.US <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
                tuning=list(mhProp_beta_var=mhProp_beta_var,
                				mhrho_prop=mhrho_prop, mhPsi_prop=mhPsi_prop))
                
##
mcmc.FA  <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
                tuning=list(mhProp_beta_var=mhProp_beta_var,
                			mhProp_lambda_var=mhProp_lambda_var))
                			
#####################
## Starting Values ##

##  - those common to all models
beta0		<- rep(0, q)
B			<- matrix(sample(x=c(0.3, 0), size=q, replace = TRUE), p+p_adj, q)
gamma		<- B
gamma[gamma != 0]	<- 1
##
## - those specific to the unstructured model
Sigma <- diag(1, q) 
##
## - those specific to the factor-analytic model
lambda		<- rep(0.5, q)
sigmaSq		<- 1
                
####################################
## Fitting the unstructured model ##
####################################

startValues	<- vector("list", 2)                
 
startValues[[1]] <- as.vector(c(beta0, B, gamma, Sigma))

beta0		<- rep(0.2, q)
Sigma <- diag(0.5, q) 

startValues[[2]] <- as.vector(c(beta0, B, gamma, Sigma))

fit.us <- mvnBvs(Y, lin.pred, data, model="unstructured", hyperParams, 
				startValues, mcmcParams=mcmc.US)
				
fit.us
summ.fit.us <- summary(fit.us); names(summ.fit.us)
summ.fit.us				

#######################################
## Fitting the factor-analytic model ##
#######################################

startValues	<- vector("list", 2)                
 
startValues[[1]] <- as.vector(c(beta0, B, gamma, sigmaSq, lambda))

beta0		<- rep(0.2, q)
sigmaSq		<- 0.5
startValues[[2]] <- as.vector(c(beta0, B, gamma, sigmaSq, lambda))

fit.fa <- mvnBvs(Y, lin.pred, data, model="factor-analytic", hyperParams, 
				startValues, mcmcParams=mcmc.FA)
				
fit.fa
summ.fit.fa <- summary(fit.fa); names(summ.fit.fa)
summ.fit.fa