bwqs_2: Fitting Bayesian Weighted Quantile Sum regression models with...

View source: R/main_package_2bwqs.R

bwqs_2R Documentation

Fitting Bayesian Weighted Quantile Sum regression models with two mixture

Description

Fits Bayesian Weighted Quantile Sum (BWQS) regressions for continuous and binomial outcomes with two different mixture. This model allows two cosider the effect of two different mixture. This approach is useful when different mixture can lead opposite effect on the outcome variable.

Usage

bwqs_2(
  formula,
  mix_name_1,
  mix_name_2,
  data,
  q,
  DalpC1 = NULL,
  DalpC2 = NULL,
  chains = 1,
  iter = 10000,
  thin = 3,
  seed = 2019,
  start_value = NULL,
  c_int = c(0.025, 0.975),
  family = "gaussian"
)

Arguments

formula

Object of class formula specifying the relationship between the outcome and the covariates of the model not involved in the mixture variable. If the model has no covariates specify y ~ NULL.

mix_name_1

A character vector listing the variables contributing to first mixture effect.

mix_name_2

A character vector listing the variables contributing to second mixture effect.

data

The data.frame containing the variables (covariates and elements of both of the mixtures) to be included in the model.

q

An integer to specify how mixture variables will be ranked, e.g. in quartiles (q = 4), deciles (q = 10), or percentiles (q = 100). If q = NULL then the values of the mixture variables are taken (these must be standardized).

DalpC1

A vector containing the parameters of the Dirichlet distribution for the weights of the first mixture. For further details see https://en.wikipedia.org/wiki/Dirichlet_distribution.

DalpC2

A vector containing the parameters of the Dirichlet distribution for the weights of the second mixture.

chains

An integer to specify the number of chain in Hamiltonian Monte Carlo algorithm (1 by default).

iter

An integer to specify the lenght of chain in Hamiltonian Monte Carlo algorithm (10000 by default).

thin

An integer to specify the thinning parameter in Hamiltonian Monte Carlo algorithm.

seed

An integer value to fix the seed. If seed = NULL the seed are randomly choosen.

start_value

A vector containing the initial value of the prior distribution, if it is equal to NULL random values are chosen.

c_int

A vector of two elements to specify the credible intervals for parameters, for 95% credible interval c_int = c(0.025,0.975) (default).

family

A string to specify the type of outcome. Possible values are "gaussian" (default) and "binomial".

Details

The function bwqs_2 uses the package rstan which allows the connection with STAN, a specific software, written in C++ for bayesian inference, for further information see https://mc-stan.org/.

Value

bwqs returns a list with two argument:

fit

An S4 object with all details of the Hamiltonian Monte Carlo, all the extractions from the posterior distribution and all values of the parameters

summary_fit

Table with the statistics of the parameters: mean, standard error of the mean, standard deviation, lower and upper values for the credible interval (with credible level specified by c_int), n_eff and Rhat. For further details see https://cran.r-project.org/web/packages/rstan/rstan.pdf

Author(s)

Nicolo Foppa Pedretti, Elena Colicino

Examples

#load libraries
library(MASS)
library(BWQS)
library(Matrix)

# fix the seed
set.seed(456)

# Sample sizes
N <- 800

# Mean & SD of variables
mu <- c(0,0,0,1,3,8,2,1,3,1)
sd <- c(1,1,1,3,1,2,1,4,5,3)
sd2 <- sd %*% t(sd)

# Correlation Matrix
#rho <- 0.65
corMat = matrix(runif(100), 10, 10)
corMat = (corMat * lower.tri(corMat)) + t(corMat * lower.tri(corMat))
diag(corMat) <- 1
corMat = nearPD(corMat, posd.tol=1.e-04)$mat

# Covariance Matrix
Sigma <- sd2*corMat

# Simulated dataset
X <- as.data.frame(mvrnorm(N, mu=mu, Sigma = Sigma, empirical=TRUE))
colnames(X)<-c("X1","X2","X3","X4","X5","X6","X7","X8","X9","X10")
#corrplot.mixed(cor(Xq), lower.col = "black", number.cex = .8)
# Intercept values
beta0<- 2

# Overall effect mix 1
beta1 <- 1

# Overall effect mix 2
beta2 <- (-2.8)

# Weights
W1 <- c(0.5,0.20,0.2,0.05,0.05)
W2 <- c(0.05,0.15,0.7,0.05,0.05)

# sigma
sigma <- 1#sd(beta1*(Xq %*% W))

# Quantile extraction
Xq <- as.matrix(quantile_split(X, mix_name = colnames(X), q=4))

# Outcome simulation (continuos)
y <- rnorm(N, beta0 + beta1*(Xq[,1:5] %*% W1) + beta2*(Xq[,6:10] %*% W2), sigma)
# Aggregate data in a data.frame
Data <-as.data.frame(cbind(y,X))

fit_2bwqs <- bwqs_2(y ~ NULL,
                    mix_name_1 = c("X1","X2","X3","X4","X5"),
                    mix_name_2 = c("X6","X7","X8","X9","X10"),
                    data = Data, q=4,
                    family = "gaussian",
                    iter = 10000, thin = 5)

fit_2bwqs$summary_fit


ElenaColicino/bwqs documentation built on Feb. 26, 2023, 12:13 a.m.