bwqs_2: Fitting Bayesian Weighted Quantile Sum regression models with...
In ElenaColicino/bwqs: Bayesian Weighted Quantile Sum Regression

View source: R/main_package_2bwqs.R

bwqs_2

R Documentation

Fitting Bayesian Weighted Quantile Sum regression models with two mixture

Description

Fits Bayesian Weighted Quantile Sum (BWQS) regressions for continuous and binomial outcomes with two different mixture. This model allows two cosider the effect of two different mixture. This approach is useful when different mixture can lead opposite effect on the outcome variable.

Usage

bwqs_2(
  formula,
  mix_name_1,
  mix_name_2,
  data,
  q,
  DalpC1 = NULL,
  DalpC2 = NULL,
  chains = 1,
  iter = 10000,
  thin = 3,
  seed = 2019,
  start_value = NULL,
  c_int = c(0.025, 0.975),
  family = "gaussian"
)

Arguments

`formula`	Object of class `formula` specifying the relationship between the outcome and the covariates of the model not involved in the mixture variable. If the model has no covariates specify `y ~ NULL`.
`mix_name_1`	A character vector listing the variables contributing to first mixture effect.
`mix_name_2`	A character vector listing the variables contributing to second mixture effect.
`data`	The `data.frame` containing the variables (covariates and elements of both of the mixtures) to be included in the model.
`q`	An `integer` to specify how mixture variables will be ranked, e.g. in quartiles (`q = 4`), deciles (`q = 10`), or percentiles (`q = 100`). If `q = NULL` then the values of the mixture variables are taken (these must be standardized).
`DalpC1`	A `vector` containing the parameters of the Dirichlet distribution for the weights of the first mixture. For further details see https://en.wikipedia.org/wiki/Dirichlet_distribution.
`DalpC2`	A `vector` containing the parameters of the Dirichlet distribution for the weights of the second mixture.
`chains`	An `integer` to specify the number of chain in Hamiltonian Monte Carlo algorithm (1 by default).
`iter`	An `integer` to specify the lenght of chain in Hamiltonian Monte Carlo algorithm (10000 by default).
`thin`	An `integer` to specify the thinning parameter in Hamiltonian Monte Carlo algorithm.
`seed`	An `integer` value to fix the seed. If `seed = NULL` the seed are randomly choosen.
`start_value`	A `vector` containing the initial value of the prior distribution, if it is equal to NULL random values are chosen.
`c_int`	A `vector` of two elements to specify the credible intervals for parameters, for 95% credible interval `c_int = c(0.025,0.975)` (default).
`family`	A `string` to specify the type of outcome. Possible values are "gaussian" (default) and "binomial".

Details

The function bwqs_2 uses the package rstan which allows the connection with STAN, a specific software, written in C++ for bayesian inference, for further information see https://mc-stan.org/.

Value

bwqs returns a list with two argument:

`fit`	An `S4` object with all details of the Hamiltonian Monte Carlo, all the extractions from the posterior distribution and all values of the parameters
`summary_fit`	Table with the statistics of the parameters: mean, standard error of the mean, standard deviation, lower and upper values for the credible interval (with credible level specified by `c_int`), n_eff and Rhat. For further details see https://cran.r-project.org/web/packages/rstan/rstan.pdf

Author(s)

Nicolo Foppa Pedretti, Elena Colicino

Examples

#load libraries
library(MASS)
library(BWQS)
library(Matrix)

# fix the seed
set.seed(456)

# Sample sizes
N <- 800

# Mean & SD of variables
mu <- c(0,0,0,1,3,8,2,1,3,1)
sd <- c(1,1,1,3,1,2,1,4,5,3)
sd2 <- sd %*% t(sd)

# Correlation Matrix
#rho <- 0.65
corMat = matrix(runif(100), 10, 10)
corMat = (corMat * lower.tri(corMat)) + t(corMat * lower.tri(corMat))
diag(corMat) <- 1
corMat = nearPD(corMat, posd.tol=1.e-04)$mat

# Covariance Matrix
Sigma <- sd2*corMat

# Simulated dataset
X <- as.data.frame(mvrnorm(N, mu=mu, Sigma = Sigma, empirical=TRUE))
colnames(X)<-c("X1","X2","X3","X4","X5","X6","X7","X8","X9","X10")
#corrplot.mixed(cor(Xq), lower.col = "black", number.cex = .8)
# Intercept values
beta0<- 2

# Overall effect mix 1
beta1 <- 1

# Overall effect mix 2
beta2 <- (-2.8)

# Weights
W1 <- c(0.5,0.20,0.2,0.05,0.05)
W2 <- c(0.05,0.15,0.7,0.05,0.05)

# sigma
sigma <- 1#sd(beta1*(Xq %*% W))

# Quantile extraction
Xq <- as.matrix(quantile_split(X, mix_name = colnames(X), q=4))

# Outcome simulation (continuos)
y <- rnorm(N, beta0 + beta1*(Xq[,1:5] %*% W1) + beta2*(Xq[,6:10] %*% W2), sigma)
# Aggregate data in a data.frame
Data <-as.data.frame(cbind(y,X))

fit_2bwqs <- bwqs_2(y ~ NULL,
                    mix_name_1 = c("X1","X2","X3","X4","X5"),
                    mix_name_2 = c("X6","X7","X8","X9","X10"),
                    data = Data, q=4,
                    family = "gaussian",
                    iter = 10000, thin = 5)

fit_2bwqs$summary_fit

ElenaColicino/bwqs documentation built on Feb. 26, 2023, 12:13 a.m.