Implements the multiMarker model via an MCMC algorithm.
y |
A matrix of dimension (n x P) storing P biomarker measurements on a set of n observations. Missing values ( |
quantities |
A vector of length n storing the food quantities allocated to each of the n observations in the intervention study data. Missing values ( |
niter |
The number of MCMC iterations. The default value is |
burnIn |
A numerical value, the number of iterations of the chain to be discarded when computing the posterior estimates. The default value is |
posteriors |
A logical value indicating if the full parameter chains should also be returned in output. The default value is |
sigmaAlpha |
Intercepts' hyperparameter (σ_{α}^2); see Details. The default value is |
nuZ1, nuZ2 |
Two vectors of length D storing hyperparameters for the components' variance parameters. The default values are |
nuSigmaP1, nuSigmaP2 |
Scalar hyperparameters for the error's variance parameters. The default values are |
sigmaWprior |
A scalar corresponding to the components' weights hyperparameter. The default value is |
nuBeta1, nuBeta2 |
Scalar hyperparameters for the scaling coefficient's variance parameters. The default values are |
tauBeta |
A scalar factor for the scaling coefficient's variance parameters. The default value is |
The function facilitates inference of food intake from multiple biomarkers via MCMC, according to the multiMarker model (D'Angelo et al., 2020). The multiMarker model first learns the relationship between the multiple biomarkers and food quantity data from an intervention study and subsequently allows inference on the latent intake when only biomarker data are available.
Consider a biomarker matrix Y of dimension (n x P), storing P different biomarker measurements on n independent observations. The number of food quantities considered in the intervention study is denoted by D, with the corresponding set being X=(X_1, ..., X_d, ..., X_D) and X_d < X_{d+1}.
We assume that the biomarker measurements are related to an unobserved, continuous intake value, leading to the following factor analytic model:
y_{ip} = α_p + β_p z_i + ε_{ip}, for i = 1, ..., n and p = 1, ..., P,
where z_i denotes the latent intake of observation i, with z = (z_1, ..., z_i, ..., z_n). The α_p and β_p parameters characterize, respectively, the intercept and the scaling effect for biomarker p. A priori, these parameters follow 0-truncated Gaussian distributions with parameters (μ_{α}, σ_{α}^2) and (μ_{β}, σ_{β}^2), respectively. The error term ε_{ip} captures the variability associated with biomarker p. The errors are assumed to be normally distributed with mean 0 and variance σ_p^2, which serves as a proxy for the precision of the biomarker.
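As an illustration of the model equation above, the following sketch simulates biomarker data using base R only. The helper rtnorm0 is hypothetical (it is not part of the multiMarker package), and all numerical values are arbitrary choices made for the example.

```r
# Draw from a Gaussian truncated below at 0 via the inverse-CDF method.
# rtnorm0 is a hypothetical helper, not a multiMarker function.
rtnorm0 <- function(n, mean, sd) {
  p_low <- pnorm(0, mean, sd)          # probability mass below the truncation point
  u <- runif(n, min = p_low, max = 1)  # uniform draws restricted to the retained mass
  qnorm(u, mean, sd)
}

set.seed(1)
P <- 3; n <- 50
alpha <- rtnorm0(P, mean = 4, sd = 1)        # intercepts alpha_p
beta  <- rtnorm0(P, mean = 0.001, sd = 0.1)  # scaling coefficients beta_p
z     <- rtnorm0(n, mean = 100, sd = 8)      # latent intakes z_i
sigma <- rep(5, P)                           # error standard deviations sigma_p

# y_ip = alpha_p + beta_p * z_i + eps_ip, with eps_ip ~ N(0, sigma_p^2)
eps <- matrix(rnorm(n * P, 0, rep(sigma, each = n)), nrow = n, ncol = P)
Y <- matrix(alpha, n, P, byrow = TRUE) + outer(z, beta) + eps
```

Each column of Y holds one biomarker and each row one observation, matching the (n x P) layout expected for the y argument.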
A mixture of D 0-truncated Gaussian distributions is assumed as the prior distribution for the latent intakes. Components are centered around the food quantity values X_d, and component-specific variances θ_d^2 model food quantity-specific intake variability, with lower values suggesting higher consumption compliance. Mixture weights are observation-specific and denoted by π_i = (π_{i1}, ..., π_{iD}). Given the inherent ordering of the food quantities in the intervention study, an ordinal regression model with a Cauchit link function is employed to model the observation-specific weights.
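The sketch below shows one way such weights and the resulting mixture prior density could be computed. It assumes a standard cumulative-link parameterization, P(component <= d) = F(γ_d - η_i) with F the Cauchy distribution function; the thresholds, linear predictor values, and function names are hypothetical and need not match the package internals.

```r
# Observation-specific weights from a cumulative-link model with Cauchit link.
# eta: linear predictor, one value per observation (hypothetical values below).
# thresholds: D - 1 increasing cut-points (hypothetical values below).
ordinal_weights <- function(eta, thresholds) {
  # cumulative probabilities, padded with 0 and 1 on either side
  cum <- cbind(0, outer(eta, thresholds, function(e, g) pcauchy(g - e)), 1)
  # adjacent differences give the D component weights for each observation
  cum[, -1, drop = FALSE] - cum[, -ncol(cum), drop = FALSE]
}

# Density of a mixture of 0-truncated Gaussians at a single value z.
mixture_prior_density <- function(z, weights, x, theta) {
  comp <- sapply(seq_along(x), function(d) {
    if (z < 0) return(0)
    dnorm(z, x[d], theta[d]) / (1 - pnorm(0, x[d], theta[d]))
  })
  sum(weights * comp)
}

eta <- c(-2, 0, 2)
thresholds <- c(-1, 1)
w <- ordinal_weights(eta, thresholds)   # 3 observations x 3 components

x <- c(50, 100, 150)   # food quantities X_d
theta <- c(8, 8, 8)    # component standard deviations theta_d
dens <- mixture_prior_density(100, w[2, ], x, theta)
```

Each row of w sums to one, and larger values of the linear predictor shift weight towards the components with larger food quantities.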
A Bayesian hierarchical framework is employed for the modelling process, allowing quantification of the uncertainty in intake estimation and flexibility in adapting to different biomarker data distributions. The framework is implemented through a Metropolis-within-Gibbs Markov chain Monte Carlo (MCMC) algorithm. Hyperprior distributions are assumed on the prior parameters, with the corresponding hyperparameter values fixed based on the data at hand, following an empirical Bayes approach.
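To illustrate the general idea of a Metropolis-within-Gibbs scheme, the following toy sketch samples the mean and standard deviation of a Gaussian sample. It is not the multiMarker sampler: the model, the priors (flat on the mean and on the log standard deviation), and the proposal step size are assumptions chosen only to keep the example short.

```r
set.seed(123)
y <- rnorm(200, mean = 3, sd = 2)   # toy data
n <- length(y)
niter <- 2000; burnIn <- 500

mu_chain <- numeric(niter)
ls_chain <- numeric(niter)          # chain for log(sigma)
mu_cur <- mean(y); ls_cur <- log(sd(y))

# log full conditional of log(sigma), up to an additive constant
log_target <- function(ls, mu) -n * ls - sum((y - mu)^2) / (2 * exp(2 * ls))

for (it in seq_len(niter)) {
  # Gibbs step: mu | sigma, y is Gaussian, so it is sampled directly
  mu_cur <- rnorm(1, mean(y), exp(ls_cur) / sqrt(n))

  # Metropolis step: random-walk proposal on log(sigma)
  ls_prop <- ls_cur + rnorm(1, 0, 0.1)
  log_ratio <- log_target(ls_prop, mu_cur) - log_target(ls_cur, mu_cur)
  if (log(runif(1)) < log_ratio) ls_cur <- ls_prop

  mu_chain[it] <- mu_cur
  ls_chain[it] <- ls_cur
}

# discard the burn-in iterations before summarising
keep <- (burnIn + 1):niter
post_mu <- median(mu_chain[keep])
post_sigma <- median(exp(ls_chain[keep]))
```

Parameters with a tractable full conditional are updated with a direct draw, while the remaining ones are updated with an accept/reject step; the niter and burnIn variables play the same role here as the arguments of the same name.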
For more details on the estimation of the multiMarker model, see D'Angelo et al. (2020).
An object of class 'multiMarker' containing the following components:
estimates |
A list with 9 components, storing posterior estimates of medians, standard deviations and 95% credible interval lower and upper bounds for the model parameters:
|
constants |
A list with 11 components, storing constant model quantities:
|
chains |
If
|
D'Angelo, S., Brennan, L. and Gormley, I.C. (2020). Inferring food intake from multiple biomarkers using a latent variable model. arXiv.
library(multiMarker)
library(truncnorm)
oldpar <- par(no.readonly =TRUE)
#-- Simulate intervention study biomarker and food quantity data --#
P <- D <- 3; n <- 50
alpha <- rtruncnorm(P, 0, Inf, 4, 1)
beta <- rtruncnorm(P, 0, Inf, 0.001, 0.1)
x <- c(50, 100, 150)
labels_z <- sample(c(1,2,3), n, replace = TRUE)
quantities <- x[labels_z]
sigma_d <- 8
z <- rtruncnorm(n, 0, Inf, x[labels_z], sigma_d)
Y <- sapply(1:P, function(p) sapply(1:n, function(i)
  max(0, alpha[p] + beta[p] * z[i] + rnorm(1, 0, 5))))
#-- Visualize the data --#
par(mfrow= c(2,2))
boxplot(Y[,1] ~ quantities, xlab = "Food quantity", ylab = "Biomarker 1")
boxplot(Y[,2] ~ quantities, xlab = "Food quantity", ylab = "Biomarker 2")
boxplot(Y[,3] ~ quantities, xlab = "Food quantity", ylab = "Biomarker 3")
#-- Fit the multiMarker model --#
# niter and burnIn are kept small only to keep the example fast.
modM <- multiMarker(y = Y, quantities = quantities,
                    niter = 100, burnIn = 30,
                    posteriors = TRUE)
#-- Extract summary statistics for model parameters --#
modM$estimates$ALPHA_E[,3] #estimated median, standard deviation,
# 0.025 and 0.975 quantiles for the third intercept parameter (alpha_3)
modM$estimates$BETA_E[,2] #estimated median, standard deviation,
# 0.025 and 0.975 quantiles for the second scaling parameter (beta_2)
#-- Examine behaviour of MCMC chains --#
par(mfrow= c(2,1))
plot(modM$chains$ALPHA_c[,3], type = "l",
xlab = "Iteration (after burnin)", ylab = expression(alpha[3]) )
abline( h = mean(modM$chains$ALPHA_c[,3]), lwd = 2, col = "darkred")
plot(modM$chains$BETA_c[,2], type = "l",
xlab = "Iteration (after burnin)", ylab = expression(beta[2]) )
abline( h = mean(modM$chains$BETA_c[,2]), lwd = 2, col = "darkred")
# compute Effective Sample Size
# library(LaplacesDemon)
# ESS(modM$chains$ALPHA_c[,3]) # effective sample size for alpha_3 MCMC chain
# ESS(modM$chains$BETA_c[,2]) # effective sample size for beta_2 MCMC chain
par(oldpar)
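The summaries extracted above (median, standard deviation, and 0.025 and 0.975 quantiles) can also be computed by hand from a matrix of posterior draws. The sketch below uses a simulated stand-in matrix, with iterations in rows and parameters in columns, rather than a real multiMarker fit; summarise_chain is a hypothetical helper.

```r
set.seed(42)
# Stand-in for a chain matrix: 70 retained iterations, 3 parameters
chain <- matrix(rnorm(70 * 3, mean = 4, sd = 0.5), nrow = 70, ncol = 3)

# Posterior median, standard deviation and 95% credible interval bounds
summarise_chain <- function(x) {
  c(median = median(x), sd = sd(x), quantile(x, probs = c(0.025, 0.975)))
}

est <- apply(chain, 2, summarise_chain)   # 4 summaries x 3 parameters
est[, 3]                                  # summaries for the third parameter
```

With a fitted model and posteriors = TRUE, the same function could be applied to the columns of a chain such as modM$chains$ALPHA_c, assuming it is stored with iterations in rows as the indexing in the example suggests.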