lmrm | R Documentation |
Implements an MCMC algorithm for posterior sampling based on a semiparametric model for continuous longitudinal multivariate responses. The overall model consists of 5 regression submodels and it utilizes spike-slab priors for variable selection and function regularization. See ‘Details’ section for a full description of the model.
lmrm(formula, data = list(), centre=TRUE, id, time,
sweeps, burn = 0, thin = 1, seed, StorageDir,
c.betaPrior = "IG(0.5,0.5*n*p)", pi.muPrior = "Beta(1,1)",
c.alphaPrior = "IG(1.1,1.1)", pi.phiPrior = "Beta(1,1)", c.psiPrior = "HN(2)",
sigmaPrior = "HN(2)", pi.sigmaPrior = "Beta(1,1)",
corr.Model = c("common", nClust = 1), DP.concPrior = "Gamma(5,2)",
c.etaPrior = "IG(0.5,0.5*samp)", pi.nuPrior = "Beta(1,1)",
pi.fiPrior = "Beta(1,1)", c.omegaPrior = "IG(1.1,1.1)", sigmaCorPrior = "HN(2)",
tuneCalpha, tuneSigma2, tuneCbeta, tuneAlpha, tuneSigma2R, tuneR, tuneCpsi,
tuneCbCor, tuneOmega, tuneComega, tau, FT = 1,...)
formula |
a formula defining the responses and the covariates in the 5 regression models e.g. |
data |
a data frame. |
centre |
Binary indicator. If set equal to TRUE, the design matrices are centred, to have column mean equl to zero, otherwise, if set to FALSE, the columns are not centred. |
id |
identifiers of the individuals or other sampling units that are observed over time. |
time |
a vector input that specifies the time of observation |
sweeps |
total number of posterior samples, including those discarded in burn-in period
(see argument |
burn |
length of burn-in period. |
thin |
thinning parameter. |
seed |
optional seed for the random generator. |
StorageDir |
a required directory to store files with the posterior samples of models parameters. |
c.betaPrior |
The inverse Gamma prior of |
pi.muPrior |
The Beta prior of |
c.alphaPrior |
The inverse Gamma prior of |
pi.phiPrior |
The Beta prior of |
c.psiPrior |
The prior of |
sigmaPrior |
The prior of |
pi.sigmaPrior |
The Beta prior of |
corr.Model |
Specifies the model for the correlation matrices |
DP.concPrior |
The Gamma prior for the Dirichlet process concentration parameter. |
c.etaPrior |
The inverse Gamma prior of |
pi.nuPrior |
The Beta prior of |
pi.fiPrior |
The Beta prior of |
c.omegaPrior |
The prior of |
sigmaCorPrior |
The prior of |
tuneCalpha |
Starting value of the tuning parameter for sampling |
tuneSigma2 |
Starting value of the tuning parameter for sampling |
tuneCbeta |
Starting value of the tuning parameter for sampling |
tuneAlpha |
Starting value of the tuning parameter for sampling regression coefficients of the variance models
|
tuneSigma2R |
Starting value of the tuning parameter for sampling |
tuneR |
Starting value of the tuning parameter for sampling correlation matrices.
Defaults at |
tuneCpsi |
Starting value of the tuning parameter for sampling variances |
.
tuneCbCor |
Starting value of the tuning parameter for sampling |
tuneOmega |
Starting value of the tuning parameter for sampling regression coefficients of the variance models
|
tuneComega |
Starting value of the tuning parameter for sampling |
tau |
The |
FT |
Binary indicator. If set equal to 1, the Fisher's z transform of the correlations is modelled, otherwise if set equal to 0, the untransformed correlations are modelled. |
... |
Other options that will be ignored. |
Function lmrm
returns samples from the posterior distributions of the parameters of
a regression model with normally distributed multivariate longitudinal responses. To describe the model,
let Y_{ij} = (Y_{ij1},\dots,Y_{ijp})^{\top}
denote the vector of p
responses
observed on individual i, i=1,\dots,n,
at time point t_{ij}, j=1,\dots,n_i
.
The observational time points t_{ij}
are allowed to be unequally spaced.
Further, let u_{ij}
denote the covariate vector that is observed along with Y_{ij}
and that may include time,
other time-dependent covariates and time-independent ones.
In addition, let Y_{i}=(Y_{i1}^{\top},\dots,Y_{in_i}^{\top})^{\top}
denote the i
th response vector.
With \mu_i=E(Y_{i})
and \Sigma_i =
cov(Y_i)
, the model assumes multivariate normality,
Y_{i} \sim N(\mu_i, \Sigma_i), i=1,2,\dots,n
.
The means \mu_i
and covariance matrices \Sigma_i
are modelled semiparametrically in terms of covariates.
For the means one can specify semiparametric models,
\mu_{ijk} = \beta_{k0} + \sum_{l=1}^{K_1} u_{ijl} \beta_{kl} + \sum_{l=K_1+1}^{K} f_{\mu,k,l}(u_{ijl}).
This is the first of the 5 regression submodels.
To model the covariance matrix, first consider the modified Cholesky decomposition,
L_i \Sigma_i L_i^{\top} = D_i,
where L_i
is a unit block lower triangular matrix
and D_i
is a block diagonal matrix,
\begin{array}{cc}
L_i
= \left[ \begin{array}{cccc}
I & 0 & \dots & 0 \\
-\Phi_{i21} & I & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
-\Phi_{in_i1} & -\Phi_{in_i1} & \dots & I \\
\end{array}
\right],
&
D_i
= \left[ \begin{array}{cccc}
D_{1} & 0 & \dots & 0 \\
0 & D_{2} & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & D_{n_i} \\
\end{array}
\right],
\end{array}
For modelling D_{ij}, i=1,\dots,n, j=1,\dots,n_i
in terms of covariates,
first we separate the variances and the correlations D_{ij} = S_{ij}^{1/2} R_{ij} S_{ij}^{1/2}
.
It is easy to model matrix S_{ij}
in terms of covariates as the only requirement on its diagonal
elements is that they are nonnegative,
\log \sigma^2_{ijk} = \alpha_{k0} + \sum_{l=1}^{L_1} w_{ijl} \alpha_{kl} + \sum_{l=L_1+1}^{L} f_{\sigma,k,l}(w_{ijl})
This is the second of the 5 regression submodels.
For \phi_{ijklm}
, the (l,m)
element of \Phi_{ijk}, l,m=1,\dots,p
, one can specify semiparametric models
\phi_{ijklm} = \psi_{lm0} + \sum_{b=1}^{B_1} v_{ijkb} \psi_{lmb} + \sum_{b=B_1+1}^{B} f_{\phi,l,m,b}(v_{ijkb})
This is the third of the 5 regression submodels.
The elements of the correlations matrices R_{ij}
are modelled in terms of covariate time only, hence they are denoted by R_t
.
Subject to the positive definiteness constraint, the elements of R_t
are modelled using a normal distribution
with location and scale parameters, \mu_{ct}
and \sigma^2_{ct}
, modelled as
\mu_{ct} = \eta_0 + f_{\mu}(t),
\log \sigma^2_{ct} = \omega_0 + f_{\sigma}(t),
and these are the last 2 of the 5 submodels.
Function lmrm
returns samples from the posteriors of the model parameters.
Georgios Papageorgiou gpapageo@gmail.com
Papageorgiou, G. (2020). Bayesian semiparametric modelling of covariance matrices for multivariate longitudinal data. arXiv:2012.09833.
# Fit a joint mean-covariance model on the simulated dataset simD2
require(ggplot2)
data(simD2)
model <- Y1 | Y2 ~ time | sm(time) | sm(lag) | sm(time) | 1
# the above defines the responses and the regression models on the left and
# right of "~", respectively
# the first model, for the mean, is a linear function of time, this is sufficient as
# the 2 responses have constant mean.
# the second model, for the variances, is a smooth function of time
# the third model, for the dependence structure, is a smooth function of lag,
# that lmrm figures out and it does not need to be computed by the user
# the fourth model, for location of the correlations, is a smooth function of time
# the fifth model, for scale of the correlations, is just an intercept model
## Not run:
m1 <- lmrm(formula = model, corr.Model = c("common", nClust = 1), data = simD2,
id = id, time = time, sweeps = 2500, burn = 500, thin = 2,
StorageDir = getwd(), seed = 1)
plot(m1)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.