View source: R/bayessurvreg1.R
bayessurvreg1 | R Documentation |
A function to sample from the posterior distribution for a survival regression model
\log(T_{i,l}) = \beta^T x_{i,l} + b_i^T z_{i,l} +
\varepsilon_{i,l},\quad i=1,\dots,N,\ l=1,\dots,n_i,
where distribution of \varepsilon_{i,l}
is specified
as a normal mixture with unknown number of components as in Richardson
and Green (1997) and random effect b_i
is normally distributed.
See Komárek (2006) or Komárek and Lesaffre (2007) for more detailed description of prior assumptions.
Sampled values are stored on a disk to be further worked out by e.g.
coda
or boa
.
bayessurvreg1(formula, random,
data = parent.frame(), subset,
na.action = na.fail,
x = FALSE, y = FALSE, onlyX = FALSE,
nsimul = list(niter = 10, nthin = 1, nburn = 0,
nnoadapt = 0, nwrite = 10),
prior = list(kmax = 5, k.prior = "poisson", poisson.k = 3,
dirichlet.w = 1,
mean.mu = NULL, var.mu = NULL,
shape.invsig2 = 1.5,
shape.hyper.invsig2 = 0.8, rate.hyper.invsig2 = NULL,
pi.split = NULL, pi.birth = NULL,
Eb0.depend.mix = FALSE),
prior.beta, prior.b, prop.revjump,
init = list(iter = 0, mixture = NULL, beta = NULL,
b = NULL, D = NULL,
y = NULL, r = NULL, otherp = NULL, u = NULL),
store = list(y = TRUE, r = TRUE, b = TRUE, u = TRUE,
MHb = FALSE, regresres = FALSE),
dir,
toler.chol = 1e-10, toler.qr = 1e-10, ...)
formula |
model formula for the ‘fixed’ part of the model, i.e. the
part that specifies The left-hand side of the If
| ||
random |
formula for the ‘random’ part of the model, i.e. the
part that specifies When some random effects are included the random intercept is added
by default. It can be removed using e.g. | ||
data |
optional data frame in which to interpret the variables occuring in the formulas. | ||
subset |
subset of the observations to be used in the fit. | ||
na.action |
function to be used to handle any | ||
x |
if | ||
y |
if | ||
onlyX |
if TRUE, no McMC is performed. The function returns only
a design matrix of your model (intercept excluded). It might be
useful to set up correctly a parameter for a block update of
| ||
nsimul |
a list giving the number of iterations of the McMC and other parameters of the simulation.
| ||
prior |
a list that identifies prior hyperparameters and prior
choices. See accompanying paper for more details.
Some prior parameters can be guessed by the function itself. If you
want to do so, set such parameters to
| ||
prior.beta |
a list defining the blocks of
| ||
prior.b |
a list defining the way in which the random effects are to be updated and the specification of priors for random effects related parameters. The list is assumed to have following components.
| ||
prop.revjump |
a list of values defining in which way the reversible jumps will be performed.
| ||
init |
a list of the initial values to start the McMC. Set to
| ||
store |
a list that defines which sampled values besides
regression parameters
In the case that either | ||
dir |
a string that specifies a directory where all sampled values are to be stored. | ||
toler.chol |
tolerance for the Cholesky decomposition. | ||
toler.qr |
tolerance for the QR decomposition. | ||
... |
who knows? |
A list of class bayessurvreg
containing an information
concerning the initial values and prior choices.
Additionally, the following files with sampled values
are stored in a directory specified by dir
parameter of this
function (some of them are created only on request, see store
parameter of this function).
one column labeled iteration
with
indeces of McMC iterations to which the stored sampled values correspond.
two columns labeled loglik
and
randomloglik
.
\mbox{\code{loglik}} = \sum_{i=1}^{N}\sum_{l=1}^{n_i}\Biggl[
\biggl\{
\log\Bigl(\frac{1}{\sqrt{2\pi\sigma_{r_{i,l}}^2}}\Bigr)
-\frac{(y_{i,l} - \beta^T x_{i,l} - b_i^T z_{i,l} - \mu_{r_{i,l}})^2}{2\sigma_{r_{i,l}}^2}
\biggr\}
\Biggr],
where
y_{i,l}
denotes (sampled) (i,l)th true
log-event time,
b_i
sampled value of the random effect vector for the
ith cluster,
\beta
sampled value of the regression parameter
\beta
and
k, w_j, \mu_j, \sigma_j^2, j = 1,\dots,k
sampled mixture at each iteration.
\mbox{\code{randomloglik}} =
\sum_{i=1}^{N}\log\Bigl(g(b_i)\Bigr),
where g
denotes a density of
(multivariate) normal distribution
N(\gamma, D),
where
\gamma
is a sampled value of the mean of random
effect vector and D
is a sampled value of the covariance
matrix of the random effects at each iteration.
three columns labeled k
, Intercept
and
Scale
. These are the number of mixture components, mean and
standard deviation of the sampled error distribution (mixture) at
each iteration.
each row contains mixture weights
w_1,\dots,w_k
at each iteration. From the header of this file, maximal number of mixture
components specified in the prior can be derived.
each row contains mixture means
\mu_1,\dots,\mu_k
at each iteration. From the header of this file, maximal number of mixture
components specified in the prior can be derived.
each row contains mixture variances
\sigma^2_1,\dots,\sigma^2_k
at each
iteration. From the header of this file, maximal number of mixture
components specified in the prior can be derived.
columns labeled according to name of the design
matrix. These are sampled values of regression parameters
\beta
and means of random effects \gamma
(except the mean of the random intercept which is zero).
columns labeled nameb[1].id[1], ...,
nameb[q].id[1], ..., nameb[1].id[N], ..., nameb[q].id[N]
,
where q
is a dimension of the random effect vector
b_i
and N
number of clusters. nameb
is replaced by appropriate column name from the design matrix and
id
is replaced by identificator of the clusters. This gives
sampled values of the random effects for each cluster.
columns labeled det, D.s.t, s = 1,..., q, t =
s,...,q
, where q
is dimension of the random effect
vector b_i
. Column det
gives a determinant of
the covariance matrix D
of the random effects at each
iteration, remaining columns give a lower triangle of this matrix
at each iteration.
columns labeled Y[m]
where m
goes from 1
to \sum_{i=1}^{N}n_i
. This gives sampled
log-event times for each observation in the dataset at each
iteration.
columns labeled r[m]
where m
goes from 1
to \sum_{i=1}^{N}n_i
. This gives sampled
mixture labels for each observation in the dataset at each iteration.
Currently only one column labeled eta
that
gives sampled values of the hyperparameter \eta
.
this gives the information concerning the
performance of reversible jump algorithm and a sampler of
regression parameters \beta
and means of random
effects \gamma
. It has columns
accept.spl.comb
relative frequency of accepted split-combine moves up to that iteration.
split
relative frequency of proposed split moves up to that iteration.
accept.birth.death
relative frequency of accepted birth-death moves up to that iteration.
birth
relative frequency of proposed birth moves up to that iteration.
beta.block.m
with m
going from 1 to number
of defined blocks of beta parameters. This gives a relative
frequency of accepted proposals for each block up to that
iteration. When Gibbs move is used, these should be columns of
ones.
this gives the information concerning the performance of a sampler for random effects (relative frequency of accepted values for each cluster and each block of random effects updated together). When Gibbs move is used only ones are seen in this file.
Sampled values of canonical proposal variables for reversible jump algorithm are stored here. This file is useful only when trying to restart the simulation from some specific point.
columns labeled res[m]
where m
goes from 1
to \sum_{i=1}^{N}n_i
This stores so called
regression residuals for each observation at each iteration. This
residual is defined as
res_{i,l} = y_{i,l} - \beta^T x_{i,l} - b_i z_{i,l},\qquad
i=1\dots,N,\quad l=1,\dots,n_i,
where y_{i,l}
is a (sampled)
log-event time at each iteration.
Arnošt Komárek arnost.komarek@mff.cuni.cz
Komárek, A. (2006). Accelerated Failure Time Models for Multivariate Interval-Censored Data with Flexible Distributional Assumptions. PhD. Thesis, Katholieke Universiteit Leuven, Faculteit Wetenschappen.
Komárek, A. and Lesaffre, E. (2007). Bayesian accelerated failure time model for correlated interval-censored data with a normal mixture as an error distribution. Statistica Sinica, 17, 549 - 569.
Brooks, S. P., Giudici, P., and Roberts, G. O. (2003). Efficient construction of reversible jump Markov chain Monte Carlo proposal distribution (with Discussion). Journal of the Royal Statistical Society B, 65, 3 - 55.
Green, P. J. (1995). Reversible jump MCMC computation and Bayesian model determination. Biometrika, 82, 711 - 732.
Richardson, S., and Green, P. J. (1997). On Bayesian analysis of mixtures with unknown number of components (with Discussion). Journal of the Royal Statistical Society B, 59, 731 - 792.
## See the description of R commands for
## the models described in
## Komarek (2006),
## Komarek and Lesaffre (2007).
##
## R commands available
## in the documentation
## directory of this package as
## - ex-cgd.R and
## https://www2.karlin.mff.cuni.cz/~komarek/software/bayesSurv/ex-cgd.pdf
##
## - ex-tandmobMixture.R and
## https://www2.karlin.mff.cuni.cz/~komarek/software/bayesSurv/ex-tandmobMixture.pdf
##
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.