Description
twinSIR is used to fit additive-multiplicative intensity models for epidemics as described in Höhle (2009). Estimation is driven by (penalized) maximum likelihood in the point process framework. Optimization (maximization) of the (penalized) likelihood function is performed by means of optim.
Usage

twinSIR(formula, data, weights, subset,
        knots = NULL, nIntervals = 1, lambda.smooth = 0, penalty = 1,
        optim.args = list(), model = TRUE, keep.data = FALSE)
Arguments

formula: an object of class "formula" (or one that can be coerced to that class): a symbolic description of the intensity model to be fitted. Its form is described in the Details section.

data: an object inheriting from class "epidata", i.e. the event history data as created by the converter function as.epidata.

weights: an optional vector of weights to be used in the fitting process. Should be NULL (the default) or a numeric vector.

subset: an optional vector specifying a subset of observations to be used in the fitting process.

knots: numeric vector or NULL (the default). Specifies the knots of the piecewise constant log-baseline hazard, i.e. the time points at which the log-baseline may jump. If NULL, the knot locations are derived from nIntervals.

nIntervals: the number of intervals of constant log-baseline hazard. Defaults to 1, which means an overall constant log-baseline hazard will be fitted.

lambda.smooth: numeric, the smoothing parameter λ. By default it is 0, which leads to unpenalized likelihood inference. In case lambda.smooth = -1, the smoothing parameter is chosen by the data-driven procedure described in the Details section.

penalty: either a single number denoting the order of the difference used to penalize the log-baseline coefficients (defaults to 1), or a more specific penalty matrix K for the parameter sub-vector β. In case of non-equidistant knots (usually the case when using quantile-based knot locations), a first-order difference penalty matrix as in Fahrmeir and Lang (2001) is available. For non-equidistant knots, orders higher than one are not implemented.

optim.args: a list with arguments passed to the optim function, in particular par, the vector of initial parameter values (see the Note and Examples sections).

model: logical indicating if the model frame and further model information should be returned with the fit (see the model component in the Value section).

keep.data: logical indicating if the supplied "epidata" object should be included in the returned object (see the data component in the Value section).
Details

A model is specified through the formula, which has the form

~ epidemicTerm1 + epidemicTerm2 + cox(endemicVar1) * cox(endemicVar2),

i.e. the right-hand side has the usual form as in lm, with some variables marked as being endemic by the special function cox. The left-hand side of the formula is empty and will be set internally to cbind(start, stop, event), which is similar to Surv(start, stop, event, type="counting").
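For instance, a model with two epidemic terms and one endemic covariate could be specified as in the following sketch (the variable names B1, B2 and z2 are borrowed from the Examples section below):

## epidemic terms B1 and B2 plus endemic covariate z2, marked via cox()
fit <- twinSIR(~ B1 + B2 + cox(z2), data = fooepidata)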
Basically, the additive-multiplicative model for the infection intensity λ_i(t) of individual i is

λ_i(t) = Y_i(t) * (e_i(t) + h_i(t)),

where

Y_i(t) is the at-risk indicator, indicating if individual i is "at risk" of becoming infected at time point t. This variable is part of the event history data.
e_i(t) is the epidemic component of the infection intensity, defined as

e_i(t) = ∑_{j ∈ I(t)} f(||s_i - s_j||),

where I(t) is the set of infectious individuals just before time point t, s_i is the coordinate vector of individual i, and the function f is defined as

f(u) = ∑_{m=1}^p α_m B_m(u)

with unknown transmission parameters α and known distance functions B_m. This set of distance functions results in the set of epidemic variables normally calculated by the converter function as.epidata, considering the equality

e_i(t) = ∑_{m=1}^p α_m x_{im}(t)

with x_{im}(t) = ∑_{j ∈ I(t)} B_m(||s_i - s_j||) being the m'th epidemic variable for individual i.
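The distance functions B_m are supplied to as.epidata via its f argument. A minimal sketch, assuming an event history data frame evHist with coordinate columns x and y (all object and column names here are hypothetical):

library("surveillance")
## hypothetical distance functions B1 and B2: indicators of infectives
## closer resp. farther than an arbitrary cut-off distance of 1
distfuns <- list(B1 = function(u) u <= 1,
                 B2 = function(u) u > 1)
## as.epidata() then computes the epidemic variables
## x_im(t) = sum_{j in I(t)} B_m(||s_i - s_j||) as columns B1 and B2
epi <- as.epidata(evHist, id.col = "id", start.col = "start",
                  stop.col = "stop", atRiskY.col = "atRiskY",
                  event.col = "event", Revent.col = "Revent",
                  coords.cols = c("x", "y"), f = distfuns)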
h_i(t) is the endemic (cox) component of the infection intensity, defined as

h_i(t) = exp(h_0(t) + z_i(t)' β),

where h_0(t) is the log-baseline hazard function, z_i(t) is the vector of endemic covariates of individual i, and β is the vector of unknown coefficients.

To fit the model, the log-baseline hazard function is approximated by a piecewise constant function with known knots but unknown levels, which will be estimated. The approximation is specified by the arguments knots or nIntervals.
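For illustration, the piecewise constant log-baseline can be requested in two ways; a sketch, where the epidata object epi and the knot locations are hypothetical:

## let twinSIR place the knots for 4 intervals of constant log-baseline
fit <- twinSIR(~ B1 + B2 + cox(z2), data = epi, nIntervals = 4)
## or specify the knots (i.e. the jump time points) explicitly
fit <- twinSIR(~ B1 + B2 + cox(z2), data = epi, knots = c(10, 20, 30))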
If a large number of knots (or nIntervals) is chosen, the corresponding log-baseline parameters can be rendered identifiable by the use of penalized likelihood inference. At present, it is the job of the user to choose an adequate value of the smoothing parameter lambda.smooth. Alternatively, a data-driven selection of the smoothing parameter lambda.smooth, based on a mixed model representation of an equivalent truncated power spline, is offered (see the reference for further details). The following two steps are then iterated until convergence:

1. Given a fixed smoothing parameter, the penalized likelihood is optimized for the regression components using an L-BFGS-B approach.

2. Given fixed regression parameters, a Laplace approximation of the marginal likelihood for the smoothing parameter is numerically optimized.

Depending on the data, convergence might take a couple of iterations.
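To make the penalty concrete, the following sketch constructs a first-order difference penalty matrix K for p log-baseline coefficients, which could be passed via the penalty argument instead of just the order (the number of intervals and the smoothing parameter are arbitrary here):

p <- 9                 # number of log-baseline intervals (hypothetical)
D1 <- diff(diag(p))    # (p-1) x p matrix of first-order differences
K <- t(D1) %*% D1      # penalty matrix acting on the log-baseline levels
fit <- twinSIR(~ B1 + B2 + cox(z2), data = epi, nIntervals = p,
               lambda.smooth = 0.1, penalty = K)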
Note also that it is unwise to include endemic covariates with huge values, as they affect the intensities on the exponential scale after having been multiplied by the parameter vector β. With large covariate values, the optim method "L-BFGS-B" will likely terminate due to an infinite log-likelihood or score function in some iteration.
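A simple remedy is to center and scale such a covariate before fitting; a minimal sketch (the covariate name z3 is hypothetical):

## rescale a large-valued endemic covariate to mean 0 and unit variance
epi$z3 <- as.vector(scale(epi$z3))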
Value

twinSIR returns an object of class "twinSIR". An object of this class is a list containing the following components:

coefficients: a named vector of coefficients.

loglik: the maximum of the (penalized) log-likelihood function.

counts: the number of log-likelihood and score function evaluations.

converged: logical indicating convergence of the optimization algorithm.

fisherinfo.observed: if requested, the negative Hessian from optim.

fisherinfo: an estimation of the expected Fisher information matrix.

method: the optimization algorithm used.

intervals: a numeric vector (c(minTime, knots, maxTime)) representing the consecutive intervals of constant log-baseline hazard.

nEvents: a numeric vector containing the number of infections in each of the above intervals.

model: if requested, the model information used in the fitting process (see the model argument).

data: if requested, the supplied "epidata" data.

call: the matched call.

formula: the specified formula.

terms: the terms object used.
Note

There are some restrictions when modelling the infection intensity without a baseline hazard rate, i.e. without an intercept in the formula. The reason: at some point, the optimization algorithm L-BFGS-B tries to set all transmission parameters α to the boundary value 0 and to calculate the (penalized) score function with this set of parameters (all 0). The problem then is that the values of the infection intensities λ_i(t) are 0 for all i and t, especially at observed event times, which is impossible. Without a baseline, it is not allowed to have all α's set to 0, because then we would not observe any infections. Unfortunately, L-BFGS-B cannot take this restriction into account. Thus, if one wants to fit a model without a baseline hazard, the control parameter lower must be specified in optim.args so that some α is strictly positive, e.g. optim.args = list(lower = c(0, 0.001, 0.001, 0)), and the initial parameter vector par must not be the zero vector.
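For concreteness, the optim.args list for such a fit might be assembled as follows (a sketch; the length and layout of par and lower must match the parameter vector of the model at hand, here assumed to consist of four transmission parameters α):

## non-zero starting values and box constraints keeping some alpha > 0
myOptimArgs <- list(par   = c(0.01, 0.01, 0.01, 0.01),
                    lower = c(0, 0.001, 0.001, 0))
## fit <- twinSIR(..., optim.args = myOptimArgs)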
Author(s)

Michael Höhle and Sebastian Meyer
References

Höhle, M. (2009), Additive-Multiplicative Regression Models for Spatio-Temporal Epidemics, Biometrical Journal, 51(6):961-978.
See Also

as.epidata for the necessary data input structure, plot.twinSIR for plotting the path of the infection intensity, profile.twinSIR for profile likelihood estimation, and simulate.twinSIR for the simulation of epidemics following the fitted model.

Furthermore, the standard extraction methods vcov, logLik, AIC and extractAIC are implemented for objects of class "twinSIR".
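As a small usage illustration of these extraction methods (a sketch, assuming a fitted "twinSIR" object fit1 as in the Examples below):

vcov(fit1)     # estimated covariance matrix of the parameter estimates
logLik(fit1)   # (penalized) log-likelihood at the maximum
AIC(fit1)      # Akaike's information criterion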
Examples

data("fooepidata")
summary(fooepidata)
# fit an overall constant baseline hazard rate
fit1 <- twinSIR(~ B1 + B2 + cox(z2), data = fooepidata)
fit1
summary(fit1)
# fit1 is what is used as data("foofit") in other examples
data("foofit")
stopifnot(all.equal(fit1, foofit))
# fit a piecewise constant baseline hazard rate with 3 intervals using
# _un_penalized ML and estimated coefs from fit1 as starting values
fit2 <- twinSIR(~ B1 + B2 + cox(z2), data = fooepidata, nIntervals = 3,
optim.args = list(par=c(coef(fit1)[1:2],rep(coef(fit1)[3],3),coef(fit1)[4])))
fit2
summary(fit2)
# fit a piecewise constant baseline hazard rate with 9 intervals
# using _penalized_ ML and estimated coefs from fit1 as starting values
fit3 <- twinSIR(~ B1 + B2 + cox(z2), data = fooepidata, nIntervals = 9,
lambda.smooth = 0.1, penalty = 1, optim.args = list(
par=c(coef(fit1)[1:2], rep(coef(fit1)[3],9), coef(fit1)[4])))
fit3
summary(fit3)
# plot of the 9 log-baseline levels
plot(x=fit3$intervals, y=coef(fit3)[c(3,3:11)], type="S")
### -> for more sophisticated intensity plots, see 'plot.twinSIR'
plot(fit3)