hhh4ZI: Fitting zero-inflated HHH Models with Random Effects and...

View source: R/hhh4ZI.R

hhh4ZIR Documentation

Fitting zero-inflated HHH Models with Random Effects and Neighbourhood Structure

Description

Fits a zero-inflated autoregressive negative binomial (hhh4) model to a univariate or multivariate time series of counts. The characteristic feature of hhh4 models is the additive decomposition of the conditional mean into epidemic and endemic components (Held et al, 2005). The inflated parameter is a logit-linear predictor and can have autoregressive terms.

Usage

hhh4ZI(
  stsObj,
  control = list(ar = list(f = ~-1, offset = 1, lag = 1), ne = list(f = ~-1, offset = 1,
    lag = 1, weights = neighbourhood(stsObj) == 1, scale = NULL, normalize = FALSE), end
    = list(f = ~1, offset = 1), zi = list(f = ~1, lag = 1, lag.unitSpecific = FALSE),
    family = c("NegBin1", "NegBinM"), subset = 2:nrow(stsObj), optimizer = list(stop =
    list(tol = 1e-05, niter = 100), regression = list(method = "nlminb"), variance =
    list(method = "Nelder-Mead")), verbose = FALSE, start = list(fixed = NULL, random =
    NULL, sd.corr = NULL), 
     data = list(t = stsObj@epoch - min(stsObj@epoch)),
    keep.terms = FALSE),
  check.analyticals = FALSE
)

Arguments

stsObj

object of class "sts" containing the (multivariate) count data time series.

control

a list containing the model specification and control arguments, the parts relating to hhh4 model is the same as in surveillance::hhh4:

  • ar Model for the autoregressive component given as list with the following components:

    • f = ~ -1 a formula specifying \log(\lambda_{it}).

    • offset = 1 optional multiplicative offset, either 1 or a matrix of the same dimension as observed(stsObj).

    • lag = 1 a positive integer meaning autoregression on y_{i,t-lag}.

  • ne Model for the neighbour-driven component given as list with the following components:

    • f = ~ -1 a formula specifying \log(\phi_{it}).

    • offset = 1 optional multiplicative offset, either 1 or a matrix of the same dimension as observed(stsObj).

    • lag = 1 a non-negative integer meaning dependency on y_{j,t-lag}.

    • weights = neighbourhood(stsObj) == 1 neighbourhood weights w_{ji}. The default corresponds to the original formulation by Held et al (2005), i.e., the spatio-temporal component incorporates an unweighted sum over the lagged cases of the first-order neighbours. See Paul et al (2008) and Meyer and Held (2014) for alternative specifications, e.g., W_powerlaw. Time-varying weights are possible by specifying an array of dim() c(nUnits, nUnits, nTime), where nUnits=ncol(stsObj) and nTime=nrow(stsObj).

    • scale = NULL optional matrix of the same dimensions as weights (or a vector of length ncol(stsObj)) to scale the weights to scale * weights.

    • normalize = FALSE logical indicating if the (scaled) weights should be normalized such that each row sums to 1.

  • end Model for the endemic component given as list with the following components

    • f = ~ 1 a formula specifying \log(\nu_{it}).

    • offset = 1 optional multiplicative offset e_{it}, either 1 or a matrix of the same dimension as observed(stsObj).

  • zi Model for the zero inflation component given as list with the following components:

    • f = ~ -1 a formula specifying logit(\gamma_{it}).

    • lag = 1 a vector of positive integers meaning autoregression on the respective y_{i,t-lag} terms. Use NULL or integer(0) to exclude AR terms.

    • lag.unitSpecific logical indicating if the autoregressive parameter in the zero inflation part is unit specific.

  • family Distributional family – the Negative Binomial distribution. The overdispersion parameter can be assumed to be the same for all units ("NegBin1"), to vary freely over all units ("NegBinM"), or to be shared by some units (specified by a factor of length ncol(stsObj) such that its number of levels determines the number of overdispersion parameters). Note that "NegBinM" is equivalent to factor(colnames(stsObj), levels = colnames(stsObj)).

  • subset Typically 2:nrow(obs) if model contains autoregression.

  • optimizer a list of three lists of control arguments.

    The "stop" list specifies two criteria for the outer optimization of regression and variance parameters: the relative tolerance for parameter change using the criterion max(abs(x[i+1]-x[i])) / max(abs(x[i])), and the maximum number niter of outer iterations.

    Control arguments for the single optimizers are specified in the lists named "regression" and "variance". method="nlminb" is the default optimizer for regression update, however, the methods from optim may also be specified (as well as "nlm" but that one is not recommended here). For the variance updates, only Nelder-Mead optimization (method="Nelder-Mead") is provided. All other elements of these two lists are passed as control arguments to the chosen method, e.g., if method="nlminb" adding iter.max=50 increases the maximum number of inner iterations from 20 (default) to 50.

  • verbose non-negative integer (usually in the range 0:3) specifying the amount of tracing information to be output during optimization.

  • start a list of initial parameter values replacing initial values set via fe and ri. Since surveillance 1.8-2, named vectors are matched against the coefficient names in the model (where unmatched start values are silently ignored), and need not be complete, e.g., start = list(fixed = c("-log(overdisp)" = 0.5)) (default: 2) for a family = "NegBin1" model. In contrast, an unnamed start vector must specify the full set of parameters as used by the model.

  • data a named list of covariates that are to be included as fixed effects (see fe) in any of the 3 component formulae. By default, the time variable t is available and used for seasonal effects created by addSeason2formula. In general, covariates in this list can be either vectors of length nrow(stsObj) interpreted as time-varying but common across all units, or matrices of the same dimension as the disease counts observed(stsObj).

  • keep.terms logical indicating if the terms object used in the fit is to be kept as part of the returned object. This is usually not necessary, since the terms object is reconstructed by the terms-method for class "hhh4ZI" if necessary (based on stsObj and control, which are both part of the returned "hhh4ZI" object).

check.analyticals

logical (or a subset of c("numDeriv", "maxLik")), indicating if (how) the implemented analytical score vector and Fisher information matrix should be checked against numerical derivatives at the parameter starting values, using the packages numDeriv and/or maxLik. If activated, hhh4 will return a list containing the analytical and numerical derivatives for comparison (no ML estimation will be performed). This is mainly intended for internal use by the package developers.

Value

hhh4ZI returns an object of class "hhh4ZI", which inherits from class "hhh4", and is a list containing the following components:

  • coefficients named vector with estimated (regression) parameters of the model

  • se estimated standard errors (for regression parameters)

  • cov covariance matrix (for regression parameters)

  • Sigma estimated variance-covariance matrix of random effects

  • Sigma.orig estimated variance parameters on internal scale used for optimization

  • call the matched call

  • dim vector with number of fixed and random effects in the model

  • loglikelihood (penalized) loglikelihood evaluated at the MLE

  • margll (approximate) log marginal likelihood should the model contain random effects

  • convergence logical. Did optimizer converge?

  • mu The fitted mean values in hhh4 model part

  • fitted.values fitted mean values in zero-inflated model

  • gamma fitted zero inflation parameter

  • control control object of the fit

  • terms the terms object used in the fit if keep.terms = TRUE and NULL otherwise

  • stsObj the supplied stsObj

  • lags named integer vector of length three containing the lags used for the epidemic components "ar", "ne" and "zi" respectively. The corresponding lag is NA if the component was not included in the model.

  • nObs number of observations used for fitting the model

  • nTime number of time points used for fitting the model

  • nUnit number of units (e.g. areas) used for fitting the model

  • runtime the proc.time-queried time taken to fit the model, i.e., a named numeric vector of length 5 of class "proc_time"

Examples

neW1 <- neighbourhood(measles) == 1
fit <- hhh4ZI(measles,
  control = list(
    ar = list(f = ~1),
    ne = list(f = ~1, weights = neW1, normalize = TRUE),
   end = list(f = ~1),
    zi = list(f = ~1),
    family = "NegBin1",
    verbose = TRUE,
    keep.terms = TRUE
  )
)
summary(fit)
sim_data <- simulate(fit, simplify = FALSE)

Junyi-L/hhh4ZI documentation built on Oct. 14, 2024, 11:45 p.m.