Count Time Series Following Generalised Linear Models
Description
The function tsglm fits a generalised linear model (GLM) for time series of counts.
The specification of the linear predictor allows for regressing on past observations, past values of the linear predictor and covariates as defined in the Details section.
There is the so-called INGARCH model with the identity link (see for example Ferland et al., 2006, Fokianos et al., 2009) and another model with the logarithmic link (see for example Fokianos and Tjostheim, 2011), which also differ in the specification of the linear predictor.
The conditional distribution can be chosen to be either Poisson or negative binomial.
Estimation is done by conditional maximum likelihood for the Poisson distribution or by a conditional quasi-likelihood approach based on the Poisson likelihood function for the negative binomial distribution.
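As a rough illustration of conditional maximum likelihood for the Poisson case, the following base R sketch fits a first-order model with identity link and no feedback term, ν[t] = β[0] + β[1] Z[t-1], by conditioning on the first observation. It is not the fitting routine of this package: the helper negloglik, the made-up counts in z and the use of optim with simple box constraints are all assumptions chosen for illustration.

```r
# Negative conditional Poisson log-likelihood for nu[t] = beta0 + beta1 * Z[t-1].
# negloglik() is a hypothetical helper, not part of the package.
negloglik <- function(par, z) {
  nu <- par[1] + par[2] * z[-length(z)]        # conditional means for t = 2, ..., n
  -sum(dpois(z[-1], lambda = nu, log = TRUE))  # condition on the first observation
}

# Made-up counts, for illustration only
z <- c(2, 3, 1, 4, 6, 5, 3, 2, 2, 4, 7, 5, 4, 3, 1, 2, 3, 5, 4, 6)

# Start from an iid Poisson model and keep the parameters inside
# beta0 > 0 and 0 <= beta1 < 1 (assumed, simplified restrictions)
start <- c(beta0 = mean(z), beta1 = 0)
fit <- optim(start, negloglik, z = z, method = "L-BFGS-B",
             lower = c(1e-6, 0), upper = c(Inf, 1 - 1e-6))
fit$par
```

For a negative binomial model the same Poisson objective would be maximised for the regression parameters, with the dispersion parameter estimated in a separate step.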
There is a vignette available which introduces the functionality of tsglm and related functions of this package and its underlying statistical methods (vignette("tsglm", package="tscount")).
The function mean.fit is a lower-level function to fit the mean specification of such a model assuming a Poisson distribution. It is called by tsglm. It has additional arguments allowing for finer control of the fitting procedure, which can be handed over from the function tsglm by its ... argument. Note that it is usually not necessary for a user to call this lower-level function or to worry about its additional arguments. The defaults of these arguments have been chosen by the authors of this package and should perform well in most applications.
Usage
tsglm(ts, model = list(past_obs = NULL, past_mean = NULL,
external = NULL), xreg = NULL, link = c("identity", "log"),
distr = c("poisson", "nbinom"), ...)
mean.fit(ts, model, xreg, link, score = TRUE,
info = c("score", "none", "hessian", "sandwich"),
init.method=c("marginal", "iid", "firstobs", "zero"),
init.drop = FALSE, epsilon = 1e-06, slackvar = 1e-06,
start.control = list(), final.control = list(),
inter.control = NULL)

Arguments
ts 
a univariate time series. 
model 
a named list specifying the model for the linear predictor, which can be of the following elements:

xreg 
matrix with covariates in the columns, i.e. its number of rows must be 
link 
character giving the link function. Default is 
distr 
character giving the conditional distribution. Default is 
... 
additional arguments to be passed to the lower level fitting function 
score 
logical value indicating whether the score vector should be computed. 
info 
character that determines if and how to compute the information matrix. Can be set to 
init.method 
character that determines how the recursion of the conditional mean (and possibly of its derivatives) is initialised. If set to 
init.drop 
logical value that determines which observations are considered for computation of the loglikelihood, the score vector and, if applicable, the information matrix. If 
epsilon 
numeric positive but small value determining how close the parameters may come to the limits of the parameter space. 
slackvar 
numeric positive but small value determining how true inequalities among the parameter restrictions are treated; a true inequality 
start.control 
named list with optional elements that determine how to make the start estimation. Possible list elements are:

final.control 
named list with optional elements that determine how to make the final maximum likelihood estimation. If

inter.control 
named list determining how to maximise the loglikelihood function in a first step. This intermediate optimisation will start from the start estimation and be followed by the final optimisation, which will in turn start from the intermediate optimisation result. This intermediate optimisation is intended to use a very quick but imprecise optimisation algorithm. Possible elements are the same as for 
Details
The INGARCH model (argument link="identity") used here follows the definition
Z[t] | F[t-1] ~ Poi(ν[t]) or Z[t] | F[t-1] ~ NegBin(ν[t], φ),
where F[t-1] denotes the history of the process up to time t-1, and Poi and NegBin denote the Poisson and the negative binomial distribution, respectively, with the parametrisation as specified below. For the model with covariates having an internal effect (the default) the linear predictor of the INGARCH model (which is in that case identical to the conditional mean) is given by
ν[t] = β[0] + β[1] Z[t-i[1]] + … + β[p] Z[t-i[p]] + α[1] ν[t-j[1]] + … + α[q] ν[t-j[q]] + η[1] X[t,1] + … + η[r] X[t,r].
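The recursion can be traced by hand for the simplest case p = q = 1 with lags i[1] = j[1] = 1 and no covariates. The following base R sketch is only an illustration: ingarch_mean is a hypothetical helper, not a function of this package, and starting both Z[0] and ν[0] at a user-supplied value is an assumption made here.

```r
# Conditional mean recursion nu[t] = beta0 + beta1 * Z[t-1] + alpha1 * nu[t-1].
# ingarch_mean() is a hypothetical helper written for illustration.
ingarch_mean <- function(z, beta0, beta1, alpha1, init) {
  n <- length(z)
  nu <- numeric(n)
  z_prev <- init    # assumed starting value for Z[0]
  nu_prev <- init   # assumed starting value for nu[0]
  for (t in seq_len(n)) {
    nu[t] <- beta0 + beta1 * z_prev + alpha1 * nu_prev
    z_prev <- z[t]
    nu_prev <- nu[t]
  }
  nu
}

z <- c(3, 5, 2, 4)  # made-up counts
# Start at the marginal mean beta0 / (1 - beta1 - alpha1) = 2
nu <- ingarch_mean(z, beta0 = 1, beta1 = 0.3, alpha1 = 0.2, init = 2)
nu  # 2.000 2.300 2.960 2.192
```

Each past observation enters the conditional mean additively, so a large count at time t-1 raises ν[t] by β[1] times that count.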
The log-linear model (argument link="log") used here follows the definition
Z[t] | F[t-1] ~ Poi(λ[t]) or Z[t] | F[t-1] ~ NegBin(λ[t], φ),
with λ[t] = exp(ν[t]) and F[t-1] as above. For the model with covariates having an internal effect (the default) the linear predictor ν[t] = log(λ[t]) of the log-linear model is given by
ν[t] = β[0] + β[1] log(Z[t-i[1]]+1) + … + β[p] log(Z[t-i[p]]+1) + α[1] ν[t-j[1]] + … + α[q] ν[t-j[q]] + η[1] X[t,1] + … + η[r] X[t,r].
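The same first-order case (p = q = 1, lags equal to 1, no covariates) can be sketched for the logarithmic link. Again this is an illustration in base R only: loglin_predictor is a hypothetical helper, and the starting values Z[0] = 0 and ν[0] = 0 are assumptions made here.

```r
# Linear predictor nu[t] = beta0 + beta1 * log(Z[t-1] + 1) + alpha1 * nu[t-1];
# the conditional mean is lambda[t] = exp(nu[t]).
# loglin_predictor() is a hypothetical helper written for illustration.
loglin_predictor <- function(z, beta0, beta1, alpha1, z_init = 0, nu_init = 0) {
  n <- length(z)
  nu <- numeric(n)
  z_prev <- z_init    # assumed starting value for Z[0]
  nu_prev <- nu_init  # assumed starting value for nu[0]
  for (t in seq_len(n)) {
    nu[t] <- beta0 + beta1 * log(z_prev + 1) + alpha1 * nu_prev
    z_prev <- z[t]
    nu_prev <- nu[t]
  }
  nu
}

z <- c(0, 1, 3)  # made-up counts
nu <- loglin_predictor(z, beta0 = 0.5, beta1 = 0.4, alpha1 = 0.2)
lambda <- exp(nu)  # conditional means on the original scale
```

Adding 1 inside the logarithm keeps the predictor finite when a past observation is zero.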
Note that because of the logarithmic link function the effect of single summands in the linear predictor on the conditional mean is multiplicative and hence the parameters play a different role than in the INGARCH model, although they are denoted by the same letters.
The Poisson distribution is parametrised by the mean lambda according to the definition in Poisson.
The negative binomial distribution is parametrised by the mean mu with an additional dispersion parameter size according to the definition in NegBinomial. In the notation above its mean parameter mu is ν[t] and its dispersion parameter size is φ.
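This parametrisation can be checked numerically with the base R density dnbinom, for which the variance is mu + mu^2/size. The values of nu and phi below are arbitrary and chosen only for illustration.

```r
# Negative binomial with mean nu and dispersion parameter phi,
# using the (mu, size) parametrisation of dnbinom()
nu <- 4
phi <- 2
x <- 0:2000                           # support truncated far into the tail
p <- dnbinom(x, mu = nu, size = phi)

m <- sum(x * p)                       # numerical mean
v <- sum((x - m)^2 * p)               # numerical variance
c(mean = m, variance = v, expected_variance = nu + nu^2 / phi)
```

A smaller value of φ therefore means stronger overdispersion relative to the Poisson distribution, whose variance equals its mean.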
This function allows covariates to be included in two different ways. A covariate can have a so-called internal effect as defined above, where its effect propagates via the regression on past values of the linear predictor and on past observations. Alternatively, it can have a so-called external effect, where its effect does not directly propagate via the feedback on past values of the linear predictor, but only via past observations. For external effects of the covariates, the linear predictor for the model with identity link is given by
ν[t] = μ[t] + η[1] X[t,1] + … + η[r] X[t,r],
μ[t] = β[0] + β[1] Z[t-i[1]] + … + β[p] Z[t-i[p]] + α[1] μ[t-j[1]] + … + α[q] μ[t-j[q]],
and analogously for the model with logarithmic link by
ν[t] = μ[t] + η[1] X[t,1] + … + η[r] X[t,r],
μ[t] = β[0] + β[1] log(Z[t-i[1]]+1) + … + β[p] log(Z[t-i[p]]+1) + α[1] μ[t-j[1]] + … + α[q] μ[t-j[q]].
This is described in more detail by Liboschik et al. (2014) for the case of deterministic covariates for modelling interventions. It is also possible to model a combination of external and internal covariates, which can be defined straightforwardly by adding each covariate either to the linear predictor ν[t] itself (for an internal effect) or to μ[t] defined above (for an external effect).
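The difference between the two kinds of effect is easiest to see in a deterministic sketch with identity link, one covariate, a first-order feedback term and no regression on past observations, so that the covariate can only propagate through the feedback. The helper covariate_effect is hypothetical and written for illustration; it is not part of this package, and the starting value of the feedback term is an assumption.

```r
# Identity link, feedback on lag 1 only, a single covariate x.
# internal: nu[t] = beta0 + alpha1 * nu[t-1] + eta * x[t]
# external: mu[t] = beta0 + alpha1 * mu[t-1],  nu[t] = mu[t] + eta * x[t]
# covariate_effect() is a hypothetical helper written for illustration.
covariate_effect <- function(x, beta0, alpha1, eta, external) {
  n <- length(x)
  nu <- numeric(n)
  fb_prev <- beta0 / (1 - alpha1)  # assumed starting value of the feedback term
  for (t in seq_len(n)) {
    base <- beta0 + alpha1 * fb_prev
    nu[t] <- base + eta * x[t]
    # External effect: only mu[t] (without the covariate) is fed back
    fb_prev <- if (external) base else nu[t]
  }
  nu
}

x <- c(0, 1, 0, 0)  # a single pulse at time 2
internal <- covariate_effect(x, beta0 = 1, alpha1 = 0.5, eta = 2, external = FALSE)
external <- covariate_effect(x, beta0 = 1, alpha1 = 0.5, eta = 2, external = TRUE)
rbind(internal, external)
# internal: 2 4 3 2.5   (the pulse decays through the feedback)
# external: 2 4 2 2     (the pulse affects time 2 only)
```

In a model that also regresses on past observations, an external effect would still carry over to later time points, but only through the observed counts.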
Value
An object of class "tsglm", which is a list with at least the following elements:
coefficients 
a named vector of the maximum likelihood estimated coefficients, which can be extracted by the 
start 
a named vector of the start estimation for the coefficients. 
residuals 
a vector of residuals, which can be extracted by the 
fitted.values 
the fitted values, which can be extracted by the 
linear.predictors 
the linear fit on link scale. 
response 
a vector of the response values (this is usually the original time series but possibly without the first few observations used for initialization if argument 
logLik 
the loglikelihood of the fitted model, which can be extracted by the 
score 
the score vector at the maximum likelihood estimation. 
info.matrix 
the information matrix at the maximum likelihood estimation assuming a Poisson distribution. 
info.matrix_corrected 
the information matrix at the maximum likelihood estimation assuming the distribution specified in 
call 
the matched call. 
n_obs 
the number of observations. 
n_eff 
the effective number of observations used for maximum likelihood estimation (might be lower than 
ts 
the original time series. 
model 
the model specification. 
xreg 
the given covariates. 
distr 
a character giving the fitted conditional distribution. 
distrcoefs 
a named vector of the estimated additional coefficients specifying the conditional distribution. Is 
sigmasq 
the estimated overdispersion coefficient. Is zero in case of a Poisson distribution. 
The functions ingarch.fit and loglin.fit have the same output except the elements distr, distrcoefs and sigmasq. In addition, they return the following list elements:
inter 
some details on the intermediate estimation of the coefficients as returned by 
final 
some details on the final estimation of the coefficients as returned by 
durations 
named vector of the durations of the model fit (in seconds). 
outerscoreprod 
array of outer products of score vectors at each time point. 
Author(s)
Tobias Liboschik, Philipp Probst, Konstantinos Fokianos and Roland Fried
References
Christou, V. and Fokianos, K. (2014) Quasi-likelihood inference for negative binomial time series models. Journal of Time Series Analysis 35(1), 55–78, http://dx.doi.org/10.1002/jtsa.12050.
Christou, V. and Fokianos, K. (2015) Estimation and testing linearity for non-linear mixed Poisson autoregressions. Electronic Journal of Statistics 9, 1357–1377, http://dx.doi.org/10.1214/15-EJS1044.
Ferland, R., Latour, A. and Oraichi, D. (2006) Integer-valued GARCH process. Journal of Time Series Analysis 27(6), 923–942, http://dx.doi.org/10.1111/j.1467-9892.2006.00496.x.
Fokianos, K. and Fried, R. (2010) Interventions in INGARCH processes. Journal of Time Series Analysis 31(3), 210–225, http://dx.doi.org/10.1111/j.1467-9892.2010.00657.x.
Fokianos, K. and Fried, R. (2012) Interventions in log-linear Poisson autoregression. Statistical Modelling 12(4), 299–322, http://dx.doi.org/10.1177/1471082X1201200401.
Fokianos, K., Rahbek, A. and Tjostheim, D. (2009) Poisson autoregression. Journal of the American Statistical Association 104(488), 1430–1439, http://dx.doi.org/10.1198/jasa.2009.tm08270.
Fokianos, K. and Tjostheim, D. (2011) Log-linear Poisson autoregression. Journal of Multivariate Analysis 102(3), 563–578, http://dx.doi.org/10.1016/j.jmva.2010.11.002.
Liboschik, T., Kerschke, P., Fokianos, K. and Fried, R. (2014) Modelling interventions in INGARCH processes. International Journal of Computer Mathematics (published online), http://dx.doi.org/10.1080/00207160.2014.949250.
See Also
S3 methods print, summary, residuals, plot, fitted, coef, predict, logLik, vcov, AIC, BIC and QIC for the class "tsglm".
The S3 method se computes the standard errors of the parameter estimates.
Additionally, there are the S3 methods pit, marcal and scoring for predictive model assessment.
S3 methods interv_test, interv_detect and interv_multiple for tests and detection procedures for intervention effects.
tsglm.sim for simulation from a GLM-type model for time series of counts. ingarch.mean, ingarch.var and ingarch.acf for calculation of the analytical mean, variance and autocorrelation function of an INGARCH model (i.e. with identity link) without covariates.
Example time series of counts are campy, ecoli, ehec, influenza, measles in this package, polio in package gamlss.data.
Examples
library(tscount) #provides tsglm, interv_covariate and the campy data

###Campylobacter infections in Canada (see help("campy"))
interventions <- interv_covariate(n=length(campy), tau=c(84, 100),
  delta=c(1, 0)) #detected by Fokianos and Fried (2010, 2012)
#Linear link function with Negative Binomial distribution:
campyfit <- tsglm(campy, model=list(past_obs=1, past_mean=13),
  xreg=interventions, distr="nbinom")
campyfit
plot(campyfit)

###Road casualties in Great Britain (see help("Seatbelts"))
timeseries <- Seatbelts[, "VanKilled"]
regressors <- cbind(PetrolPrice=Seatbelts[, c("PetrolPrice")],
  linearTrend=seq(along=timeseries)/12)
#Logarithmic link function with Poisson distribution:
seatbeltsfit <- tsglm(ts=timeseries, link="log",
  model=list(past_obs=c(1, 12)), xreg=regressors, distr="poisson")
summary(seatbeltsfit)
