KF: Kalman Filter for State Space Models
In KFKSDS: Kalman Filter, Smoother and Disturbance Smoother

Description Usage Arguments Details Value State space representation References See Also Examples

View source: R/KF.R

These functions run the iterative equations of the Kalman filter for a state space model.

KF(y, ss, convergence = c(0.001, length(y)), t0 = 1)
KF.C(y, ss, convergence = c(0.001, length(y)), t0 = 1)
KF.deriv(y, ss, xreg = NULL, convergence = c(0.001, length(y)), t0 = 1)
KF.deriv.C(y, ss, xreg = NULL, convergence = c(0.001, length(y)), 
  t0 = 1, return.all = FALSE)

`y`	a numeric time series or vector.
`ss`	a list containing the matrices of the state space model.
`xreg`	optional matrix or list of external regressors. See details.
`convergence`	a numeric vector of length two to control and determine the convergence of the filter. See details below.
`t0`	a numeric indicating the index of the first observation at which the contributions to the likelihood are addep up.
`return.all`	logical. If `TRUE`, extended output containing elements to be used by `KS.deriv` is returned.

The implementation is a direct transcription of the iterative equations of the filter that are summarized below. Details can be found in the references given below and in many other textbooks. The source code follows the notation used in Durbin and Koopman (2001).

The elements in the argument ss must be named in accordance with the notation given below for the state space representation. For those models defined in the package stsm, a convenient way to create the argument ss is by means of the function char2numeric.

The contributions to the likelihood function of the first observations may be omitted by choosing a value of t0 greater than one.

The functions with ‘.deriv’ in the name compute the derivatives of some of the elements involved in the filter with respect to the parameters of the model.

The functions KF and KF.deriv are fully implemented in R while KF.deriv.C calls to compiled C code.

Using KF.deriv.C with return.all = FALSE returns minimal output with the elements necessary to compute the derivative of the log-likelihood function. Using return.all = TRUE further elements to be used in KS.deriv are returned.

Missing observations are allowed. If a missing value is observed after the filter has converged then all operations of the filter are run instead of using steady state values until convergence is detected again.

Parameters to control the convergence of the filter. In some models, the Kalman filter may converge to a steady state. Finding the explicit expression of the steady state values can be cumbersome in some models. Alternatively, at each iteration of the filter it can be checked whether a steady state has been reached. For it, some control parameters can be defined in the argument convergence. It is considered that convergence was reached when the following is observed: the change in the variance of the prediction error over the last convergence[2] consecutive iterations of the filter is below the tolerance value convergence[1]. When the steady state is reached, the values from the last iteration are used in the remaining iteration of the filter. Thus, the number of matrix operations can be reduced substantially as pointed in Harvey (1989) Sec. 3.3.4.

External regressors. A matrix of external regressors can be passed in argument xreg. If xreg is a matrix then it is assumed that the time series passed in argument y has been already adjusted for the effect of these regressors, that is, y_t^{adj} = y_t - X γ. If y is the observed series, then xreg should be a list containing the following elements: xreg, the matrix of regressors; and coefs, the vector of coefficients, γ, related to the regressors. The coefficients must be placed in coefs in the same order as the corresponding vectors are arranged by columns in xreg.

The number of rows of the matrix of regressors must be equal to the length of the series y.

Column names are necessary for KF.deriv and are optional for KF.deriv.C.

A list containing the following elements:

`v`	prediction error.
`f`	variance of the prediction error.
`K`	Kalman gain.
`L`	auxiliar matrices to be used by the smoother.
`a.pred`	one step ahead prediction of the state vector.
`P.pred`	covariance matrix of `a.pred`.
`a.upd`	update of `a.pred` after the observation at time t that is predicted in `a.pred` enters in the recursive filter.
`P.upd`	update of `P.pred`.
`convit`	the iteration at which the filter converged. If convergence was not observed it is set to `NULL`.
`mll`	value of the negative of the log-likelihood function.

The function KF.C is a simplified and faster version that records and returns only the value of negative of the log-likelihood function. It is suited to be passed as argument to optim in the stats package.

The functions that evaluate the derivatives include in the output the derivatem terms: da.pred, dP.pred, da.upd, dP.upd, dv, df, dvof (derivative of quotient v over f), dK and dL.

KF.deriv.C does not return a.upd and P.upd and their derivative terms da.upd and dP.upd. If return.all = TRUE, this function returns: dvof, dL, da.pred, dP.pred, which are the derivative terms necessary to evaluate the gradient of the log-likelihood function. By default they are not returned, return.all = FALSE. They are in any case computed, the operations that are omitted in the latter case is the arrangement of the output from the call to compiled C code into matrices of the proper dimension containing the data in the right order.

Derivatives of the likelihood function are implemented in package stsm. Although the Kalman filter provides information to evaluate the likelihood function, it is not its primary objective. That's why the derivatives of the likelihood are currently part of the package stsm, which is specific to likelihood methods in structural time series models.

The general univariate linear Gaussian state space model is defined as follows:

y[t] = Za[t] + e[t], e[t] \sim N(0, H)

a[t+1] = Ta[t] + Rw[t], w[t] \sim N(0, V)

for t=1,…,n and a[1] \sim N(a0, P0). Z is a matrix of dimension 1xm; H is 1x1; T is mxm; R is mxr; V is rxr; a0 is mx1 and P0 is mxm, where m is the dimension of the state vector a and r is the number of variance parameters in the state vector.

The Kalman filtering recursions for the model above are:

Prediction

a[t] = T a[t-1]

P[t] = T P[t-1] T' + R V R'

v[t] = y[t] - Z a[t]

F[t] = Z P[t] Z' + H

Updating

K[t] = P[t] Z' F[t]^{-1}

a[t] = a[t] + K[t] v[t]

P[t] = P[t] - K[t] Z P[t]'

for t=2,…,n, starting with a[1] and P[1] equal to a0 and P0. v[t] is the prediction error at observation in time t and F[t] is the variance of v[t].

Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford University Press.

Harvey, A. C. (1989). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press.

char2numeric in package stsm.

# local level plus seasonal model with arbitrary parameter values
# for the 'JohnsonJohnson' time series
m <- stsm::stsm.model(model = "llm+seas", y = JohnsonJohnson, 
  pars = c("var1" = 2, "var2" = 15, "var3" = 30))
ss <- stsm::char2numeric(m)

# run the Kalman filter
kf <- KF(m@y, ss)
plot(kf$a.upd[,1:2], main = "filtered state vector")

# 'KF.C' is a faster version that returns only the 
# value of the negative of the likelihood function
kfc <- KF.C(m@y, ss)
all.equal(kf$mll, kfc)

# compute also derivative terms used below
kfd <- KF.deriv(m@y, ss)
all.equal(kfc, kfd$mll)
kfdc <- KF.deriv.C(m@y, ss, return.all = TRUE)
all.equal(kf$mll, kfdc$mll)

# as expected the versions that use compiled C code
# are faster that the versions written fully in R
# e.g. not including derivatives
## Not run: 
system.time(for(i in seq_len(10)) kf <- KF(m@y, ss))
system.time(for(i in seq_len(10)) kfc <- KF.C(m@y, ss))
# e.g. including derivatives
system.time(for(i in seq_len(10)) kfd <- KF.deriv(m@y, ss))
system.time(for(i in seq_len(10)) kfdc <- KF.deriv.C(m@y, ss, return.all = TRUE))

## End(Not run)

# compare analytical and numerical derivatives
# they give same results up to a tolerance error
fcn <- function(x, model, type = c("v", "f"))
{
  m <- stsm::set.pars(model, x)
  ss <- stsm::char2numeric(m)
  kf <- KF(m@y, ss)
  switch(type, "v" = sum(kf$v), "f" = sum(kf$f))
}

dv <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "v")
all.equal(dv, colSums(kfd$dv), check.attributes = FALSE)
all.equal(dv, colSums(kfdc$dv), check.attributes = FALSE)
df <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "f")
all.equal(df, colSums(kfd$df), check.attributes = FALSE)
all.equal(df, colSums(kfdc$df), check.attributes = FALSE)

# compare timings in version written in R with numDeriv::grad
# no calls to compiled C code in either case
# looking at these timings, using analytical derivatives is
# expected to be useful in optimization algorithms
## Not run: 
system.time(for (i in seq_len(10)) 
  numdv <- numDeriv::grad(func = fcn, x = m@pars, model = m, type = "v"))
system.time(for(i in seq_len(10)) kfdv <- colSums(KF.deriv(m@y, ss)$dv))

## End(Not run)

# compare timings when convergence is not checked with the case 
# when steady state values are used after convergence is observed
# computation time is reduced substantially
## Not run: 
n <- length(m@y)
system.time(for(i in seq_len(20)) a <- KF.deriv(m@y, ss, convergence = c(0.001, n)))
system.time(for(i in seq_len(20)) b <- KF.deriv(m@y, ss, convergence = c(0.001, 10)))
# the results are the same up to a tolerance error
all.equal(colSums(a$dv), colSums(b$dv))

## End(Not run)

[1] TRUE
[1] TRUE
[1] TRUE
   user  system elapsed 
  0.090   0.004   0.093 
   user  system elapsed 
  0.011   0.000   0.012 
   user  system elapsed 
  0.304   0.005   0.308 
   user  system elapsed 
  0.127   0.000   0.127 
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
   user  system elapsed 
  0.738   0.000   0.738 
   user  system elapsed 
  0.181   0.000   0.180 
   user  system elapsed 
  0.327   0.000   0.328 
   user  system elapsed 
  0.216   0.000   0.216 
[1] TRUE