fitted_marssMLE: Return fitted values for X(t) and Y(t) in a MARSS model

fitted.marssMLER Documentation

Return fitted values for X(t) and Y(t) in a MARSS model

Description

fitted() returns the different types of fitted values for \mathbf{x}_t and \mathbf{y}_t in a MARSS model. The fitted values are the expected value of the right side of the MARSS equations without the error terms, thus are the model predictions of \mathbf{y}_t or \mathbf{x}_t. fitted.marssMLE is a companion function to tsSmooth() which returns the expected value of the right side of the MARSS equations with the error terms (the Kalman filter and smoother output).

\mathbf{x}_{t} = \mathbf{B} \mathbf{x}_{t-1} + \mathbf{u} + \mathbf{C} \mathbf{c}_t + \mathbf{G} \mathbf{w}_t, \textrm{ where } \mathbf{W}_t \sim \textrm{MVN}(0,\mathbf{Q})

\mathbf{y}_t = \mathbf{Z} \mathbf{x}_t + \mathbf{a} + \mathbf{D} \mathbf{d}_t + \mathbf{H} \mathbf{v}_t, \textrm{ where } \mathbf{V}_t \sim \textrm{MVN}(0,\mathbf{R})

The data go from t=1 to t=T. For brevity, the parameter matrices are shown without a time subscript, however all parameters can be time-varying.

Note that the prediction of \mathbf{x}_t conditioned on the data up to time t is not provided since that would require the expected value of \mathbf{X}_{t} conditioned on data from t = 1 to t+1, which is not output from the Kalman filter or smoother.

Usage

## S3 method for class 'marssMLE'
fitted(object, ..., 
    type = c("ytt1", "ytT", "xtT", "ytt", "xtt1"),   
    interval = c("none", "confidence", "prediction"), 
    level = 0.95, 
    output = c("data.frame", "matrix"), 
    fun.kf = c("MARSSkfas", "MARSSkfss"))
  

Arguments

object

A marssMLE object.

type

If type="tT", then the predictions are conditioned on all the data. If type="tt", then the predictions are conditioned on data up to time t. If type="tt1", the predictions are conditioned on data up to time t-1. The latter are also known as one-step-ahead estimates. For \mathbf{y}, these are also known as the innovations.

interval

If interval="confidence", then the standard error and confidence interval of the predicted value is returned. If interval="prediction", then the standard deviation and prediction interval of new data or states are returned.

level

Level for the intervals if interval is not equal to "none".

output

data frame or list of matrices

fun.kf

By default, tsSmooth() will use the Kalman filter/smoother function in object$fun.kf (either MARSSkfas() or MARSSkfss()). You can pass in fun.kf to force a particular Kalman filter/smoother function to be used.

...

Not used.

Details

In the state-space literature, the two most commonly used fitted values are "ytt1" and "ytT". The former is the expected value of \mathbf{Y}_t conditioned on the data 1 to time t-1. These are known as the innovations and they, plus their variance, are used in the calculation of the likelihood of a MARSS model via the innovations form of the likelihood. The latter, "ytT" are the model estimates of the \mathbf{y} values using all the data; this is the expected value of \mathbf{Z}\mathbf{X}_t+\mathbf{a}+\mathbf{D}\mathbf{d}_t conditioned on the data 1 to T. The "xtT" along with "ytT" are used for computing smoothation residuals used in outlier and shock detection. See MARSSresiduals. For completeness, fitted.marssMLE will also return the other possible model predictions with different conditioning on the data (1 to t-1, t, and T), however only type="ytt1" (innovations) and "ytT" and "xtT" (smoothations) are regularly used. Keep in mind that the fitted "xtT" is not the smoothed estimate of \mathbf{x} (unlike "ytT"). If you want the smoothed estimate of \mathbf{x} (i.e. the expected value of \mathbf{X}_t conditioned on all the data), you want the Kalman smoother. See tsSmooth.

Fitted versus estimated values: The fitted states at time t are predictions from the estimated state at time t-1 conditioned on the data: expected value of \mathbf{B}\mathbf{X}_{t-1}+\mathbf{u}+\mathbf{C}\mathbf{c}_t conditioned on the data. They are distinguished from the estimated states at time t which would includes the conditional expected values of the error terms: \textrm{E}[\mathbf{X}_{t}] = \mathbf{B}\mathbf{X}_{t-1}+\mathbf{u}+\mathbf{C}\mathbf{c}_t + \mathbf{w}_t. The latter are returned by the Kalman filter and smoother. Analogously, the fitted observations at time t are model predictions from the estimated state at time t conditioned on the data: the expected value of the right side of the \mathbf{y} equation without the error term. Like with the states, one can also compute the expected value of the observations at time t conditioned on the data: the expected value of the right side of the \mathbf{y} equation with the error term. The latter would just be equal to the data if there are no missing data, but when there are missing data, this is what is used to estimate their values. The expected value of states and observations are provided via tsSmooth.

observation fitted values

The model predicted \hat{\mathbf{y}}_t is \mathbf{Z}\mathbf{x}_t^\tau+\mathbf{a} + \mathbf{D}\mathbf{d}_t, where \mathbf{x}_t^\tau is the expected value of the state at time t conditioned on the data from 1 to \tau (\tau will be t-1, t or T). Note, if you are using MARSS for estimating the values for missing data, then you want to use tsSmooth() with type="ytT" not fitted(..., type="ytT").

\mathbf{x}_t^\tau is the expected value of the states at time t conditioned on the data from time 1 to \tau. If type="ytT", the expected value is conditioned on all the data, i.e. xtT returned by MARSSkf(). If type="ytt1", then the expected value uses only the data up to time t-1, i.e. xtt1 returned by MARSSkf(). These are commonly known as the one step ahead predictions for a state-space model. If type="ytt", then the expected value uses the data up to time t, i.e. xtt returned by MARSSkf().

If interval="confidence", the standard error and interval is for the predicted \mathbf{y}. The standard error is \mathbf{Z} \mathbf{V}_t^\tau \mathbf{Z}^\top. If interval="prediction", the standard deviation of new iid \mathbf{y} data sets is returned. The standard deviation of new \mathbf{y} is \mathbf{Z} \mathbf{V}_t^\tau \mathbf{Z}^\top + \mathbf{R}_t. \mathbf{V}_t^\tau is conditioned on the data from t=1 to n. \tau will be either t, t-1 or T depending on the value of type.

Intervals returned by predict() are not for the data used in the model but rather new data sets. To evaluate the data used to fit the model for residuals analysis or analysis or model inadequacy, you want the model residuals (and residual se's). Use residuals for model residuals and their intervals. The intervals for model residuals are narrower because the predictions for \mathbf{y} were estimated from the model data (so is closer to the data used to estimate the predictions than new independent data will be).

state fitted values

The model predicted \mathbf{x}_t given \mathbf{x}_{t-1} is \mathbf{B}\mathbf{x}_{t-1}^\tau+\mathbf{u}+\mathbf{C}\mathbf{c}_t. If you want estimates of the states, rather than the model predictions based on \mathbf{x}_{t-1}, go to tsSmooth(). Which function you want depends on your objective and application.

\mathbf{x}_{t-1}^\tau used in the prediction is the expected value of the states at time t-1 conditioned on the data from t=1 to t=\tau. If type="xtT", this is the expected value at time t-1 conditioned on all the data, i.e. xtT[,t-1] returned by MARSSkf(). If type="xtt1", it is the expected value conditioned on the data up to time t-1, i.e. xtt[,t-1] returned by MARSSkf(). The predicted state values conditioned on data up to t are not provided. This would require the expected value of states at time t conditioned on data up to time t+1, and this is not output by the Kalman filter. Only conditioning on data up to t-1 and T are output.

If interval="confidence", the standard error of the predicted values (meaning the standard error of the expected value of \mathbf{X}_t given \mathbf{X}_{t-1}) is returned. The standard error of the predicted value is \mathbf{B} \mathbf{V}_{t-1}^\tau\mathbf{B}^\top. If interval="prediction", the standard deviation of \mathbf{X}_t given \mathbf{X}_{t-1} is output. The latter is \mathbf{B} \mathbf{V}_{t-1}^\tau \mathbf{B}^\top + \mathbf{Q} . \mathbf{V}_{t-1}^\tau is either conditioned on data 1 to \tau=T or \tau=t-1 depending on type.

The intervals returned by fitted.marssMLE() are for the state predictions based on the state estimate at t-1. These are not typically what one uses or needs–however might be useful for simulation work. If you want confidence intervals for the state estimates, those are provided in tsSmooth. If you want to do residuals analysis (for outliers or model inadequacy), you want the residuals intervals provided in residuals().

Value

If output="data.frame" (the default), a data frame with the following columns is returned. If output="matrix", the columns are returned as matrices in a list. Information computed from the model has a leading "." in the column name.

If interval="none", the following are returned (colnames with . in front are computed values):

.rownames

Names of the data or states.

t

Time step.

y

The data if type is "ytT", "ytt" or "ytt1".

.x

The expected value of \mathbf{X}_t conditioned on all the data if type="xtT" or data up to time t if type="xtt1". From tsSmooth(). This is the expected value of the right-side of the \mathbf{x}_t equation with the errors terms while .fitted is the expected value of the right side without the error term \mathbf{w}_t.

.fitted

Predicted values of observations (\mathbf{y}) or the states (\mathbf{x}). See details.

If interval = "confidence", the following are also returned:

.se

Standard errors of the predictions.

.conf.low

Lower confidence level at alpha = 1-level. The interval is approximated using qnorm(alpha/2)*.se + .fitted

.conf.up

Upper confidence level. The interval is approximated using qnorm(1-alpha/2)*.se + .fitted

The confidence interval is for the predicted value, i.e. \mathbf{Z}\mathbf{x}_t^\tau+\mathbf{a} for \mathbf{y} or \mathbf{B}\mathbf{x}_{t-1}^\tau+\mathbf{u} for \mathbf{x} where \mathbf{x}_t^\tau is the expected value of \mathbf{X}_t conditioned on the data from 1 to \tau. (\tau will be t-1, t or T).

If interval = "prediction", the following are also returned:

.sd

Standard deviation of new \mathbf{x}_t or \mathbf{y}_t iid values.

.lwr

Lower range at alpha = 1-level. The interval is approximated using qnorm(alpha/2)*.sd + .fitted

.upr

Upper range at level. The interval is approximated using qnorm(1-alpha/2)*.sd + .fitted

The prediction interval is for new \mathbf{x}_t or \mathbf{y}_t. If you want to evaluate the observed data or the states estimates for outliers then these are not the intervals that you want. For that you need the residuals intervals provided by residuals().

Author(s)

Eli Holmes, NOAA, Seattle, USA.

See Also

MARSSkf(), MARSSresiduals(), residuals(), predict(), tsSmooth()

Examples

dat <- t(harborSeal)
dat <- dat[c(2, 11, 12), ]
fit <- MARSS(dat, model = list(Z = factor(c("WA", "OR", "OR"))))
fitted(fit)

# Example of fitting a stochastic level model to the Nile River flow data
# red line is smooothed level estimate
# grey line is one-step-ahead prediction using xtt1
nile <- as.vector(datasets::Nile)
mod.list <- list(
  Z = matrix(1), A = matrix(0), R = matrix("r"),
  B = matrix(1), U = matrix(0), Q = matrix("q"),
  x0 = matrix("pi")
)
fit <- MARSS(nile, model = mod.list, silent = TRUE)

# Plotting
# There are plot methods for marssMLE objects
library(ggplot2)
autoplot(fit)

# Below shows how to make plots manually but all of these can be made
# with autoplot(fit) or plot(fit)
plot(nile, type = "p", pch = 16, col = "blue")
lines(fitted(fit, type="ytT")$.fitted, col = "red", lwd = 2)
lines(fitted(fit, type="ytt1")$.fitted, col = "grey", lwd = 2)

# Make a plot of the model estimate of y(t), i.e., put a line through the points
# Intervals are for new data not the blue dots 
# (which were used to fit the model so are not new)
library(ggplot2)
d <- fitted(fit, type = "ytT", interval="confidence", level=0.95)
d2 <- fitted(fit, type = "ytT", interval="prediction", level=0.95)
d$.lwr <- d2$.lwr
d$.upr <- d2$.upr
ggplot(data = d) +
  geom_line(aes(t, .fitted), linewidth=1) +
  geom_point(aes(t, y), color = "blue", na.rm=TRUE) +
  geom_ribbon(aes(x = t, ymin = .conf.low, ymax = .conf.up), alpha = 0.3) +
  geom_line(aes(t, .lwr), linetype = 2) +
  geom_line(aes(t, .upr), linetype = 2) +
  facet_grid(~.rownames) +
  xlab("Time Step") + ylab("Count") +
  ggtitle("Blue=data, Black=estimate, grey=CI, dash=prediction interval") +
  geom_text(x=15, y=7, label="The intervals are for \n new data not the blue dots")


MARSS documentation built on May 29, 2024, 3:34 a.m.