knitr::opts_chunk$set( echo = TRUE, fig.height = 4, fig.width = 7, dpi = 128, cache.path = "cache/restrictedVAR-cache/", fig.path = "fig/restrictedVAR-fig/", error = FALSE) options(digits = 4, scipen = 10, width = 70)
palette( c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7"))
We will simulate and estimate a first order vector auto-regression model in
this example using the particle filter and smoother. For details see
this vignette which can also be found by
calling vignette("Particle_filtering", package = "dynamichazard")
. The models
we are going to simulate from and estimate are of the form
$$ \begin{split} y_{it} &\sim g(\cdot\vert\eta_{it}) & \ \eta_{it} &= x_i\alpha_{th(i)} + \beta_0 + z_i\beta_{h(i)} + o_{it} \ \vec{\alpha}t &= F\vec{\alpha}{t - 1} + R\vec{\epsilon}_t & \quad \vec{\epsilon}_t \sim N(\vec{0}, Q) \ & & \quad \vec{\alpha}_0 \sim N(\vec{a}_0, Q_0) \end{split}, \qquad \begin{array}{l} i \in {1, \dots, n} \ t = 1, \dots, d \end{array} $$
where the $y_{it}$ is individual $i$'s indicator at time $t$ for whether he dies between time $(t - 1, t]$ and $h:\,{1,\dots n} \rightarrow {1,\dots,4}$ returns the group index individual $i$ belongs to. The indicators, $y_{it}$, are binomial distributed with the complementary log-log link function conditional on knowing the log time at risk, $o_{it}$, covariates, $x_t$ and $z_t$, and latent states, $\vec{\alpha}_1,\dots,\vec{\alpha}_d$. The total survival time of individual $i$ is $T_i$ which is piecewise constant exponentially distributed conditional on knowing the latent states. Further, we set $z_i = x_i$ so the states have a non-zero mean. The true values are
$$ F = \begin{pmatrix} \theta_1 & . & . & \theta_3 \ . & \theta_2 & . & . \ . & . & \theta_2 & . \ \theta_3 & . & . & \theta_1 \end{pmatrix}, \quad Q = \begin{pmatrix} 0.49 & . & 0.175 & . \ . & 0.49 & . & . \ 0.175 & . & 0.25 & . \ . & . & . & 0.25 \end{pmatrix}, \quad R = I_4 $$
$$ \vec{\beta} = (-6.5, -1, -0.5, 0.5, 1)^\top, \quad \vec{\theta} =(0.66, 0.8, 0.2)^\top $$
git_key <- system("git rev-parse --short HEAD", intern = TRUE) git_bra <- system("git branch", intern = TRUE) regexp <- "^(\\*\ )(.+)$" git_bra <- git_bra[grepl(regexp, git_bra)] git_bra <- gsub(regexp, "\\2", git_bra)
where $I_4$ is the four-dimensional identity matrix, $.$ is used instead of $0$
to put emphasis on the non-zero elements, and
$\vec{a}_0$ and $Q_0$ are given by the invariant distribution. The unknown
parameters to be estimated is everything but $Q_0$ and $R$ (since we fix $Q_0$ doing the estimation and we set $\vec{a}_0 = (0, 0,0,0)^\top$). This
example is run on the git branch "r git_bra
" with ID "r git_key
". The code
can be found on
the Github site for the package.
All functions which assignments are not shown and are not in the
dynamichazard
package can be found on the Github site.
We are going to estimate the parameters in an unrestricted model where we estimate the $Q$ and $F$ and a restricted model where we estimate $\vec{\theta}$, parameters for standard deviations which we denote by $\vec{\psi}$, and parameters for the correlations which we denote by $\vec{\phi}$. In the restricted case, we let
$$ \begin{align} \text{vec}(R^+F) &= G\vec{\theta} \ Q &= VCV = \begin{pmatrix} \sigma_1 & . & . & . \ . & \sigma_2 & . & . \ . & . & \sigma_3 & . \ . & . & . & \sigma_4 \end{pmatrix} \begin{pmatrix} 1 & \rho_{21} & \rho_{31} & \rho_{41} \ \rho_{21} & 1 & \rho_{32} & \rho_{42} \ \rho_{31} & \rho_{32} & 1 & \rho_{43} \ \rho_{41} & \rho_{42} & \rho_{43} & 1 \end{pmatrix} \begin{pmatrix} \sigma_1 & . & . & . \ . & \sigma_2 & . & . \ . & . & \sigma_3 & . \ . & . & . & \sigma_4 \end{pmatrix} \ \sigma_i &= \exp(s_i), \qquad (s_1,s_2,s_3,s_4)^\top = J (\psi_1, \psi_2)^\top \ \rho_{ij} &= \frac{2}{1 + \exp(-q_{ij})} - 1, \qquad (q_{21}, q_{31}, q_{41},q_{32}, q_{42},q_{43})^\top =K\phi_1 \end{align} $$
where $\text{vec}(\cdot)$ is the vectorization function and $G$, $J$, and $L$ are
$$ \begin{split} G &= \begin{pmatrix} 1 & . & . & . & . & . & . & . & . & . & . & . & . & . & . & 1 \ . & . & . & . & . & 1 & . & . & . & . & 1 & . & . & . & . & . \ . & . & . & 1 & . & . & . & . & . & . & . & . & 1 & . & . & . \end{pmatrix}^\top \ J &= \begin{pmatrix} 1 & 1 & . & . \ . & . & 1 & 1 \end{pmatrix}^\top \ K &= \begin{pmatrix} \cdot & 1 & \cdot& \cdot& \cdot & \cdot \end{pmatrix}^\top \end{split} $$
library(dynamichazard)
We start by simulating the data. Feel free to skip this part as the specifications are given above. First we assign the parameters for the simulation
# assign G, theta, and F G <- matrix(0., 4^2, 3) idx <- 1 + (0:(4 - 1)) * 4 + 0:(4 - 1) G[idx[-(2:3)], 1] <- 1 G[idx[ 2:3 ], 2] <- 1 G[c(4, 13) , 3] <- 1 theta <- c(.66, .8, .2) (F. <- matrix(as.vector(G %*% theta), 4, 4)) # assign J, K, psi, phi, and Q J <- matrix(0., 4, 2) J[1:2, 1] <- J[3:4, 2] <- 1 psi <- log(sqrt(c(.49, .25))) K <- matrix(0., 6, 1) K[2, 1] <- 1 phi <- log(- (.5 + 1) / (.5 - 1)) V <- diag(exp(drop(J %*% psi))) C <- diag(1, ncol(V)) C[lower.tri(C)] <- 2/(1 + exp(-drop(K %*% phi))) - 1 C[upper.tri(C)] <- t(C)[upper.tri(C)] (Q <- V %*% C %*% V) # assign Q_0 and beta Q_0 <- get_Q_0(Q, F.) beta <- c(-6.5, -1, -0.5, 0.5, 1)
get_Q_0
is a function to get the covariance matrix for the invariant distribution.
Then we simulate and plot the latent states
set.seed(54432125) n_periods <- 200 alphas <- matrix(nrow = n_periods + 1, ncol = 4) alphas[1, ] <- rnorm(4) %*% chol(Q_0) for(i in 1:n_periods + 1) alphas[i, ] <- F. %*% alphas[i - 1, ] + drop(rnorm(4) %*% chol(Q)) alphas <- t(t(alphas) + beta[-1]) # plot of latent variables matplot(alphas, type = "l", lty = 1) for(i in 1:4) abline(h = beta[i + 1], lty = 2, col = i)
We simulate the observations as follows
n_obs <- 1000 df <- sapply(1:n_obs, function(i){ # find the group grp <- (i - 1L) %/% (n_obs / 4L) + 1L # left-censoring tstart <- max(0L, sample.int((n_periods - 1L) * 2L, 1) - n_periods + 1L) # covariates x <- runif(1, 0, 2) # outcome (stop time and event indicator) y <- FALSE for(tstop in (tstart + 1L):n_periods){ fail_time <- rexp(1) / exp(beta[1] + x * alphas[tstop + 1L, grp]) if(fail_time <= 1){ y <- TRUE tstop <- tstop - 1L + fail_time break } } c(tstart = tstart, tstop = tstop, x = x, y = y, grp = grp) }) df <- data.frame(t(df)) df$grp <- factor(df$grp) # prepare data. Needed to avoid error in `ddFixed` df <- within(df, { x1 <- x * (grp == 1) x2 <- x * (grp == 2) x3 <- x * (grp == 3) x4 <- x * (grp == 4) })
We left-censor the observations since we otherwise may end up with a low number of observations towards the end. We show a few properties of the sample
# how many die in each group xtabs(~ grp + y, df) # at what time do we have events? tmp <- aggregate(y ~ grp + as.integer(df$tstop), df, sum, subset = df$y == 1) tmp$time <- tmp$`as.integer(df$tstop)` + 1L plot(c(1, n_periods), range(tmp$y, 0), type = "n", xlab = "time", ylab = "# events") for(i in 1:4){ dat <- subset(tmp, grp == i) x <- dat$time + .2 * (i - 2.5) segments(x, rep(0, length(x)), x, dat$y, col = i) }
We can fit a model without the latent variables (i.e., a constant coefficient model) as follows
surv_fit <- survreg(Surv(tstop - tstart, y) ~ x1 + x2 + x3 + x4, df, dist = "exponential") summary(surv_fit) # signs are flipped logLik(surv_fit)
The signs are flipped as stated in help("survreg")
. We can compare the
log-likelihood with this models with the log-likelihood approximation we get
from the particle filter in the next section.
We use the generalized two-filter smoother from @fearnhead10 were we estimate the full $F$ and $Q$ matrix.
n_threads <- 6 # you can replace this with e.g., # max(parallel::detectCores(logical = FALSE), 2))) # control object dd_ctrl <- PF_control( N_fw_n_bw = 200, N_smooth = 500, N_first = 1000, eps = 1e-4, method = "AUX_normal_approx_w_cloud_mean", n_max = 500, smoother = "Fearnhead_O_N", averaging_start = 150L, nu = 6L, covar_fac = 1.1, n_threads = n_threads)
set.seed(30520116) system.time(pf_Fear <- PF_EM( Surv(tstart, tstop, y) ~ ddFixed_intercept() + x1 + x2 + x3 + x4 + ddFixed(x1) + ddFixed(x2) + ddFixed(x3) + ddFixed(x4), df, Q_0 = diag(1, 4), Q = diag(1, 4), Fmat = diag(.1, 4), by = 1, type = "VAR", model = "exponential", max_T = n_periods, control = dd_ctrl))
system.time
is used to show the computation time. The estimates are
pf_Fear$Q show_covar <- function(Q){ cat("Standard deviations\n") print(sqrt(diag(Q))) cat("Lower correlation matrix\n") tmp <- cov2cor(Q) tmp[upper.tri(tmp, diag = TRUE)] <- NA_real_ print(tmp[-1, -ncol(tmp)], na.print = "") } show_covar(pf_Fear$Q) show_covar(Q) # the true values pf_Fear$F pf_Fear$fixed_effects # beta
We can plot the smoothed state estimates as follows
tmp <- t(t(alphas) - beta[-1])[-1, ] par(mfcol = c(2, 2), mar = c(5, 4, 1, 1)) for(i in 1:4){ plot(pf_Fear, qlvls = c(0.025, 0.975), ylim = range(tmp), cov_index = i, col = i) points(1:nrow(tmp), tmp[, i], pch = 16, col = i) }
The crosses are the point-wise confidence bounds, the lines are the smoothed means, and the full dots are the actual values. The approximate log-likelihood at each iteration of the EM algorithm are
logLik(pf_Fear) plot(pf_Fear$log_likes, type = "l", ylab = "log-likelihood") # last elements plot(tail(pf_Fear$log_likes, 50), type = "l", ylab = "log-likelihood")
We fit the model where we only estimate $\vec{\theta}$, $\vec{\psi}$, and $\vec{\phi}$ as follows
set.seed(30520116) pf_Fear_restric <- PF_EM( Surv(tstart, tstop, y) ~ ddFixed_intercept() + x1 + x2 + x3 + x4 + ddFixed(x1) + ddFixed(x2) + ddFixed(x3) + ddFixed(x4), df, Q_0 = diag(1, 4), G = G, J = J, theta = c(.1, .1, 0), K = K, psi = c(0, 0), phi = 0, by = 1, type = "VAR", model = "exponential", max_T = n_periods, control = dd_ctrl)
Notice that we pass G
, J
, K
, theta
, psi
, and phi
instead of Q
and Fmat
.
The estimates are
pf_Fear_restric$Q show_covar(pf_Fear_restric$Q) pf_Fear_restric$F pf_Fear_restric$fixed_effects # beta
The smoothed state estimates looks as follows
tmp <- t(t(alphas) - beta[-1])[-1, ] par(mfcol = c(2, 2), mar = c(5, 4, 1, 1)) for(i in 1:4){ plot(pf_Fear_restric, qlvls = c(0.025, 0.975), ylim = range(tmp), cov_index = i, col = i) points(1:nrow(tmp), tmp[, i], pch = 16, col = i) }
The approximate log-likelihood at each iteration of the EM algorithm are
logLik(pf_Fear_restric) plot(pf_Fear_restric$log_likes, type = "l", ylab = "log-likelihood") # last elements plot(tail(pf_Fear_restric$log_likes, 50), type = "l", ylab = "log-likelihood")
We can get a better estimate of the log-likelihood at the final estimate for the two models by running the particle filter with more particles as follows
old_dig <- getOption("digits") options(digits = 7)
logLik(PF_forward_filter(pf_Fear , N_fw = 10000, N_first = 10000)) logLik(PF_forward_filter(pf_Fear_restric, N_fw = 10000, N_first = 10000))
options(digits = old_dig)
We expect the former to have a higher log-likelihood since we have $4^2 + 4(4 + 1) / 2 - 2 \cdot 3 = 20$ more parameters in the former model.
Let suppose that we think an AR(1) process is appropriate for each state variable and the state variables are independent. That is, we want to estimate
$$ F = \begin{pmatrix} \theta_1 & . & . & . \ . & \theta_2 & . & . \ . & . & \theta_3 & . \ . & . & . & \theta_4 \end{pmatrix}, \quad Q = \begin{pmatrix} \exp(2\psi_1) & . & . & . \ . & \exp(2\psi_2) & . & . \ . & . & \exp(2\psi_3) & . \ . & . & . & \exp(2\psi_4) \end{pmatrix} $$ We can estimate such a model by making the following calls
# setup G and J G_miss <- matrix(0., 4^2, 4) for(i in 0:3) G_miss[1 + i * 4 + i, i + 1] <- 1 J_miss <- diag(4) K_miss <- matrix(nrow = 4 * (4 - 1) / 2, ncol = 0) # estimate set.seed(30520116) pf_Fear_miss <- PF_EM( Surv(tstart, tstop, y) ~ ddFixed_intercept() + x1 + x2 + x3 + x4 + ddFixed(x1) + ddFixed(x2) + ddFixed(x3) + ddFixed(x4), df, Q_0 = diag(1, 4), G = G_miss, J = J_miss, K = K_miss, theta = rep(.1, 4), psi = rep(0, 4), phi = vector(), by = 1, type = "VAR", model = "exponential", max_T = n_periods, control = dd_ctrl)
The estimates from this model are
pf_Fear_miss$Q show_covar(pf_Fear_miss$Q) pf_Fear_miss$F pf_Fear_miss$fixed_effects logLik(pf_Fear_miss)
An approximation of the log-likelihood is
old_dig <- getOption("digits") options(digits = 7)
logLik(PF_forward_filter(pf_Fear_miss, N_fw = 10000, N_first = 10000))
options(digits = old_dig)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.