knitr::opts_chunk$set(
  echo = TRUE, 
  fig.height = 4, fig.width = 7, dpi = 128,
  cache.path = "cache/restrictedVAR-cache/", fig.path = "fig/restrictedVAR-fig/", 
  error = FALSE)
options(digits = 4, scipen = 10, width = 70)
palette(
  c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", 
    "#D55E00", "#CC79A7"))

We will simulate and estimate a first order vector auto-regression model in this example using the particle filter and smoother. For details see this vignette which can also be found by calling vignette("Particle_filtering", package = "dynamichazard"). The models we are going to simulate from and estimate are of the form

$$ \begin{split} y_{it} &\sim g(\cdot\vert\eta_{it}) & \ \eta_{it} &= x_i\alpha_{th(i)} + \beta_0 + z_i\beta_{h(i)} + o_{it} \ \vec{\alpha}t &= F\vec{\alpha}{t - 1} + R\vec{\epsilon}_t & \quad \vec{\epsilon}_t \sim N(\vec{0}, Q) \ & & \quad \vec{\alpha}_0 \sim N(\vec{a}_0, Q_0) \end{split}, \qquad \begin{array}{l} i \in {1, \dots, n} \ t = 1, \dots, d \end{array} $$

where the $y_{it}$ is individual $i$'s indicator at time $t$ for whether he dies between time $(t - 1, t]$ and $h:\,{1,\dots n} \rightarrow {1,\dots,4}$ returns the group index individual $i$ belongs to. The indicators, $y_{it}$, are binomial distributed with the complementary log-log link function conditional on knowing the log time at risk, $o_{it}$, covariates, $x_t$ and $z_t$, and latent states, $\vec{\alpha}_1,\dots,\vec{\alpha}_d$. The total survival time of individual $i$ is $T_i$ which is piecewise constant exponentially distributed conditional on knowing the latent states. Further, we set $z_i = x_i$ so the states have a non-zero mean. The true values are

$$ F = \begin{pmatrix} \theta_1 & . & . & \theta_3 \ . & \theta_2 & . & . \ . & . & \theta_2 & . \ \theta_3 & . & . & \theta_1 \end{pmatrix}, \quad Q = \begin{pmatrix} 0.49 & . & 0.175 & . \ . & 0.49 & . & . \ 0.175 & . & 0.25 & . \ . & . & . & 0.25 \end{pmatrix}, \quad R = I_4 $$

$$ \vec{\beta} = (-6.5, -1, -0.5, 0.5, 1)^\top, \quad \vec{\theta} =(0.66, 0.8, 0.2)^\top $$

git_key <- system("git rev-parse --short HEAD", intern = TRUE)
git_bra <- system("git branch", intern = TRUE)
regexp <- "^(\\*\ )(.+)$"
git_bra <- git_bra[grepl(regexp, git_bra)]
git_bra <- gsub(regexp, "\\2", git_bra)

where $I_4$ is the four-dimensional identity matrix, $.$ is used instead of $0$ to put emphasis on the non-zero elements, and $\vec{a}_0$ and $Q_0$ are given by the invariant distribution. The unknown parameters to be estimated is everything but $Q_0$ and $R$ (since we fix $Q_0$ doing the estimation and we set $\vec{a}_0 = (0, 0,0,0)^\top$). This example is run on the git branch "r git_bra" with ID "r git_key". The code can be found on the Github site for the package. All functions which assignments are not shown and are not in the dynamichazard package can be found on the Github site.

We are going to estimate the parameters in an unrestricted model where we estimate the $Q$ and $F$ and a restricted model where we estimate $\vec{\theta}$, parameters for standard deviations which we denote by $\vec{\psi}$, and parameters for the correlations which we denote by $\vec{\phi}$. In the restricted case, we let

$$ \begin{align} \text{vec}(R^+F) &= G\vec{\theta} \ Q &= VCV = \begin{pmatrix} \sigma_1 & . & . & . \ . & \sigma_2 & . & . \ . & . & \sigma_3 & . \ . & . & . & \sigma_4 \end{pmatrix} \begin{pmatrix} 1 & \rho_{21} & \rho_{31} & \rho_{41} \ \rho_{21} & 1 & \rho_{32} & \rho_{42} \ \rho_{31} & \rho_{32} & 1 & \rho_{43} \ \rho_{41} & \rho_{42} & \rho_{43} & 1 \end{pmatrix} \begin{pmatrix} \sigma_1 & . & . & . \ . & \sigma_2 & . & . \ . & . & \sigma_3 & . \ . & . & . & \sigma_4 \end{pmatrix} \ \sigma_i &= \exp(s_i), \qquad (s_1,s_2,s_3,s_4)^\top = J (\psi_1, \psi_2)^\top \ \rho_{ij} &= \frac{2}{1 + \exp(-q_{ij})} - 1, \qquad (q_{21}, q_{31}, q_{41},q_{32}, q_{42},q_{43})^\top =K\phi_1 \end{align} $$

where $\text{vec}(\cdot)$ is the vectorization function and $G$, $J$, and $L$ are

$$ \begin{split} G &= \begin{pmatrix} 1 & . & . & . & . & . & . & . & . & . & . & . & . & . & . & 1 \ . & . & . & . & . & 1 & . & . & . & . & 1 & . & . & . & . & . \ . & . & . & 1 & . & . & . & . & . & . & . & . & 1 & . & . & . \end{pmatrix}^\top \ J &= \begin{pmatrix} 1 & 1 & . & . \ . & . & 1 & 1 \end{pmatrix}^\top \ K &= \begin{pmatrix} \cdot & 1 & \cdot& \cdot& \cdot & \cdot \end{pmatrix}^\top \end{split} $$

Simulation

library(dynamichazard)

We start by simulating the data. Feel free to skip this part as the specifications are given above. First we assign the parameters for the simulation

# assign G, theta, and F
G <- matrix(0., 4^2, 3)
idx <- 1 + (0:(4 - 1)) * 4 + 0:(4 - 1)
G[idx[-(2:3)], 1] <- 1
G[idx[  2:3 ], 2] <- 1
G[c(4, 13)   , 3] <- 1
theta <- c(.66, .8, .2)
(F. <- matrix(as.vector(G %*% theta), 4, 4))

# assign J, K, psi, phi, and Q
J <- matrix(0., 4, 2)
J[1:2, 1] <- J[3:4, 2] <- 1
psi <- log(sqrt(c(.49, .25)))
K <- matrix(0., 6, 1)
K[2, 1] <- 1
phi <- log(- (.5 + 1) / (.5 - 1))

V <- diag(exp(drop(J %*% psi)))
C <- diag(1, ncol(V))
C[lower.tri(C)] <- 2/(1 + exp(-drop(K %*% phi))) - 1
C[upper.tri(C)] <- t(C)[upper.tri(C)]
(Q <- V %*% C %*% V)

# assign Q_0 and beta
Q_0  <- get_Q_0(Q, F.)
beta <- c(-6.5, -1, -0.5, 0.5, 1)

get_Q_0 is a function to get the covariance matrix for the invariant distribution. Then we simulate and plot the latent states

set.seed(54432125)
n_periods <- 200
alphas <- matrix(nrow = n_periods + 1, ncol = 4)
alphas[1, ] <- rnorm(4) %*% chol(Q_0)
for(i in 1:n_periods + 1)
  alphas[i, ] <- F. %*% alphas[i - 1, ] + drop(rnorm(4) %*% chol(Q))

alphas <- t(t(alphas) + beta[-1])

# plot of latent variables
matplot(alphas, type = "l", lty = 1)
for(i in 1:4)
  abline(h = beta[i + 1], lty = 2, col = i)

We simulate the observations as follows

n_obs <- 1000
df <- sapply(1:n_obs, function(i){
  # find the group
  grp <- (i - 1L) %/% (n_obs / 4L) + 1L

  # left-censoring
  tstart <- max(0L, sample.int((n_periods - 1L) * 2L, 1) - n_periods + 1L)

  # covariates
  x <- runif(1, 0, 2)

  # outcome (stop time and event indicator)
  y <- FALSE
  for(tstop in (tstart + 1L):n_periods){
    fail_time <- rexp(1) / exp(beta[1] + x * alphas[tstop + 1L, grp])
    if(fail_time <= 1){
      y <- TRUE
      tstop <- tstop - 1L + fail_time
      break
    }
  }

  c(tstart = tstart, tstop = tstop, x = x, y = y, grp = grp)
})
df <- data.frame(t(df))
df$grp <- factor(df$grp)

# prepare data. Needed to avoid error in `ddFixed` 
df <- within(df, {
  x1 <- x * (grp == 1)
  x2 <- x * (grp == 2)
  x3 <- x * (grp == 3)
  x4 <- x * (grp == 4)
})

We left-censor the observations since we otherwise may end up with a low number of observations towards the end. We show a few properties of the sample

# how many die in each group
xtabs(~ grp + y, df)

# at what time do we have events?
tmp <- aggregate(y ~ grp + as.integer(df$tstop), df, sum, subset = df$y == 1)
tmp$time <- tmp$`as.integer(df$tstop)` + 1L
plot(c(1, n_periods), range(tmp$y, 0), type = "n", xlab = "time", 
     ylab = "# events")
for(i in 1:4){
  dat <- subset(tmp, grp == i)
  x <- dat$time + .2 * (i - 2.5)
  segments(x, rep(0, length(x)), x, dat$y, col = i)
}

Model without latent variables

We can fit a model without the latent variables (i.e., a constant coefficient model) as follows

surv_fit <- survreg(Surv(tstop - tstart, y) ~ x1 + x2 + x3 + x4, df, 
                    dist = "exponential")
summary(surv_fit) # signs are flipped
logLik(surv_fit)

The signs are flipped as stated in help("survreg"). We can compare the log-likelihood with this models with the log-likelihood approximation we get from the particle filter in the next section.

Particle filter and smoother

We use the generalized two-filter smoother from @fearnhead10 were we estimate the full $F$ and $Q$ matrix.

n_threads <- 6 

# you can replace this with e.g.,  
# max(parallel::detectCores(logical = FALSE), 2)))

# control object
dd_ctrl <- PF_control(
  N_fw_n_bw = 200, N_smooth = 500, N_first = 1000, eps = 1e-4,
  method = "AUX_normal_approx_w_cloud_mean", 
  n_max = 500, smoother = "Fearnhead_O_N", averaging_start = 150L,
  nu = 6L, covar_fac = 1.1, n_threads = n_threads)
set.seed(30520116)
system.time(pf_Fear <- PF_EM(
  Surv(tstart, tstop, y) ~ ddFixed_intercept() + x1 + x2 + x3 + x4 + 
    ddFixed(x1) + ddFixed(x2) + ddFixed(x3) + ddFixed(x4), 
  df, Q_0 = diag(1, 4), Q = diag(1, 4), Fmat = diag(.1, 4), 
  by = 1, type = "VAR", model = "exponential", max_T = n_periods,
  control = dd_ctrl))

system.time is used to show the computation time. The estimates are

pf_Fear$Q
show_covar <- function(Q){
  cat("Standard deviations\n")
  print(sqrt(diag(Q)))
  cat("Lower correlation matrix\n")
  tmp <- cov2cor(Q)
  tmp[upper.tri(tmp, diag = TRUE)] <- NA_real_
  print(tmp[-1, -ncol(tmp)], na.print = "")
}
show_covar(pf_Fear$Q)
show_covar(Q) # the true values

pf_Fear$F
pf_Fear$fixed_effects # beta

We can plot the smoothed state estimates as follows

tmp <- t(t(alphas) - beta[-1])[-1, ]
par(mfcol = c(2, 2), mar = c(5, 4, 1, 1))
for(i in 1:4){
  plot(pf_Fear, qlvls = c(0.025, 0.975), ylim = range(tmp),
       cov_index = i, col = i)
  points(1:nrow(tmp), tmp[, i], pch = 16, col = i)
}

The crosses are the point-wise confidence bounds, the lines are the smoothed means, and the full dots are the actual values. The approximate log-likelihood at each iteration of the EM algorithm are

logLik(pf_Fear)
plot(pf_Fear$log_likes, type = "l", ylab = "log-likelihood")
# last elements
plot(tail(pf_Fear$log_likes, 50), type = "l", ylab = "log-likelihood")

Restricted model

We fit the model where we only estimate $\vec{\theta}$, $\vec{\psi}$, and $\vec{\phi}$ as follows

set.seed(30520116)
pf_Fear_restric <- PF_EM(
  Surv(tstart, tstop, y) ~ ddFixed_intercept() + x1 + x2 + x3 + x4 + 
    ddFixed(x1) + ddFixed(x2) + ddFixed(x3) + ddFixed(x4), 
  df, Q_0 = diag(1, 4), G = G, J = J, theta = c(.1, .1, 0), K = K, 
  psi = c(0, 0), phi = 0,
  by = 1, type = "VAR", model = "exponential", max_T = n_periods,
  control = dd_ctrl)

Notice that we pass G, J, K, theta, psi, and phi instead of Q and Fmat. The estimates are

pf_Fear_restric$Q
show_covar(pf_Fear_restric$Q)

pf_Fear_restric$F
pf_Fear_restric$fixed_effects # beta

The smoothed state estimates looks as follows

tmp <- t(t(alphas) - beta[-1])[-1, ]
par(mfcol = c(2, 2), mar = c(5, 4, 1, 1))
for(i in 1:4){
  plot(pf_Fear_restric, qlvls = c(0.025, 0.975), ylim = range(tmp),
       cov_index = i, col = i)
  points(1:nrow(tmp), tmp[, i], pch = 16, col = i)
}

The approximate log-likelihood at each iteration of the EM algorithm are

logLik(pf_Fear_restric)
plot(pf_Fear_restric$log_likes, type = "l", ylab = "log-likelihood")
# last elements
plot(tail(pf_Fear_restric$log_likes, 50), type = "l", ylab = "log-likelihood")

We can get a better estimate of the log-likelihood at the final estimate for the two models by running the particle filter with more particles as follows

old_dig <- getOption("digits")
options(digits = 7)
logLik(PF_forward_filter(pf_Fear        , N_fw = 10000, N_first = 10000)) 
logLik(PF_forward_filter(pf_Fear_restric, N_fw = 10000, N_first = 10000))
options(digits = old_dig)

We expect the former to have a higher log-likelihood since we have $4^2 + 4(4 + 1) / 2 - 2 \cdot 3 = 20$ more parameters in the former model.

Misspecified model

Let suppose that we think an AR(1) process is appropriate for each state variable and the state variables are independent. That is, we want to estimate

$$ F = \begin{pmatrix} \theta_1 & . & . & . \ . & \theta_2 & . & . \ . & . & \theta_3 & . \ . & . & . & \theta_4 \end{pmatrix}, \quad Q = \begin{pmatrix} \exp(2\psi_1) & . & . & . \ . & \exp(2\psi_2) & . & . \ . & . & \exp(2\psi_3) & . \ . & . & . & \exp(2\psi_4) \end{pmatrix} $$ We can estimate such a model by making the following calls

# setup G and J
G_miss <- matrix(0., 4^2, 4)
for(i in 0:3)
  G_miss[1 + i * 4 + i, i + 1] <- 1
J_miss <- diag(4)
K_miss <- matrix(nrow = 4 * (4 - 1) / 2, ncol = 0)

# estimate
set.seed(30520116)
pf_Fear_miss <- PF_EM(
  Surv(tstart, tstop, y) ~ ddFixed_intercept() + x1 + x2 + x3 + x4 + 
    ddFixed(x1) + ddFixed(x2) + ddFixed(x3) + ddFixed(x4), 
  df, Q_0 = diag(1, 4), G = G_miss, J = J_miss, K = K_miss, 
  theta = rep(.1, 4), psi = rep(0, 4), phi = vector(),
  by = 1, type = "VAR", model = "exponential", max_T = n_periods,
  control = dd_ctrl)

The estimates from this model are

pf_Fear_miss$Q
show_covar(pf_Fear_miss$Q)
pf_Fear_miss$F
pf_Fear_miss$fixed_effects
logLik(pf_Fear_miss)

An approximation of the log-likelihood is

old_dig <- getOption("digits")
options(digits = 7)
logLik(PF_forward_filter(pf_Fear_miss, N_fw = 10000, N_first = 10000)) 
options(digits = old_dig)

References



boennecd/dynamichazard documentation built on Oct. 11, 2022, 2:41 p.m.