ddhazard: Fitting Dynamic Hazard Models
In dynamichazard: Dynamic Hazard Models using State Space Models

ddhazard

R Documentation

Fitting Dynamic Hazard Models

Description

Function to fit dynamic hazard models using state space models.

Usage

ddhazard(
  formula,
  data,
  model = "logit",
  by,
  max_T,
  id,
  a_0,
  Q_0,
  Q = Q_0,
  order = 1,
  weights,
  control = ddhazard_control(),
  verbose = FALSE
)

Arguments

`formula`	`coxph`-like formula with `Surv(tstart, tstop, event)` on the left-hand side of `~`.
`data`	`data.frame` or environment containing the outcome and covariates.
`model`	`"logit"`, `"cloglog"`, or `"exponential"` for respectively the logistic link function with discrete outcomes, the inverse cloglog link function with discrete outcomes, or for the continuous time model with piecewise constant exponentially distributed arrival times.
`by`	interval length of the bins in which parameters are fixed.
`max_T`	end of the last interval.
`id`	vector of ids for each row of the design matrix.
`a_0`	vector a_0 for the initial coefficient vector for the first iteration (optional). Default is estimates from static model (see `static_glm`).
`Q_0`	covariance matrix for the prior distribution.
`Q`	initial covariance matrix for the state equation.
`order`	order of the random walk.
`weights`	weights to use if e.g. a skewed sample is used.
`control`	list of control variables from `ddhazard_control`.
`verbose`	`TRUE` if you want status messages during execution.

Details

This function can be used to estimate survival models where the regression parameters follow a random walk of a given order. The order is specified by the order argument. First- and second-order random walks are implemented. The regression parameters are updated at times by, 2by, ..., max_T. See vignette("ddhazard", "dynamichazard") for details.
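To make the update grid and the random-walk assumption concrete, the following base-R sketch simulates one coefficient path under a first-order and a second-order random walk. The values of by, max_T, and the innovation standard deviation sigma are made up for illustration, both paths start from zero, and the second-order recursion is the standard textbook form rather than code taken from the package.

```r
set.seed(1)
by <- 50
max_T <- 3600
times <- seq(by, max_T, by = by)  # update times by, 2by, ..., max_T
d <- length(times)                # number of bins

sigma <- 0.05                     # assumed innovation standard deviation
eps <- rnorm(d, sd = sigma)

# first-order random walk: beta_t = beta_{t-1} + eps_t
beta_1 <- cumsum(eps)

# second-order random walk: beta_t = 2 * beta_{t-1} - beta_{t-2} + eps_t
beta_2 <- numeric(d)
prev <- 0
prev2 <- 0
for (t in seq_len(d)) {
  beta_2[t] <- 2 * prev - prev2 + eps[t]
  prev2 <- prev
  prev <- beta_2[t]
}
```

Plotting beta_1 and beta_2 against times shows the typical difference: the second-order path is smoother because the noise enters through the change in slope rather than the change in level.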

All filter methods need a state covariance matrix Q_0 and a state vector a_0. An estimate from a time-invariant model is used for a_0 if it is not supplied (the same model you would get from static_glm). A diagonal matrix with large entries is recommended for Q_0. What counts as large depends on the data set and model. Further, an initial value of the state-equation covariance matrix Q is needed for the first iteration. Q and a_0 are estimated with an EM-algorithm.
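The dimensions of Q_0 and Q depend on the order of the random walk. The base-R sketch below assumes p = 2 regression coefficients (an intercept and one slope) and reuses the diagonal values from the second-order call in the Examples section, where Q_0 is diag(1, 4) and Q is diag(1e-4, 2). The matrix F_sketch is the standard companion form of a second-order random walk; it is only assumed to correspond to what the fitted object stores in F_.

```r
p <- 2       # number of regression coefficients (assumed for illustration)
order <- 2   # order of the random walk

# prior covariance of the full state vector, which stacks the current
# and the previous coefficient vector when order = 2
Q_0 <- diag(1, p * order)

# starting value for the covariance of the random-walk innovations
Q <- diag(1e-4, p)

# textbook transition matrix: identity for order 1, companion form for order 2
F_sketch <- if (order == 1) {
  diag(p)
} else {
  rbind(cbind(2 * diag(p), -diag(p)),
        cbind(diag(p), matrix(0, p, p)))
}
```

With order = 1 the state vector holds only the current coefficients, so both Q_0 and Q are p x p, as in the first-order example call.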

The model is specified through the model argument. In the discrete outcome models, outcomes are binned into the intervals. Be aware that there can be a "loss" of information due to binning if outcomes are not discrete to start with. It is key for these models that the id argument is provided if individuals in the data set have time-varying covariates. The exponential model uses a piecewise constant exponential distribution for the arrival times, so there is no "loss" of information due to binning. However, one of the assumptions of this model is not satisfied if outcomes are only observed in discrete time intervals.
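The three model choices turn the linear predictor into an event probability within a bin in different ways. The base-R sketch below writes out the usual textbook forms. The function names are hypothetical and are not the package's internals, and the exponential case assumes an individual who is at risk for a whole bin of length by with a hazard of exp(eta) per unit of time.

```r
# eta is the linear predictor (covariates times coefficients) in a given bin

# "logit": inverse logistic link
p_logit <- function(eta) 1 / (1 + exp(-eta))

# "cloglog": inverse complementary log-log link
p_cloglog <- function(eta) 1 - exp(-exp(eta))

# "exponential": constant hazard exp(eta) within the bin, so the
# probability of an event during a bin of length by is
p_exponential <- function(eta, by) 1 - exp(-exp(eta) * by)
```

Under these forms the cloglog probability coincides with the exponential probability for a bin of unit length, which is one way to see how the two models are related.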

It is recommended to see the Shiny app demo for this function by calling ddhazard_app().

Value

A list with class ddhazard. The list contains

`formula`	the passed formula.
`call`	the matched call.
`state_vecs`	2D matrix with the estimated state vectors (regression parameters) in each bin.
`state_vars`	3D array with smoothed variance estimates for each state vector.
`lag_one_cov`	3D array with lagged correlation matrix for each change in the state vector. Only present when the model is logit and the method is EKF.
`n_risk`	the number of observations in each interval.
`times`	the interval borders.
`risk_set`	the object from `get_risk_obj` if saved.
`data`	the `data` argument if saved.
`weights`	`weights` used in estimation if saved.
`id`	ids used to match rows in `data` to individuals.
`order`	order of the random walk.
`F_`	matrix which map from one state vector to the next.
`method`	method used in the E-step.
`est_Q_0`	`TRUE` if `Q_0` was estimated in the EM-algorithm.
`family`	Rcpp `Module` with C++ functions used for estimation given the `model` argument.
`discrete_hazard_func`	the hazard function corresponding to the `model` argument.
`terms`	the `terms` object used.
`has_fixed_intercept`	`TRUE` if the model has a time-invariant intercept.
`xlev`	a record of the levels of the factors used in fitting.
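
The elements listed above can be inspected directly on a fitted object. The sketch below reuses the first-order call from the Examples section and only accesses elements named in this list; it assumes the dynamichazard package is installed and that the pbc data and Surv are available after loading it, as the documented examples do.

```r
library(dynamichazard)

fit <- ddhazard(
  Surv(time, status == 2) ~ log(bili), pbc, id = pbc$id, max_T = 3600,
  Q_0 = diag(1, 2), Q = diag(1e-4, 2), by = 50,
  control = ddhazard_control(method = "GMA"))

class(fit)            # class of the returned list
dim(fit$state_vecs)   # 2D matrix of estimated state vectors
dim(fit$state_vars)   # 3D array of smoothed variance estimates
range(fit$times)      # first and last interval border
fit$order             # order of the random walk
fit$method            # method used in the E-step
```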

References

Fahrmeir, Ludwig. Dynamic modelling and penalized likelihood estimation for discrete time survival data. Biometrika 81.2 (1994): 317-330.

Durbin, James, and Siem Jan Koopman. Time series analysis by state space methods. No. 38. Oxford University Press, 2012.

Christoffersen, Benjamin. dynamichazard: Dynamic Hazard Models Using State Space Models. Journal of Statistical Software 99.7 (2021): 1-38.

Examples

# example with first order model
library(dynamichazard)
fit <- ddhazard(
 Surv(time, status == 2) ~ log(bili), pbc, id = pbc$id, max_T = 3600,
 Q_0 = diag(1, 2), Q = diag(1e-4, 2), by = 50,
 control = ddhazard_control(method = "GMA"))
plot(fit)

# example with second order model
# (Q_0 is 4 x 4 here while Q stays 2 x 2 for the two coefficients)
fit <- ddhazard(
 Surv(time, status == 2) ~ log(bili), pbc, id = pbc$id, max_T = 3600,
 Q_0 = diag(1, 4), Q = diag(1e-4, 2), by = 50,
 control = ddhazard_control(method = "GMA"),
 order = 2)
plot(fit)

dynamichazard documentation built on Oct. 6, 2022, 1:08 a.m.