# semi.markov.3states.ic: 3-State Semi-Markov Model With Interval-Censored Data In RISCA: Causal Inference and Prediction in Cohort-Based Analyses

## Description

The 3-state semi-Markov (SM) model includes an initial state (X=1), a transient state (X=2) and an absorbing state (X=3). Usually, X=1 corresponds to disease-free or remission, X=2 to relapse, and X=3 to death. In this illness-death model, the possible transitions are 1->2, 1->3 and 2->3. The time of transition from X=1 to X=2 is interval-censored. Parameters are estimated by (weighted) likelihood maximization.

## Usage

    semi.markov.3states.ic(times0, times1, times2, sequences, weights=NULL, dist,
      cuts.12=NULL, cuts.13=NULL, cuts.23=NULL,
      ini.dist.12=NULL, ini.dist.13=NULL, ini.dist.23=NULL,
      cov.12=NULL, init.cov.12=NULL, names.12=NULL,
      cov.13=NULL, init.cov.13=NULL, names.13=NULL,
      cov.23=NULL, init.cov.23=NULL, names.23=NULL,
      conf.int=TRUE, silent=TRUE, precision=10^(-6),
      legendre=30, homogeneous=TRUE)

## Arguments

| Argument | Description |
|---|---|
| `times0` | A numeric vector with the observed times in days from baseline to the last observation time in X=1. |
| `times1` | A numeric vector with the observed times in days from baseline to the first observation time in X=2. `NA` for individuals right-censored in X=1 or individuals who are directly in X=3 after X=1 (without any observation in X=2). |
| `times2` | A numeric vector with the observed times in days from baseline to the last follow-up. |
| `sequences` | A numeric vector with the sequences of observed states. Four values are allowed: 1 (individual right-censored in X=1), 12 (individual right-censored in X=2), 13 (individual directly observed in X=3 after X=1, without any observation of X=2), 123 (individual who transited from X=1 to X=3 through X=2). |
| `weights` | A numeric vector with the weights for correcting the contribution of each individual. When the vector is supplied, the IPW estimator is implemented. Default is `NULL`, which means that no weighting is applied. |
| `dist` | A character vector with three arguments describing respectively the distributions of duration time for transitions 1->2, 1->3 and 2->3. Allowed values are "E" for the Exponential distribution, "PE" for the piecewise Exponential distribution, "W" for the Weibull distribution or "WG" for the Generalized Weibull distribution. When the user chooses "PE", the corresponding `cuts.XX` argument must also be defined. |
| `cuts.12` | A numeric vector indicating the timepoints in days for the piecewise Exponential distribution related to the time from X=1 to X=2. Only internal timepoints are allowed: timepoints cannot be 0 or Inf. Default is `NULL`, which means that the distribution is not piecewise. A piecewise model is only allowed for the Exponential distribution. |
| `cuts.13` | A numeric vector indicating the timepoints in days for the piecewise Exponential distribution related to the time from X=1 to X=3. Only internal timepoints are allowed: timepoints cannot be 0 or Inf. Default is `NULL`, which means that the distribution is not piecewise. A piecewise model is only allowed for the Exponential distribution. |
| `cuts.23` | A numeric vector indicating the timepoints in days for the piecewise Exponential distribution related to the time from X=2 to X=3. Only internal timepoints are allowed: timepoints cannot be 0 or Inf. Default is `NULL`, which means that the distribution is not piecewise. A piecewise model is only allowed for the Exponential distribution. |
| `ini.dist.12` | A numeric vector of initial values for the distribution from X=1 to X=2. The logarithms of the parameters have to be declared. Default value is 1. |
| `ini.dist.13` | A numeric vector of initial values for the distribution from X=1 to X=3. The logarithms of the parameters have to be declared. Default value is 1. |
| `ini.dist.23` | A numeric vector of initial values for the distribution from X=2 to X=3. The logarithms of the parameters have to be declared. Default value is 1. |
| `cov.12` | A matrix (or data frame) with the explanatory time-fixed variable(s) related to the time from X=1 to X=2. |
| `init.cov.12` | A numeric vector of initial values for the regression coefficients (logarithms of the cause-specific hazard ratios) associated with `cov.12`. Default initial value is 0. |
| `names.12` | An optional character vector with the names of the explanatory variables associated with `cov.12`. |
| `cov.13` | A numeric matrix (or data frame) with the explanatory time-fixed variable(s) related to the time from X=1 to X=3. |
| `init.cov.13` | A numeric vector of initial values for the regression coefficients (logarithms of the cause-specific hazard ratios) associated with `cov.13`. Default initial value is 0. |
| `names.13` | An optional character vector with the names of the explanatory variables associated with `cov.13`. |
| `cov.23` | A numeric matrix (or data frame) with the explanatory time-fixed variable(s) related to the time from X=2 to X=3. |
| `init.cov.23` | A numeric vector of initial values for the regression coefficients (logarithms of the cause-specific hazard ratios) associated with `cov.23`. Default initial value is 0. |
| `names.23` | An optional character vector with the names of the explanatory variables associated with `cov.23`. |
| `conf.int` | A logical value specifying whether the pointwise confidence intervals for parameters and the variance-covariance matrix should be returned. Default is `TRUE`. |
| `silent` | A logical value controlling whether the log-likelihood value is displayed at each iteration. Default is `TRUE`, which corresponds to silent mode (no display). |
| `precision` | A positive numeric value indicating the required precision for the log-likelihood maximization between each iteration. Default is 10^{-6}. |
| `legendre` | A numeric value indicating the number of knots and weights for the Gaussian quadrature used in convolution products. Default is 30. |
| `homogeneous` | A logical value specifying whether the time spent in the state X=1 is considered as non-associated with the distribution of the time from entry in the state X=2 to the transition to the state X=3. Default is `TRUE`, assuming no association. |
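The coding of `times0`, `times1`, `times2` and `sequences` described above can be illustrated on a toy data set. The following is a minimal sketch, not part of RISCA: the data frame `toy` and the helper `check_ic_data` are hypothetical names, and the ordering checks (`times0 <= times2`, and `times0 < times1 <= times2` when X=2 is observed) are assumptions derived from the argument descriptions rather than rules confirmed by the package.

```r
# Toy data: one individual per allowed sequence (times in days from baseline).
toy <- data.frame(
  times0    = c(730, 365, 1095, 365),   # last observation time in X=1
  times1    = c(NA, 730, NA, 730),      # first observation time in X=2 (NA if never seen in X=2)
  times2    = c(800, 1500, 1200, 2000), # last follow-up
  sequences = c(1, 12, 13, 123)
)

# Hypothetical helper: returns TRUE for each row whose coding is consistent
# with the argument descriptions (assumed rules, see the lead-in).
check_ic_data <- function(d) {
  allowed  <- d$sequences %in% c(1, 12, 13, 123)
  via2     <- d$sequences %in% c(12, 123)
  # times1 must be NA exactly when X=2 was never observed
  na_ok    <- is.na(d$times1) != via2
  # chronological ordering of the observation times
  order_ok <- d$times0 <= d$times2 &
    (!via2 | (d$times0 < d$times1 & d$times1 <= d$times2))
  allowed & na_ok & order_ok
}

check_ic_data(toy)
```

Running such a check before fitting can catch rows where, for instance, `times1` is missing for a sequence 12 or 123.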

## Details

Hazard functions available are:

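| Distribution | Hazard function |
|---|---|
| Exponential | $\lambda(t) = 1/\sigma$ |
| Weibull | $\lambda(t) = \nu \left(\frac{1}{\sigma}\right)^{\nu} t^{\nu-1}$ |
| Generalized Weibull | $\lambda(t) = \frac{1}{\theta} \left(1 + \left(\frac{t}{\sigma}\right)^{\nu}\right)^{\frac{1}{\theta}-1} \nu \left(\frac{1}{\sigma}\right)^{\nu} t^{\nu-1}$ |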

with σ, ν, and θ > 0. The parameter σ varies for each interval when the distribution is piecewise Exponential. The initial values supplied in ini.dist.12, ini.dist.13 and ini.dist.23 are the logarithms of these parameters.
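To make the parametrization concrete, the hazard functions listed above can be written directly in R. This is a minimal sketch: the function names (`haz_exp`, `haz_weibull`, `haz_gen_weibull`, `haz_pw_exp`) are hypothetical and are not exported by RISCA, and the piecewise version assumes intervals that are closed on the left, which the documentation does not specify.

```r
# Illustrative hazard functions following the formulas in this section.
haz_exp <- function(t, sigma) rep(1 / sigma, length(t))

haz_weibull <- function(t, sigma, nu) nu * (1 / sigma)^nu * t^(nu - 1)

haz_gen_weibull <- function(t, sigma, nu, theta) {
  (1 / theta) * (1 + (t / sigma)^nu)^(1 / theta - 1) *
    nu * (1 / sigma)^nu * t^(nu - 1)
}

# Piecewise Exponential: one sigma per interval delimited by the internal cuts
# (assumption: a time equal to a cut belongs to the interval starting there).
haz_pw_exp <- function(t, sigmas, cuts) {
  stopifnot(length(sigmas) == length(cuts) + 1)
  1 / sigmas[findInterval(t, cuts) + 1]
}

t <- c(100, 365, 1000)
haz_exp(t, sigma = 3750)
haz_weibull(t, sigma = 3750, nu = 1)                     # nu = 1 gives the Exponential hazard
haz_gen_weibull(t, sigma = 3750, nu = 1.2, theta = 1)    # theta = 1 gives the Weibull hazard
haz_pw_exp(t, sigmas = c(2000, 5000), cuts = 365)

# Initial values are passed on the log scale, e.g. for a guess of sigma = 3750 days:
ini_sigma <- log(3750)
```

The nesting is visible here: the Generalized Weibull reduces to the Weibull when θ = 1, and the Weibull reduces to the Exponential when ν = 1.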

To estimate the marginal effect of a binary exposure, the weights may be equal to 1/p, where p is the estimated probability that the individual belongs to his or her own observed group of exposure. The probabilities p are often estimated by a logistic regression in which the dependent binary variable is the exposure. The possible confounding factors are the explanatory variables of this logistic model.
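The weighting scheme described in this paragraph can be sketched on simulated data. The variables `age`, `sex` and `z` below are simulated for illustration only and do not come from the package data; the stabilized variant mirrors the computation used in the Examples section.

```r
set.seed(1)
n   <- 200
age <- rnorm(n, mean = 50, sd = 10)
sex <- rbinom(n, 1, 0.5)
# Simulated binary exposure depending on the two confounders
z   <- rbinom(n, 1, plogis(-0.5 + 0.02 * (age - 50) + 0.3 * sex))
d   <- data.frame(z, age, sex)

# Logistic regression with the exposure as the dependent variable
ps_model <- glm(z ~ age + sex, family = binomial(link = "logit"), data = d)
ps <- fitted(ps_model)                      # estimated P(z = 1 | age, sex)

# p: probability of belonging to one's own observed exposure group
p_own <- ifelse(d$z == 1, ps, 1 - ps)
d$w   <- 1 / p_own                          # weights 1/p

# Stabilized weights: numerator is the marginal prevalence of the own group
p1   <- mean(d$z)
d$sw <- ifelse(d$z == 1, p1, 1 - p1) / p_own

summary(d$w)
summary(d$sw)
```

Either vector could then be passed as the `weights` argument; inspecting the distribution of the weights first helps to spot extreme values.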

Two kinds of models can be estimated: homogeneous and non-homogeneous semi-Markov models. In the first one, the hazard functions only depend on the time spent in the corresponding state. Note that for the transitions from the state X=1, the time spent in the state corresponds to the chronological time from the baseline of the study, as for Markov models. In the second one, the hazard function of the transition from the state X=2 to X=3 depends on two time scales: the time spent in the state X=2, which is the random variable of interest, and the time spent in the state X=1, which enters as a covariate.

## Value

| Component | Description |
|---|---|
| `object` | The character string indicating the model: "semi.markov.3states.ic (3-state semi-markov model with interval-censored data)". |
| `dist` | A character vector with three arguments describing respectively the distributions of duration time for transitions 1->2, 1->3 and 2->3. |
| `cuts.12` | A numeric vector indicating the timepoints in days for the piecewise Exponential distribution related to the time from X=1 to X=2. |
| `cuts.13` | A numeric vector indicating the timepoints in days for the piecewise Exponential distribution related to the time from X=1 to X=3. |
| `cuts.23` | A numeric vector indicating the timepoints in days for the piecewise Exponential distribution related to the time from X=2 to X=3. |
| `covariates` | A numeric vector indicating the numbers of covariates respectively related to the transitions 1->2, 1->3 and 2->3. |
| `table` | A data frame containing the estimated parameters of the model (`Estimate`). When the option `conf.int=TRUE` is specified, this data frame includes three additional columns: the standard errors of the parameters (`Std.Error`), the value of the Wald statistic (`t.value`), and the related p-value for the Wald test (`Pr(>\|t\|)`). |
| `cov.matrix` | A data frame corresponding to the variance-covariance matrix of the parameters. |
| `LogLik` | A numeric value corresponding to the (weighted) log-likelihood of the model. |
| `AIC` | A numeric value corresponding to the Akaike Information Criterion of the model. |
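Because regression coefficients are reported as logarithms of cause-specific hazard ratios, the `table` component usually needs a back-transformation before interpretation. The sketch below uses a hand-made data frame `tab` with invented values and hypothetical row names; only the column names `Estimate` and `Std.Error` come from the description above. The p-value assumes a standard normal reference distribution for the Wald statistic, which is an assumption of this sketch and not a statement about the package internals.

```r
# Hypothetical estimates mimicking the layout of the `table` component
tab <- data.frame(
  Estimate  = c(8.10, 10.80, 10.50, 0.25),
  Std.Error = c(0.20, 0.45, 0.40, 0.12),
  row.names = c("log.sigma.12", "log.sigma.13", "log.sigma.23", "beta12_z")
)

# Wald statistic and two-sided p-value (normal approximation, assumed)
tab$t.value <- tab$Estimate / tab$Std.Error
tab$p.value <- 2 * (1 - pnorm(abs(tab$t.value)))

# Hazard ratio and 95% confidence interval for the covariate on transition 1->2
est <- tab["beta12_z", "Estimate"]
se  <- tab["beta12_z", "Std.Error"]
hr  <- c(HR    = exp(est),
         lower = exp(est - qnorm(0.975) * se),
         upper = exp(est + qnorm(0.975) * se))
hr

# Distribution parameters are on the log scale as well
sigma_12 <- exp(tab["log.sigma.12", "Estimate"])
```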

## Author(s)

Yohann Foucher <Yohann.Foucher@univ-nantes.fr>

Florence Gillaizeau <Florence.Gillaizeau@univ-nantes.fr>

## References

Gillaizeau et al. Inverse Probability Weighting to control confounding in an illness-death model for interval-censored data. Stat Med. 37(8):1245-1258, 2018. <doi: 10.1002/sim.7550>.

## Examples

    # The example is too long to compute for a submission on the CRAN
    # Remove the characters '#'

    # import the observed data (read the application in Gillaizeau et al. for more details)
    # X=1 corresponds to initial state with a functioning graft, X=2 to acute rejection episode,
    # X=3 to return to dialysis, X=4 to death with a functioning graft

    # data(dataDIVAT1)

    # A subgroup analysis to reduce the time needed for this example

    # dataDIVAT1$id<-c(1:nrow(dataDIVAT1))
    # set.seed(2)
    # d3<-dataDIVAT1[dataDIVAT1$id %in% sample(dataDIVAT1$id, 100, replace = FALSE),]

    # To illustrate the use of a 3-state model, returns to dialysis are right-censored

    # d3$trajectory[d3$trajectory==13]<-1
    # d3$trajectory[d3$trajectory==123]<-12
    # d3$trajectory[d3$trajectory==14]<-13
    # d3$trajectory[d3$trajectory==124]<-123
    # table(d3$trajectory)

    # X=2 is supposed to be interval-censored between 'times0' and 'times1' because
    # health examinations take place each year after inclusion

    # d3$times0<-NA
    # d3$times1<-NA
    # d3$times2<-NA

    # i<-d3$trajectory==1
    # d3$times0[i]<-trunc(d3$time1[i]/365.24)*365.24+1
    # d3$times1[i]<-NA
    # d3$times2[i]<-d3$time1[i]+1

    # i<-d3$trajectory==12
    # d3$times0[i]<-trunc(d3$time1[i]/365.24)*365.24+1
    # d3$times1[i]<-(trunc(d3$time1[i]/365.24)+1)*365.24
    # d3$times2[i]<-pmax(d3$time2[i], (trunc(d3$time1[i]/365.24)+2)*365.24)

    # i<-d3$trajectory==13
    # d3$times0[i]<-trunc(d3$time1[i]/365.24)*365.24+1
    # d3$times1[i]<-NA
    # d3$times2[i]<-d3$time1[i]

    # i<-d3$trajectory==123
    # d3$times0[i]<-trunc(d3$time1[i]/365.24)*365.24+1
    # d3$times1[i]<-(trunc(d3$time1[i]/365.24)+1)*365.24
    # d3$times2[i]<-pmax(d3$time2[i], (trunc(d3$time1[i]/365.24)+2)*365.24)

    # 3-state homogeneous semi-Markov model with interval-censored data
    # including one binary explanatory variable (z is 1 if delayed graft function and
    # 0 otherwise).
    # Estimation of the marginal effect of z on the transition from X=1 to X=2
    # by adjusting for 2 possible confounding factors (age and gender)
    # The precision and the number of quadrature knots (legendre) are reduced only
    # to save time in this example; prefer the default values.

    # propensity.score <- glm(z ~ ageR + sexR, family=binomial(link="logit"), data=d3)
    # d3$fit<-propensity.score$fitted.values

    # p1<-mean(d3$z)
    # d3$w <- p1/d3$fit
    # d3$w[d3$z==0]<-(1-p1)/(1-d3$fit[d3$z==0])

    # semi.markov.3states.ic(times0=d3$times0, times1=d3$times1,
    #   times2=d3$times2, sequences=d3$trajectory,
    #   weights=d3$w, dist=c("E","E","E"), cuts.12=NULL, cuts.13=NULL, cuts.23=NULL,
    #   ini.dist.12=c(8.23), ini.dist.13=c(10.92), ini.dist.23=c(10.67),
    #   cov.12=d3$z, init.cov.12=c(0.02), names.12=c("beta12_z"),
    #   conf.int=TRUE, silent=FALSE, precision=0.001, legendre=20)$table

RISCA documentation built on Nov. 19, 2020, 1:07 a.m.