# genTDCM: Generating data from a Cox model with time-dependent... In genSurv: Generating Multi-State Survival Data

## Description

Generating data from a Cox model with time-dependent covariates.

## Usage

 `1` ```genTDCM(n, dist, corr, dist.par, model.cens, cens.par, beta, lambda) ```

## Arguments

 `n` Sample size. `dist` Bivariate distribution assumed for generating the two covariates (time-fixed and time-dependent). Possible bivariate distributions are "exponential" and "weibull" (see details below). `corr` Correlation parameter. Possible values for the bivariate exponential distribution are between -1 and 1 (0 for independency). Any value between 0 (not included) and 1 (1 for independency) is accepted for the bivariate weibull distribution. `dist.par` Vector of parameters for the allowed distributions. Two (scale) parameters for the bivariate exponential distribution and four (2 shape parameters and 2 scale parameters) for the bivariate weibull distribution: (shape1, scale1, shape2, scale2). See details below. `model.cens` Model for censorship. Possible values are "uniform" and "exponential". `cens.par` Parameter for the censorship distribution. Must be greater than 0. `beta` Vector of two regression parameters for the two covariates. `lambda` Parameter for an exponential distribution. An exponential distribution is assumed for the baseline hazard function.

## Details

The bivariate exponential distribution, also known as Farlie-Gumbel-Morgenstern distribution is given by

F(x,y)=F_1(x)F_2(y)[1+α(1-F_1(x))(1-F_2(y))]

for x≥0 and y≥0. Where the marginal distribution functions F_1 and F_2 are exponential with scale parameters θ_1 and θ_2 and correlation parameter α, -1 ≤ α ≤ 1.

The bivariate Weibull distribution with two-parameter marginal distributions. It's survival function is given by

S(x,y)=P(X>x,Y>y)=exp^(-[(x/θ_1)^(β_1/δ)+(y/θ_2)^(β_2/δ)]^δ)

Where 0 < δ ≤ 1 and each marginal distribution has shape parameter β_i and a scale parameter θ_i, i = 1, 2.

## Value

An object with two classes, `data.frame` and `TDCM`. To accommodate time-dependent effects, we used a counting process data-structure, introduced by Andersen and Gill (1982). In this data-structure, apart the time-fixed covariates (named `covariate`), an individual's survival data is expressed by three variables: `start`, `stop` and `event`. Individuals without change in the time-dependent covariate (named `tdcov`) are represented by only one line of data, whereas patients with a change in the time-dependent covariate must be represented by two lines. For these patients, the first line represents the time period until the change in the time-dependent covariate; the second line represents the time period that passes from that change to the end of the follow-up. For each line of data, variables `start` and `stop` mark the time interval (start, stop) for the data, while event is an indicator variable taking on value 1 if there was a death at time stop, and 0 otherwise. More details about this data-structure can be found in papers by (Meira-Machado et al., 2009).

## Author(s)

Artur Araújo, Luís Meira Machado and Susana Faria

## References

Anderson, P.K., Gill, R.D. (1982). Cox's regression model for counting processes: a large sample study. Annals of Statistics, 10(4), 1100-1120. doi: 10.1214/aos/1176345976

Cox, D.R. (1972). Regression models and life tables. Journal of the Royal Statistical Society: Series B, 34(2), 187-202. doi: 10.1111/j.2517-6161.1972.tb00899.x

Johnson, M. E. (1987). Multivariate Statistical Simulation, John Wiley and Sons.

Johnson, N., Kotz, S. (1972). Distribution in statistics: continuous multivariate distributions, John Wiley and Sons.

Lu J., Bhattacharya G. (1990). Some new constructions of bivariate weibull models. Annals of Institute of Statistical Mathematics, 42(3), 543-559. doi: 10.1007/BF00049307

Meira-Machado, L., Cadarso-Suárez, C., De Uña- Álvarez, J., Andersen, P.K. (2009). Multi-state models for the analysis of time to event data. Statistical Methods in Medical Research, 18(2), 195-222. doi: 10.1177/0962280208092301

Meira-Machado L., Faria S. (2014). A simulation study comparing modeling approaches in an illness-death multi-state model. Communications in Statistics - Simulation and Computation, 43(5), 929-946. doi: 10.1080/03610918.2012.718841

Meira-Machado, L., Sestelo M. (2019). Estimation in the progressive illness-death model: a nonexhaustive review. Biometrical Journal, 61(2), 245–263. doi: 10.1002/bimj.201700200

Therneau, T.M., Grambsch, P.M. (2000). Modelling survival data: Extending the Cox Model, New York: Springer.

`genCMM`, `genCPHM`, `genTHMM`.
 ``` 1 2 3 4 5 6 7 8 9 10 11 12``` ```tdcmdata <- genTDCM(n=1000, dist="weibull", corr=0.8, dist.par=c(2,3,2,3), model.cens="uniform", cens.par=2.5, beta=c(-3.3,4), lambda=1) head(tdcmdata, n=20L) library(survival) fit1<-coxph(Surv(start,stop,event)~tdcov+covariate,data=tdcmdata) summary(fit1) tdcmdata2 <- genTDCM(n=1000, dist="exponential", corr=0, dist.par=c(1,1), model.cens="uniform", cens.par=1, beta=c(-3,2), lambda=0.5) head(tdcmdata2, n=20L) fit2<-coxph(Surv(start,stop,event)~tdcov+covariate,data=tdcmdata2) summary(fit2) ```