simu: Function to generate simulated data used in the manuscript.


View source: R/simu.R

Description

This function generates simulated data under various settings. Let Z be a p-dimensional vector of possibly time-dependent covariates and β be the vector of regression coefficients. The survival times (T) are generated from the hazard functions specified as follows:

Scenario 1.1

Proportional hazards model:

λ(t|Z) = λ_0(t) e^{-0.5 Z_1 + 0.5 Z_2 - 0.5 Z_3 ... + 0.5 Z_{10}},

where λ_0(t) = 2t.

Scenario 1.2

Proportional hazards model with noise variable:

λ(t|Z) = λ_0(t) e^{2Z_1 + 2Z_2 + 0Z_3 + ... + 0Z_{10}},

where λ_0(t) = 2t.

Scenario 1.3

Proportional hazards model with nonlinear covariate effects:

λ(t|Z) = λ_0(t) e^{[2\sin(2π Z_1) + 2|Z_2 - 0.5|]},

where λ_0(t) = 2t.
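Scenarios 1.1 to 1.3 share the baseline hazard λ_0(t) = 2t, so the cumulative hazard is Λ(t|Z) = t^2 e^{η}, where η is the predictor in the exponent, and survival times can be drawn by inverse transform as T = sqrt(-log(U) / e^{η}) with U uniform on (0, 1). The following is a minimal base-R sketch of that derivation, not the package's source code. The helper name `sim_ph` is hypothetical, and the Uniform(0, 1) covariate distribution is an assumption, since the covariate distribution is not stated on this page.

```r
# Inverse-transform sampling for lambda(t | Z) = 2 * t * exp(eta).
# Cumulative hazard: Lambda(t | Z) = t^2 * exp(eta), hence
# S(t | Z) = exp(-t^2 * exp(eta)) and T = sqrt(-log(U) / exp(eta)).
sim_ph <- function(eta) {
  u <- runif(length(eta))
  sqrt(-log(u) / exp(eta))
}

set.seed(1)
n <- 1000
# Assumption: covariates are i.i.d. Uniform(0, 1).
Z <- matrix(runif(n * 10), n, 10)

beta11 <- rep(c(-0.5, 0.5), 5)   # Scenario 1.1: alternating -0.5 / +0.5
beta12 <- c(2, 2, rep(0, 8))     # Scenario 1.2: Z_3, ..., Z_10 are noise
eta13  <- 2 * sin(2 * pi * Z[, 1]) + 2 * abs(Z[, 2] - 0.5)  # Scenario 1.3

T11 <- sim_ph(drop(Z %*% beta11))
T12 <- sim_ph(drop(Z %*% beta12))
T13 <- sim_ph(eta13)
```

If the derivation is right, T^2 e^{η} follows a unit exponential distribution, which gives a quick sanity check on simulated values.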

Scenario 1.4

Accelerated failure time model:

\log(T) = -2 + 2Z_1 + 2Z_2 + ε,

where ε follows N(0, 0.5^2).
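Under the accelerated failure time model, survival times follow directly by exponentiating the log-linear equation. A minimal sketch, assuming (as an illustration only) that Z_1 and Z_2 are independent Uniform(0, 1) draws:

```r
set.seed(1)
n <- 1000
# Assumption: Z_1 and Z_2 are i.i.d. Uniform(0, 1).
z1 <- runif(n)
z2 <- runif(n)
eps <- rnorm(n, mean = 0, sd = 0.5)   # N(0, 0.5^2): the standard deviation is 0.5
logT <- -2 + 2 * z1 + 2 * z2 + eps
T14 <- exp(logT)                       # survival times on the original scale
```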

Scenario 1.5

Generalized gamma family:

T = e^{σω},

where ω = \log(Q^2 g) / Q, g follows Gamma(Q^{-2}, 1), σ = 2Z_1, Q = 2Z_2.
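The generalized gamma construction can be sketched in a few lines of base R. Here Gamma(Q^{-2}, 1) is read as shape Q^{-2} and rate 1, and the Uniform(0, 1) covariates are an assumption made only for illustration:

```r
set.seed(1)
n <- 1000
# Assumption: Z_1 and Z_2 are i.i.d. Uniform(0, 1).
z1 <- runif(n)
z2 <- runif(n)
sigma <- 2 * z1
Q <- 2 * z2
g <- rgamma(n, shape = Q^(-2), rate = 1)   # g ~ Gamma(Q^-2, 1)
omega <- log(Q^2 * g) / Q
T15 <- exp(sigma * omega)
```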

Scenario 2.1

Dichotomous time-dependent covariate with at most one change in value:

λ(t|Z(t)) = λ_0(t)e^{2Z_1(t) + 2Z_2},

where Z_1(t) is the time-dependent covariate: Z_1(t) = θ I(t ≥ U_0) + (1 - θ) I(t < U_0), θ is a Bernoulli variable taking the values 0 and 1 with equal probability, and U_0 follows a uniform distribution over [0, 1].

Scenario 2.2

Dichotomous time-dependent covariate with multiple changes:

λ(t|Z(t)) = e^{2Z_1(t) + 2Z_2},

where Z_1(t) = θ[I(U_1 ≤ t < U_2) + I(U_3 ≤ t)] + (1 - θ)[I(t < U_1) + I(U_2 ≤ t < U_3)], θ is a Bernoulli variable taking the values 0 and 1 with equal probability, and U_1 ≤ U_2 ≤ U_3 are the first three event times of a stationary Poisson process with rate 10.
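Because Z_1(t) is a step function, the hazard in Scenario 2.2 is piecewise constant, and a survival time can be obtained by accumulating hazard interval by interval until it reaches a unit-exponential draw. The sketch below illustrates this idea and is not taken from the package. The helper names `invert_pw` and `sim_scenario22` are hypothetical, and the hazard is used exactly as printed above, with no additional baseline term.

```r
# Invert a piecewise-constant cumulative hazard.
# breaks: interval start points (the first must be 0);
# haz:    hazard on each interval, the last interval extending to infinity;
# target: the cumulative hazard value to reach (an Exp(1) draw).
invert_pw <- function(breaks, haz, target) {
  widths <- diff(breaks)
  # Cumulative hazard accrued at each break point.
  cum <- c(0, cumsum(haz[seq_along(widths)] * widths))
  j <- max(which(cum <= target))           # interval in which the event falls
  breaks[j] + (target - cum[j]) / haz[j]
}

sim_scenario22 <- function(z2) {
  theta <- rbinom(1, 1, 0.5)
  U <- cumsum(rexp(3, rate = 10))          # first three event times, rate 10
  # Value of Z_1(t) on [0, U1), [U1, U2), [U2, U3), [U3, Inf).
  z1 <- if (theta == 1) c(0, 1, 0, 1) else c(1, 0, 1, 0)
  invert_pw(c(0, U), exp(2 * z1 + 2 * z2), rexp(1))
}

set.seed(1)
T22 <- replicate(200, sim_scenario22(z2 = 0.5))
```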

Scenario 2.3

Proportional hazards model with a continuous time-dependent covariate:

λ(t|Z(t)) = 0.1 e^{Z_1(t) + Z_2},

where Z_1(t) = kt + b, k and b are independent uniform random variables over [1, 2].
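With Z_1(t) = kt + b, the cumulative hazard in Scenario 2.3 has the closed form Λ(t) = 0.1 e^{b + Z_2} (e^{kt} - 1) / k, so setting Λ(T) = E for a unit-exponential E gives T = log(1 + kE / (0.1 e^{b + Z_2})) / k. A minimal sketch of this inversion, assuming for illustration that Z_2 is Uniform(0, 1):

```r
set.seed(1)
n <- 1000
k <- runif(n, 1, 2)
b <- runif(n, 1, 2)
z2 <- runif(n)                 # assumption: Z_2 ~ Uniform(0, 1)
E <- rexp(n)                   # unit-exponential targets for Lambda(T)

# Solve 0.1 * exp(b + z2) * (exp(k * T) - 1) / k = E for T.
T23 <- log1p(k * E / (0.1 * exp(b + z2))) / k
```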

Scenario 2.4

Non-proportional hazards model with a continuous time-dependent covariate:

λ(t|Z(t)) = 0.1 \cdot[1 + \sin\{Z_1(t) + Z_2\}],

where Z_1(t) = kt + b, k and b follow independent uniform distributions over [1, 2].
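For Scenario 2.4 the cumulative hazard is Λ(t) = 0.1 [t + {cos(b + Z_2) - cos(kt + b + Z_2)} / k], which has no closed-form inverse, so one option is to solve Λ(T) = E numerically. The sketch below uses `uniroot` for this. The helper names are hypothetical, and this is an illustration of the technique rather than the package's implementation.

```r
# Cumulative hazard for lambda(t) = 0.1 * (1 + sin(k * t + b + z2)).
cumhaz24 <- function(t, k, b, z2) {
  0.1 * (t + (cos(b + z2) - cos(k * t + b + z2)) / k)
}

# Solve cumhaz24(T) = target, where target defaults to an Exp(1) draw.
sim_scenario24 <- function(k, b, z2, target = rexp(1)) {
  # Lambda(t) >= 0.1 * (t - 2 / k), so the root lies below this bound.
  upper <- 10 * target + 2 / k + 1
  uniroot(function(t) cumhaz24(t, k, b, z2) - target,
          lower = 0, upper = upper, tol = 1e-10)$root
}

set.seed(1)
T24 <- sim_scenario24(k = runif(1, 1, 2), b = runif(1, 1, 2), z2 = 0.3)
```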

Scenario 2.5

Non-proportional hazards model with a nonlinear time-dependent covariate:

λ(t|Z(t)) = 0.1 \cdot[1 + \sin\{Z_1(t) + Z_2\}],

where Z_1(t) = 2kt\cdot \{I(t > 5) - 1\} + b, k and b follow independent uniform distributions over [1, 2].

The censoring times are generated from an independent uniform distribution over [0, c], where c is tuned to yield censoring percentages of 25% or 50%.
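One way to tune c can be sketched as follows. For a censoring time C uniform on [0, c] and independent of T, the censoring probability is P(C < T) = E[min(T, c)] / c, which decreases continuously in c, so a root finder can match it to a target rate. This is an illustrative approach, not necessarily how the package tunes c. The helper name `tune_c` is hypothetical, and the survival times here come from the baseline hazard λ_0(t) = 2t without covariates.

```r
# Find cmax such that Uniform(0, cmax) censoring yields the target rate,
# using P(C < T) = E[min(T, cmax)] / cmax estimated from a sample of T.
tune_c <- function(T, target) {
  f <- function(cmax) mean(pmin(T, cmax)) / cmax - target
  # At cmax = min(T) the rate is 1; at 2 * mean(T) / target it is at most target / 2.
  uniroot(f, lower = min(T), upper = 2 * mean(T) / target, tol = 1e-10)$root
}

set.seed(1)
n <- 5000
T <- sqrt(rexp(n))                 # survival times under lambda_0(t) = 2t
c25 <- tune_c(T, target = 0.25)

C <- runif(n, 0, c25)              # censoring times
Y <- pmin(T, C)                    # observed follow-up time
death <- as.integer(T <= C)        # death = 0 if censored
```

The target must lie strictly between 0 and 1; a censoring level of 0% needs no tuning because no censoring times are drawn.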

Usage

simu(n, cen, scenario, summary = FALSE)

trueHaz(dat)

trueSurv(dat)

Arguments

n

an integer value indicating the number of subjects.

cen

a numeric value indicating the censoring percentage; three levels, 0%, 25%, and 50%, are allowed (the examples pass these as proportions, e.g. 0.25).

scenario

can be either a numeric value or a character string. This indicates the simulation scenario noted above.

summary

a logical value indicating whether a brief data summary will be printed.

dat

is a data.frame prepared by simu.

Value

simu returns a data.frame. The returned data.frame consists of columns:

id

is the subject id.

Y

is the observed follow-up time.

death

is the death indicator; death = 0 if censored.

z1–z10

are the (possibly time-independent) covariates.

k, b, U

are the latent variables used to generate Z_1(t) in Scenarios 2.1–2.5.

The returned data.frame can be supplied to trueHaz and trueSurv to generate the true cumulative hazard function and the true survival function, respectively.

Examples

set.seed(1)
simu(10, 0.25, 1.2, TRUE)

set.seed(1)
simu(10, 0.50, 2.2, TRUE)

rocTree documentation built on Aug. 1, 2020, 5:06 p.m.