penPHcure.simulate: Simulation of a PH cure model with time-varying covariates

View source: R/penPHcure.simulate.R

Description

This function simulates data from a proportional hazards (PH) cure model with time-varying covariates.

Usage

penPHcure.simulate(
  N = 500,
  S = seq(0.1, 5, by = 0.1),
  b0 = c(1.2, -1, 0, 1, 0),
  beta0 = c(1, 0, -1, 0),
  gamma = 1,
  lambdaC = 1,
  mean_CURE = rep(0, length(b0) - 1L),
  mean_SURV = rep(0, length(beta0)),
  sd_CURE = rep(1, length(b0) - 1L),
  sd_SURV = rep(1, length(beta0)),
  cor_CURE = diag(length(b0) - 1L),
  cor_SURV = diag(length(beta0)),
  X = NULL,
  Z = NULL,
  C = NULL
)

Arguments

N

the sample size (number of individuals). By default, N = 500.

S

a numeric vector containing the end points of the time intervals, in ascending order, over which the time-varying covariates are constant (the first interval starts at 0). By default, S = seq(0.1, 5, by=0.1).

b0

a numeric vector with the true coefficients in the incidence (cure) component, used to generate the susceptibility indicators. By default, b0 = c(1.2,-1,0,1,0).

beta0

a numeric vector with the true regression coefficients in the latency (survival) component, used to generate the event times. By default, beta0 = c(1,0,-1,0).

gamma

a positive numeric value, parameter controlling the shape of the baseline hazard function: λ_0(t) = γ t^{γ-1}. By default, gamma = 1.

lambdaC

a positive numeric value, parameter of the truncated exponential distribution used to generate the censoring times. By default, lambdaC = 1.

mean_CURE

a numeric vector of means for the variables used to generate the susceptibility indicators. By default, all zeros.

mean_SURV

a numeric vector of means for the variables used to generate the event-times. By default, all zeros.

sd_CURE

a numeric vector of standard deviations for the variables used to generate the susceptibility indicators. By default, all ones.

sd_SURV

a numeric vector of standard deviations for the variables used to generate the event-times. By default, all ones.

cor_CURE

the correlation matrix of the variables used to generate the susceptibility indicators. By default, an identity matrix.

cor_SURV

the correlation matrix of the variables used to generate the event-times. By default, an identity matrix.

X

[optional] a matrix of time-invariant covariates used to generate the susceptibility indicators, with dimension N by length(b0)-1L. By default, X = NULL.

Z

[optional] an array of time-varying covariates used to generate the event times, with dimension length(S) by length(beta0) by N. By default, Z = NULL.

C

[optional] a vector of censoring times with N elements. By default, C = NULL.
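The effect of gamma can be checked numerically. The sketch below uses only base R and the baseline hazard λ_0(t) = γ t^{γ-1} stated above; the helper names lambda0 and Lambda0 are illustrative and not part of the package. It compares the numerically integrated hazard with the closed form Λ_0(t) = t^γ, so gamma = 1 gives a unit (constant) hazard and gamma = 3 gives λ_0(t) = 3t^2.

```r
# Baseline hazard from the gamma argument: lambda_0(t) = gamma * t^(gamma - 1)
# (lambda0 and Lambda0 are illustrative helpers, not package functions)
lambda0 <- function(t, gamma) gamma * t^(gamma - 1)

# Closed-form cumulative baseline hazard: integral of lambda_0 from 0 to t
Lambda0 <- function(t, gamma) t^gamma

t_grid <- c(0.5, 1, 2)
for (g in c(1, 3)) {
  # Numerical integration of the hazard over (0, t] for each t in the grid
  num <- sapply(t_grid, function(t) integrate(lambda0, 0, t, gamma = g)$value)
  print(cbind(t = t_grid, gamma = g, numeric = num,
              closed_form = Lambda0(t_grid, g)))
}
```

The two columns should agree up to integration error, which confirms that larger values of gamma make the hazard increase with time.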

Details

By default, the time-varying covariates in the latency (survival) component are generated from a multivariate normal distribution with means mean_SURV, standard deviations sd_SURV and correlation matrix cor_SURV. Otherwise, they can be provided by the user using the argument Z. In this case, the arguments mean_SURV, sd_SURV and cor_SURV will be ignored.
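As a rough illustration of how such covariates could be built by hand (for example, to pass them through Z), the base R sketch below combines standard deviations and a correlation matrix into a covariance matrix and draws from the multivariate normal distribution via a Cholesky factor. It assumes independent draws across intervals and individuals, which is an assumption of this sketch and not a confirmed description of the package internals; draw_mvn is a hypothetical helper.

```r
set.seed(1)
N <- 5                               # small sample, for illustration only
S <- seq(0.1, 0.5, by = 0.1)         # interval end points
beta0 <- c(1, 0, -1, 0)

mean_SURV <- c(2, 1, 0, -1)
sd_SURV   <- c(0.5, 1, 1.5, 0.5)
cor_SURV  <- 0.8^abs(outer(1:4, 1:4, "-"))

# Covariance matrix implied by the standard deviations and correlations
Sigma <- diag(sd_SURV) %*% cor_SURV %*% diag(sd_SURV)
R <- chol(Sigma)                     # upper triangular, t(R) %*% R equals Sigma

# Hypothetical helper: n draws from N(mu, t(R) %*% R), one draw per row
draw_mvn <- function(n, mu, R) {
  E <- matrix(rnorm(n * length(mu)), n, length(mu))
  sweep(E %*% R, 2, mu, "+")
}

# Array with the layout documented for Z: length(S) x length(beta0) x N
Z <- array(NA_real_, dim = c(length(S), length(beta0), N))
for (i in seq_len(N)) Z[, , i] <- draw_mvn(length(S), mean_SURV, R)
dim(Z)
```

Slice Z[, , i] then holds one row per interval and one column per latency covariate for individual i.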

By default, the time-invariant covariates in the incidence (cure) component are generated from a multivariate normal distribution with means mean_CURE, standard deviations sd_CURE and correlation matrix cor_CURE. Otherwise, they can be provided by the user using the argument X. In this case, the arguments mean_CURE, sd_CURE and cor_CURE will be ignored.
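To make the role of b0 and X concrete, the following base R sketch generates susceptibility indicators under the assumption of a logistic incidence model in which b0[1] is the intercept and b0[-1] multiplies the columns of X. The logistic link and the variable names p_susc and Y are assumptions of this sketch; the page itself only states that b0 has one more element than X has columns.

```r
set.seed(2)
N  <- 1000
b0 <- c(1.2, -1, 0, 1, 0)            # intercept followed by four coefficients

# Time-invariant covariates: N x (length(b0) - 1), independent standard normals
X <- matrix(rnorm(N * (length(b0) - 1)), N, length(b0) - 1)

# Assumed logistic model for the probability of being susceptible
p_susc <- plogis(b0[1] + drop(X %*% b0[-1]))

# Susceptibility indicator: 1 = susceptible, 0 = cured (never experiences the event)
Y <- rbinom(N, 1, p_susc)

# Empirical percentage of cured individuals in this draw
100 * mean(Y == 0)
```

Under this assumption, a larger intercept b0[1] lowers the share of cured individuals.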

Value

A data.frame with columns:

id

unique ID number associated with each individual.

tstart

start of the time interval.

tstop

end of the time interval.

status

event indicator: 1 if the event occurs, 0 otherwise.

z.?

one or more columns of covariates used to generate the survival times.

x.?

one or more columns of covariates used to generate the susceptibility indicator (constant over time).

In addition, it contains the following attributes:

perc_cure

Percentage of individuals not susceptible to the event of interest.

perc_cens

Percentage of censoring.
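The sketch below shows one way to work with an object of this shape. To stay self-contained it uses a small hand-made data.frame (toy) with the documented column layout and invented values, so it is a stand-in and not actual output of penPHcure.simulate. It also assumes that status equals 1 only on an individual's last interval.

```r
# Hand-made stand-in with the documented columns (all values are invented)
toy <- data.frame(
  id     = c(1, 1, 1, 2, 2),
  tstart = c(0.0, 0.1, 0.2, 0.0, 0.1),
  tstop  = c(0.1, 0.2, 0.27, 0.1, 0.2),
  status = c(0, 0, 1, 0, 0),
  z.1    = c(0.3, -1.2, 0.8, 0.1, 0.5),
  x.1    = c(0.7, 0.7, 0.7, -0.4, -0.4)
)
attr(toy, "perc_cure") <- 50
attr(toy, "perc_cens") <- 50

# Attributes are read with attr()
attr(toy, "perc_cure")
attr(toy, "perc_cens")

# Collapse the (tstart, tstop] rows to one row per individual:
# the last interval carries the follow-up time and the final status
last_row <- !duplicated(toy$id, fromLast = TRUE)
per_id <- toy[last_row, c("id", "tstop", "status", "x.1")]
per_id
```

The same two attr() calls should apply to a simulated data set, given the attribute names listed above.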

References

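Hendry, D. J. (2014). Data generation for the Cox proportional hazards model with time-dependent covariates: a method for medical researchers. Statistics in Medicine, 33(3), 436-454.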

Examples

### Example 1:
###  - event-times generated from a Cox's PH model with unit baseline hazard
###    and time-varying covariates generated from independent standard normal 
###    distributions over the intervals (0,s_1], (s_1,s_2], ..., (s_{J-1},s_J]. 
###  - censoring times generated from an exponential distribution truncated 
###    above s_J.
###  - covariates in the incidence (cure) component generated from independent 
###    standard normal distributions.

# Load the package
library(penPHcure)
# Define the sample size
N <- 250
# Define the time intervals for the time-varying covariates
S <- seq(0.1, 5, by=0.1)
# Define the true regression coefficients (incidence and latency)  
b0 <- c(1,-1,0,1,0)
beta0 <- c(1,0,-1,0)
# Define the parameter of the truncated exponential distribution (censoring) 
lambdaC <- 1.5
# Simulate the data
data1 <- penPHcure.simulate(N = N,S = S,
                            b0 = b0,
                            beta0 = beta0,
                            lambdaC = lambdaC)

                           
### Example 2:
###  Similar to the previous example, but with a baseline hazard function 
###   defined as lambda_0(t) = 3t^2.

# Define the sample size
N <- 250
# Define the time intervals for the time-varying covariates
S <- seq(0.1, 5, by=0.1)
# Define the true regression coefficients (incidence and latency)  
b0 <- c(1,-1,0,1,0)
beta0 <- c(1,0,-1,0)
# Define the parameter controlling the shape of the baseline hazard function
gamma <- 3
# Simulate the data
data2 <- penPHcure.simulate(N = N,S = S,
                            b0 = b0,
                            beta0 = beta0,
                            gamma = gamma)


### Example 3:
###  Simulation with covariates in the cure and survival components generated
###   from multivariate normal (MVN) distributions with specific means, 
###   standard deviations and correlation matrices.

# Define the sample size
N <- 250
# Define the time intervals for the time-varying covariates
S <- seq(0.1, 5, by=0.1)
# Define the true regression coefficients (incidence and latency)  
b0 <- c(-1,-1,0,1,0)
beta0 <- c(1,0,-1,0)
# Define the means of the MVN distribution (incidence and latency)  
mean_CURE <- c(-1,0,1,2)
mean_SURV <- c(2,1,0,-1)
# Define the std. deviations of the MVN distribution (incidence and latency)  
sd_CURE <- c(0.5,1.5,1,0.5)
sd_SURV <- c(0.5,1,1.5,0.5)
# Define the correlation matrix of the MVN distribution (incidence and latency)  
#  (entry [p,q] equals 0.8^|p - q|)
cor_CURE <- 0.8^abs(outer(1:4, 1:4, "-"))
cor_SURV <- 0.8^abs(outer(1:4, 1:4, "-"))
# Simulate the data
data3 <- penPHcure.simulate(N = N,S = S,
                            b0 = b0,
                            beta0 = beta0,
                            mean_CURE = mean_CURE,
                            mean_SURV = mean_SURV,
                            sd_CURE = sd_CURE,
                            sd_SURV = sd_SURV,
                            cor_CURE = cor_CURE,
                            cor_SURV = cor_SURV)


### Example 4:
###  Simulation with covariates in the cure and survival components from a 
###   data generating process specified by the user. 

# Define the sample size
N <- 250
# Define the time intervals for the time-varying covariates
S <- seq(0.1, 5, by=0.1)
# Define the true regression coefficients (incidence and latency)  
b0 <- c(1,-1,0,1,0)
beta0 <- c(1,0,-1,0)
# As an example, we simulate data with covariates following independent
#  standard uniform distributions, but the user could provide random draws 
#  from any other distribution. Note that X must be a matrix of size 
#  N x (length(b0)-1) and Z an array of size length(S) x length(beta0) x N.
X <- matrix(runif(N*(length(b0)-1)),N,length(b0)-1)
Z <- array(runif(N*length(S)*length(beta0)),c(length(S),length(beta0),N))
data4 <- penPHcure.simulate(N = N,S = S,
                            b0 = b0,
                            beta0 = beta0,
                            X = X,
                            Z = Z)


### Example 5:
###  Simulation with censoring times from a data generating process 
###   specified by the user

# Define the sample size
N <- 250
# Define the time intervals for the time-varying covariates
S <- seq(0.1, 5, by=0.1)
# Define the true regression coefficients (incidence and latency)  
b0 <- c(1,-1,0,1,0)
beta0 <- c(1,0,-1,0)
# As an example, we simulate data with censoring times following
#  a uniform distribution between 0 and s_J.
#  Note that C must be a numeric vector of length N.
C <- runif(N)*max(S)
data5 <- penPHcure.simulate(N = N,S = S,
                            b0 = b0,
                            beta0 = beta0,
                            C = C)
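As a variation on Example 5, censoring times could also be drawn by hand from an exponential distribution truncated above s_J, the kind of distribution the lambdaC argument refers to. The base R sketch below uses inverse-CDF sampling and treats lambdaC as the rate parameter. Both choices are assumptions of this sketch and are not confirmed as the package's internal method.

```r
set.seed(3)
N       <- 250
S       <- seq(0.1, 5, by = 0.1)
lambdaC <- 1.5                       # assumed to be the exponential rate
s_J     <- max(S)                    # truncation point

# Truncated exponential CDF on [0, s_J]:
#   F(c) = (1 - exp(-lambdaC * c)) / (1 - exp(-lambdaC * s_J))
# Solving U = F(C) for C gives the inverse-CDF draw below
U <- runif(N)
C <- -log(1 - U * (1 - exp(-lambdaC * s_J))) / lambdaC

# All draws fall within [0, s_J]
range(C)
```

A vector built this way has N elements and can be supplied through the C argument in the same way as in Example 5.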
                           

penPHcure documentation built on Dec. 4, 2019, 1:08 a.m.