modelSim: Data simulation from different survival models

View source: R/ModelSim.R

modelSim R Documentation

Data simulation from different survival models

Description

Simulates survival data from the Cox, AFT and AH models.

Usage

modelSim(
  model = "cox",
  matDistr,
  matParam,
  n,
  p,
  pnonull,
  betaDistr,
  hazDistr,
  hazParams,
  seed,
  Phi = NULL,
  d = 0,
  pourc = 0.9
)

Arguments

model

Survival model: "cox", "AFT", "AFTshift" or "AH"

matDistr

Distribution of the covariate matrix (uniform or normal)

matParam

Parameters of the covariate matrix distribution

n

Sample size

p

Number of covariates (regression parameters)

pnonull

Number of pertinent covariates

betaDistr

Distribution of beta or vector of beta

hazDistr

Distribution of the baseline hazard

hazParams

Parameters of baseline hazard

seed

Seed of the random number generator

Phi

Nonlinearity (not yet implemented)

d

Censorship

pourc

Percentage

Details

This function simulates survival data from different models: the Cox model, the AFT model and the AH model.

1. The Cox model is defined as: λ(t|X_{i.}) = α_0(t) \exp(β^T X_{i.}), where α_0(t) is the baseline risk and β is the vector of coefficients. Four distributions are considered for the baseline risk:

  • Weibull: α_0(t) = λ a t^{(a-1)};

  • Log-normal: α_0(t) = [1/(σ t √(2π))] \exp[-(\log t - μ)^2 / (2σ^2)] / (1 - Φ[(\log t - μ)/σ]);

  • Exponential: α_0(t) = λ;

  • Gompertz: α_0(t) = λ \exp(α t).
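The four baseline hazards can be written as plain R functions. This is a minimal sketch for illustration only: the function and argument names are assumptions and are not part of survMS. The log-normal hazard is computed as the density divided by the survival function, using base R's `dlnorm` and `plnorm`.

```r
# Baseline hazard functions alpha_0(t); names are illustrative, not survMS API.
haz_weibull <- function(t, lambda, a) lambda * a * t^(a - 1)
haz_exponential <- function(t, lambda) rep(lambda, length(t))
haz_gompertz <- function(t, lambda, alpha) lambda * exp(alpha * t)
haz_lognormal <- function(t, mu, sigma) {
  # hazard = density / survival function
  dlnorm(t, meanlog = mu, sdlog = sigma) /
    plnorm(t, meanlog = mu, sdlog = sigma, lower.tail = FALSE)
}

t_grid <- c(0.5, 1, 2, 5)
haz_weibull(t_grid, lambda = 0.2, a = 1)       # a = 1 gives the exponential hazard
haz_gompertz(t_grid, lambda = 0.2, alpha = 0)  # alpha = 0 gives it as well
haz_lognormal(t_grid, mu = 0, sigma = 1)
```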

To simulate the covariates, two distributions are proposed:

  • Uniform

  • Normal

The parameters of the chosen distribution can also be specified. The Phi parameter currently allows survival data to be simulated in a linear framework without interactions; a future implementation will handle a non-linear framework with interactions. If the parameter Phi is NULL (to complete...).
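As an illustration of the Cox setting, survival times with a Weibull baseline can be drawn by inverting the survival function S(t|X_{i.}) = \exp(-λ t^a \exp(β^T X_{i.})) at a uniform draw. The sketch below uses base R only; it is not the internal code of modelSim, and all variable names and parameter values are assumptions chosen for the example.

```r
# Inverse-transform simulation from a Cox model with Weibull baseline hazard.
# Illustrative sketch only; not the survMS implementation.
set.seed(1)
n <- 200; p <- 10; pnonull <- 3
Z <- matrix(runif(n * p, min = -1, max = 1), nrow = n, ncol = p)  # uniform covariates
beta <- c(rep(1, pnonull), rep(0, p - pnonull))  # only the first pnonull are non-null
lambda <- 0.5; a <- 2                            # Weibull baseline parameters
eta <- drop(Z %*% beta)                          # linear predictor beta^T X_i

U <- runif(n)
# Solve U = exp(-lambda * t^a * exp(eta)) for t
T_cox <- (-log(U) / (lambda * exp(eta)))^(1 / a)
summary(T_cox)
```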

2. The AFT model is defined from a linear regression of the variable of interest: Y_i = X_{i.} β + ε_i, with X_{i.} the covariates, β the vector of regression coefficients and ε_i the error term. The AFT model can also be defined from the baseline survival function S_0(t), the tail distribution of \exp(ε_i). The survival function of the AFT model is written as: S(t|X_{i.}) = S_0(t \exp(β^T X_{i.})), and the hazard risk has the form: λ(t|X_{i.}) = \exp(β^T X_{i.}) α_0(t \exp(β^T X_{i.})), where α_0(t) is the baseline risk and β is the vector of coefficients. The advantage of the AFT model is that the variables have a multiplicative effect on t rather than on the risk function, as is the case in the Cox model. Two distributions are considered for the baseline risk:

  • Weibull: α_0(t) = λ a t^{(a-1)};

  • Log-normal: α_0(t) = [1/(σ t √(2π))] \exp[-(\log t - μ)^2 / (2σ^2)] / (1 - Φ[(\log t - μ)/σ]).
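Because S(t|X_{i.}) = S_0(t \exp(β^T X_{i.})), a survival time under the AFT model can be obtained by drawing a baseline time T_0 from S_0 and rescaling it by \exp(-β^T X_{i.}). The following base R sketch assumes a log-normal baseline and normal covariates; the names and parameter values are illustrative and do not come from survMS.

```r
# AFT simulation by rescaling baseline survival times (illustrative sketch).
set.seed(1)
n <- 200; p <- 5
Z <- matrix(rnorm(n * p), nrow = n, ncol = p)  # normal covariates
beta <- rep(0.5, p)
eta <- drop(Z %*% beta)                        # linear predictor beta^T X_i

mu <- 1; sigma <- 0.5                          # log-normal baseline parameters
T0 <- rlnorm(n, meanlog = mu, sdlog = sigma)   # baseline times with survival S_0
T_aft <- T0 * exp(-eta)                        # covariates act multiplicatively on time
summary(T_aft)
```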

To simulate the covariates, two distributions are proposed:

  • Uniform

  • Normal

The parameters of the chosen distribution can also be specified. The Phi parameter currently allows survival data to be simulated in a linear framework without interactions; a future implementation will handle a non-linear framework with interactions. If the parameter Phi is NULL (to complete...).

3. The hazard risk of the AH model is defined for an individual i as: λ_{AH}(t|X_{i.}) = α_0(t \exp(β^T X_{i.})), with α_0 the baseline risk and β the vector of regression parameters. In a model with a single binary variable corresponding to the treatment, the hazard risk of the treated group is written as: λ_1(t) = α_0(t \exp(β)). The regression vector β characterizes the influence of the variables on the survival time of individuals, and \exp(β^T X_{i.}) is a factor altering the time scale of the hazard risk. A positive or negative value of β^T X_{i.} implies an acceleration or a deceleration of the risk, respectively. Two distributions are considered for the baseline risk:

  • Weibull: α_0(t) = λ a t^{(a-1)};

  • Log-normal: α_0(t) = [1/(σ t √(2π))] \exp[-(\log t - μ)^2 / (2σ^2)] / (1 - Φ[(\log t - μ)/σ]).
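For the AH model, integrating the hazard α_0(t \exp(β^T X_{i.})) gives the cumulative hazard A_0(t \exp(β^T X_{i.})) / \exp(β^T X_{i.}), where A_0 is the baseline cumulative hazard. Setting this equal to a unit exponential draw E and solving for t yields a survival time. The sketch below does this for a Weibull baseline, where A_0(t) = λ t^a, with one binary treatment covariate. It is an illustration under these assumptions, not the survMS code, and the variable names are hypothetical.

```r
# AH simulation by inverting the cumulative hazard (illustrative sketch).
set.seed(1)
n <- 200
x <- rbinom(n, size = 1, prob = 0.5)  # single binary treatment covariate
beta <- 0.7
eta <- beta * x                       # linear predictor
lambda <- 0.5; a <- 2                 # Weibull baseline: A_0(t) = lambda * t^a

E <- rexp(n)                          # unit exponential draws
# Solve A_0(t * exp(eta)) / exp(eta) = E for t
T_ah <- (E * exp(eta) / lambda)^(1 / a) * exp(-eta)

# Sanity check: the cumulative hazard evaluated at T_ah recovers E
H <- lambda * (T_ah * exp(eta))^a / exp(eta)
all.equal(H, E)
```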

To simulate the covariates, two distributions are proposed:

  • Uniform

  • Normal

The parameters of the chosen distribution can also be specified. The Phi parameter currently allows survival data to be simulated in a linear framework without interactions; a future implementation will handle a non-linear framework with interactions. If the parameter Phi is NULL (to complete...).


Value

modelSim returns a list containing:

  • model Survival model (Cox, AFT, AFTshift, AH)

  • Z Matrix of covariates

  • Y random covariates

  • TC Vector of survival times

  • delta Vector of censorship indicators

  • betanorm Vector of normalized regression parameters

  • crate Censorship rate

  • crate_delta Censorship rate

  • vecY Vector of the numbers of individuals at risk at each time t_i

  • hazParams Vector of parameters of the baseline hazard distribution

  • hazDistr Distribution of the baseline hazard function

  • St Matrix of survival functions

  • ht Matrix of hazard risk functions

  • grilleTi Time grid
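To make the relationship between observed times, the censorship indicator and the censorship rate concrete, here is a small base R sketch of standard right censoring. It assumes that delta equals 1 for an observed event and 0 for a censored time, which matches the use of delta as the event column in the Examples. How modelSim actually generates censoring is not described here, so the censoring distribution and variable names below are assumptions.

```r
# Right censoring: observed time, event indicator and censorship rate.
# Illustrative sketch; the censoring mechanism is an assumption.
set.seed(1)
n <- 500
T_event <- rweibull(n, shape = 2, scale = 3)  # latent survival times
C <- runif(n, min = 0, max = 8)               # hypothetical censoring times

TC <- pmin(T_event, C)                        # observed (possibly censored) times
delta <- as.integer(T_event <= C)             # 1 = event observed, 0 = censored
crate <- 1 - mean(delta)                      # proportion of censored individuals
crate
```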

Author(s)

Mathilde Sautreuil

See Also

print.modSim, plot.modSim

Examples

## Not run: 
library(survMS)
### Survival data simulated from Cox model
res_paramW = get_param_weib(med = 2228, mu = 2325)
listCoxSim_n500_p1000 <- modelSim(model = "cox", matDistr = "unif", matParam = c(-1,1), n = 500,
                                p = 1000, pnonull = 20, betaDistr = 1, hazDistr = "weibull",
                                hazParams = c(res_paramW$a, res_paramW$lambda), seed = 1, d = 0)
print(listCoxSim_n500_p1000)
hist(listCoxSim_n500_p1000)
plot(listCoxSim_n500_p1000, ind = sample(1:500, 5))
plot(listCoxSim_n500_p1000, ind = sample(1:500, 5), type = "hazard")

df_p1000_n500 = data.frame(time = listCoxSim_n500_p1000$TC,
                          event = listCoxSim_n500_p1000$delta,
                          listCoxSim_n500_p1000$Z)
df_p1000_n500[1:6,1:10]
dim(df_p1000_n500)
### Survival data simulated from AFT model
res_paramLN = get_param_ln(var = 200000, mu = 1134)
listAFTSim_n500_p1000 <- modelSim(model = "AFT", matDistr = "unif", matParam = c(-1,1), n = 500,
                                p = 100, pnonull = 100, betaDistr = 1, hazDistr = "log-normal",
                                hazParams = c(res_paramLN$a, res_paramLN$lambda),
                                Phi = 0, seed = 1, d = 0)
hist(listAFTSim_n500_p1000)
plot(listAFTSim_n500_p1000, ind = sample(1:500, 5))
df_p1000_n500 = data.frame(time = listAFTSim_n500_p1000$TC,
                           event = listAFTSim_n500_p1000$delta,
                           listAFTSim_n500_p1000$Z)
df_p1000_n500[1:6,1:10]
dim(df_p1000_n500)

### Survival data simulated from AH model
res_paramLN = get_param_ln(var=170000, mu=2325)
listAHSim_n500_p1000 <- modelSim(model = "AH", matDistr = "unif", matParam = c(-1,1), n = 500, 
                                 p = 100, pnonull = 100, betaDistr = 1.5, hazDistr = "log-normal",
                                 hazParams = c(res_paramLN$a*4, res_paramLN$lambda),
                                 Phi = 0, seed = 1, d = 0)
                                 
print(listAHSim_n500_p1000)
hist(listAHSim_n500_p1000)
plot(listAHSim_n500_p1000, ind = sample(1:500, 5))
plot(listAHSim_n500_p1000, ind = sample(1:500, 5), type = "hazard")

## End(Not run)

mathildesautreuil/survMS documentation built on June 13, 2022, 4:07 p.m.