coxph_data_sim: Simulate data for Cox proportional hazards regression


coxph_data_sim    R Documentation

Simulate data for Cox proportional hazards regression

Description

coxph_data_sim simulates data, based on summary statistics, for Cox proportional hazards regression models with one dichotomous independent variable.

Usage

coxph_data_sim(
  n_data = 1,
  ns_c,
  ns_e,
  ne_c,
  ne_e,
  cox_hr,
  cox_hr_ci_level = 0.95,
  max_t = 100,
  cores = 1,
  ...
)

Arguments

n_data

The number of datasets to be simulated. The default is 1.

ns_c

Sample size of the control condition.

ns_e

Sample size of the experimental condition.

ne_c

Number of events (e.g., death) in the control condition.

ne_e

Number of events (e.g., death) in the experimental condition.

cox_hr

A numeric vector of length 3 containing, in this order, the hazard ratio between the experimental and control conditions based on a Cox proportional hazards regression model, the lower boundary of the confidence interval of that hazard ratio, and the upper boundary of the confidence interval of that hazard ratio. The confidence level is set by cox_hr_ci_level. The hazard ratio must be provided. The confidence interval boundaries are optional; if missing, they should be given as NA.

cox_hr_ci_level

Confidence level of the confidence interval of the hazard ratio between the experimental and control conditions based on a Cox proportional hazards regression model. The default is 0.95.

max_t

The maximum allowed survival/censoring time. The default is 100.

cores

The number of cores to be used in the data simulation process. The default is 1. Note that it is only useful to use more than 1 core if more than 1 dataset is simulated; ideally, n_data should be a multiple of cores.

...

Arguments passed to the control argument of psoptim (e.g. maxit, maxit.stagnate). Be aware that coxph_data_sim uses default values that are not the default in psoptim. Specifically, coxph_data_sim uses maxit = 5000, and maxit.stagnate = ceiling(maxit / 5).

Details

Particle swarm optimization, as implemented by psoptim, is used to simulate one or multiple datasets that match certain summary statistics. The algorithm uses as many parameters as there are cases in the dataset that is to be simulated. Therefore, coxph_data_sim becomes more time-consuming as the sample size increases.
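The idea of treating every survival time as one optimization parameter can be illustrated with a minimal sketch. It assumes the pso and survival packages are installed and is not the internal code of coxph_data_sim: the objective function hr_loss, the toy design, and the target value are hypothetical, and only the hazard ratio (not its confidence interval) is matched.

```r
library(pso)
library(survival)

set.seed(1)

# Toy design (hypothetical numbers): 10 control and 10 experimental cases,
# all of which experience the event.
group <- rep(c(0, 1), each = 10)
event <- rep(1, 20)
target_hr <- 0.5
max_t <- 100

# Hypothetical objective: squared distance between the log hazard ratio of a
# Cox model fitted to the candidate times and the target log hazard ratio.
# The argument 'times' holds one parameter per case.
hr_loss <- function(times) {
  fit <- tryCatch(
    suppressWarnings(coxph(Surv(times, event) ~ group)),
    error = function(e) NULL
  )
  # Penalize candidates for which no usable model could be fitted.
  if (is.null(fit) || is.na(coef(fit)[1])) return(1e6)
  (unname(coef(fit)[1]) - log(target_hr))^2
}

# One parameter per case; NA starting values let psoptim initialize the swarm.
res <- psoptim(
  par = rep(NA, length(group)),
  fn = hr_loss,
  lower = 0.01,
  upper = max_t,
  control = list(maxit = 50)
)

res$value  # remaining squared error on the log hazard ratio scale
```

Because the number of parameters equals the number of cases, each additional case adds one dimension to the search space, which is why larger samples take longer to simulate.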

The relevant summary statistics that are used in the optimization process are:

  • cox_hr

    • Hazard ratio between the experimental and control conditions based on a Cox proportional hazards regression model.

    • Lower boundary of the confidence interval of the hazard ratio between the experimental and control conditions based on a Cox proportional hazards regression model.

    • Upper boundary of the confidence interval of the hazard ratio between the experimental and control conditions based on a Cox proportional hazards regression model.

coxph_data_sim creates a list with as many elements as specified by the argument n_data. Each element is itself a list that contains the simulated data and the optimization results of the data simulation process.

Value

A list of length n_data is returned. Each element of that list contains one simulated dataset and information about the optimization process:

  • data: A data.frame containing the following columns:

    • time: Survival/censoring times.

    • event: Indication of whether an event happened (1) or not (0).

    • group: Indication of whether the case belongs to the control condition (0) or the experimental condition (1).

  • optim: Results of particle swarm optimization. See the Value section in psoptim.
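One way to check how closely a simulated dataset reproduces the targeted summary statistics is to refit a Cox model to the data element. The following is a minimal sketch, assuming the survival package is installed; the helper refit_hr is hypothetical (not part of baymedr), and the object toy is a hand-built stand-in that only mimics the documented structure so that the sketch runs on its own.

```r
library(survival)

# Hypothetical helper (not part of baymedr): refit a Cox model to one element
# of the returned list and report the hazard ratio with its confidence
# interval.
refit_hr <- function(sim_element, ci_level = 0.95) {
  fit <- coxph(Surv(time, event) ~ group, data = sim_element$data)
  ci <- confint(fit, level = ci_level)  # interval on the log hazard scale
  c(hr = unname(exp(coef(fit))),
    lower = exp(ci[1, 1]),
    upper = exp(ci[1, 2]))
}

# Stand-in with the documented structure (time, event, group); the values are
# random draws, not output of coxph_data_sim.
set.seed(123)
toy <- list(
  data = data.frame(
    time = c(rexp(20, rate = 0.10), rexp(20, rate = 0.05)),
    event = rep(1, 40),
    group = rep(c(0, 1), each = 20)
  ),
  optim = NULL
)

refit_hr(toy)
```

With real output, the same helper could be applied to a single element, for example refit_hr(sim_data[[1]], ci_level = cox_hr_ci_level), and the result compared with the values passed as cox_hr.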

References

Harrell, F. E. (2015). Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis (2nd ed.). Springer.

Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of ICNN'95 - International Conference on Neural Networks, 4, 1942-1948.

Shi, Y., & Eberhart, R. (1998). A modified particle swarm optimizer. 1998 IEEE International Conference on Evolutionary Computation Proceedings. IEEE World Congress on Computational Intelligence, 69-73.

See Also

coxph_bf and psoptim.

Examples

# Pretend we extracted the following summary statistics from an article.
ns_c <- 20
ns_e <- 56
ne_c <- 18
ne_e <- 40
cox_hr <- c(0.433, 0.242, 0.774)
cox_hr_ci_level <- 0.95

# We want to simulate 3 datasets. We do not need the simulated summary
# statistics to match the reported ones very precisely. Therefore, for
# demonstration purposes, we only use 1/200 of the default number of
# optimization iterations (i.e., (1 / 200) * 5000 = 25).
sim_data <- coxph_data_sim(n_data = 3,
                           ns_c = ns_c,
                           ns_e = ns_e,
                           ne_c = ne_c,
                           ne_e = ne_e,
                           cox_hr = cox_hr,
                           cox_hr_ci_level = cox_hr_ci_level,
                           max_t = 100,
                           maxit = 25)

maxlinde/baymedr documentation built on Oct. 4, 2022, 6:27 a.m.