knitr::opts_chunk$set(
  collapse = TRUE,
  cache = TRUE,
  cache.path = 'cache/defineArms/',
  comment = '#>',
  dpi = 300,
  out.width = '100%'
)

library(dplyr)
library(simdata)
library(TrialSimulator)
set.seed(12345)
In TrialSimulator, a trial arm is defined as a collection of endpoints (and potentially other covariates or biomarkers) together with a data generation process. This vignette demonstrates how to use the following key functions to define and summarize arms in a simulated clinical trial setting.
- endpoint: creates one or more endpoints
- add_endpoints: adds one or more endpoints to an arm
- generate_data: generates a dataset from an Arms object (for exploratory purposes)
- print: displays a summary report of an Arms object covering all endpoints in the arm

The function endpoint can be used to define one or multiple endpoints simultaneously. These endpoints can be independent or correlated, depending on the generator provided. In the following hypothetical example, we construct a custom generator that simulates progression-free survival (PFS), overall survival (OS), and prostate-specific antigen (PSA) levels at baseline and at year 1. A pre-specified correlation matrix ensures the endpoints are appropriately correlated. We also ensure that PFS is always less than or equal to OS.
rng <- function(n, pfs_rate, os_rate, psa_mean, psa_sd, corr_matrix){

  # marginal quantile functions, one per simulated variable
  dist <- list()
  dist[['PFS']] <- function(x) qexp(x, rate = pfs_rate)
  dist[['OS']] <- function(x) qexp(x, rate = os_rate)
  dist[['PSA_baseline']] <- function(x) qnorm(x, mean = psa_mean, sd = psa_sd)
  dist[['PSA_year1']] <- function(x) qnorm(x, mean = psa_mean - 12, sd = psa_sd)

  # NORTA design with corr_matrix as the target correlation
  dsgn <- simdata::simdesign_norta(cor_target_final = corr_matrix,
                                   dist = dist,
                                   transform_initial = data.frame,
                                   names_final = names(dist),
                                   seed_initial = 1)

  simdata::simulate_data(dsgn, n_obs = n) %>%
    mutate(PFS = pmin(PFS, OS)) %>%        # enforce PFS <= OS
    mutate(PFS_event = 1, OS_event = 1)    # every event is observed
}
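In this generator, PFS and OS are drawn from exponential distributions with rates pfs_rate and os_rate, and the two PSA measurements are drawn from normal distributions with standard deviation psa_sd and means psa_mean (baseline) and psa_mean - 12 (year 1). PFS is capped at OS so that it never exceeds OS, and the event indicators PFS_event and OS_event are set to 1 for every patient, so no censoring is applied at the generation stage.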
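To make the mechanics of such a generator more concrete, here is a minimal base-R sketch of the same idea for PFS and OS only. It is an illustration under assumptions, not the simdata implementation: copula_rng is a hypothetical name, and the sketch uses a plain Gaussian copula (correlated normals mapped to uniforms, then to the target marginals) without the correlation adjustment that the NORTA design performs, so the final correlation will only approximate the value supplied.

```r
# Hypothetical helper: a plain Gaussian copula, not simdata's NORTA procedure
copula_rng <- function(n, pfs_rate, os_rate, corr) {
  # correlated standard normals via the Cholesky factor of corr
  z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(corr)
  # map to correlated uniforms, then to exponential marginals
  u <- pnorm(z)
  os <- qexp(u[, 2], rate = os_rate)
  pfs <- pmin(qexp(u[, 1], rate = pfs_rate), os)  # enforce PFS <= OS
  data.frame(PFS = pfs, OS = os, PFS_event = 1, OS_event = 1)
}

set.seed(1)
dat <- copula_rng(1000,
                  pfs_rate = log(2) / 2.5,
                  os_rate = log(2) / 4.5,
                  corr = matrix(c(1, .6, .6, 1), nrow = 2))
head(dat)
```

The returned data frame has the same shape as the tte part of the generator above: one column per endpoint plus an event indicator per time-to-event endpoint.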
The following code defines the endpoints and uses the print method to generate a summary report based on 10,000 samples from the generator rng.
ep1 <- endpoint(name = c('PSA_baseline', 'PSA_year1', 'OS', 'PFS'),
                type = c('non-tte', 'non-tte', 'tte', 'tte'),
                readout = c(PSA_baseline = 0, PSA_year1 = 1),
                generator = rng,
                pfs_rate = log(2)/2.5,
                os_rate = log(2)/4.5,
                psa_mean = 20,
                psa_sd = 4,
                corr_matrix = matrix(c(1, .6, -.5, -.4,
                                       .6, 1, -.4, -.3,
                                       -.5, -.4, 1, .7,
                                       -.4, -.3, .7, 1), nrow = 4))
ep1
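Two quick base-R checks can help when reading the arguments above. This is only a sketch: the row and column labels are added here for readability and follow the order of the dist list in rng (PFS, OS, PSA_baseline, PSA_year1). First, an exponential distribution with rate log(2)/m has median m, so the rates correspond to medians of 2.5 and 4.5 time units. Second, a correlation matrix passed to a generator should be symmetric and positive definite.

```r
# An exponential distribution with rate log(2) / m has median m
pfs_rate <- log(2) / 2.5
os_rate <- log(2) / 4.5
medians <- c(PFS = qexp(0.5, rate = pfs_rate),
             OS = qexp(0.5, rate = os_rate))
medians

# Labels follow the order of the dist list inside rng (added for readability)
vars <- c('PFS', 'OS', 'PSA_baseline', 'PSA_year1')
corr_matrix <- matrix(c(1, .6, -.5, -.4,
                        .6, 1, -.4, -.3,
                        -.5, -.4, 1, .7,
                        -.4, -.3, .7, 1),
                      nrow = 4, dimnames = list(vars, vars))

# A valid correlation matrix is symmetric with strictly positive eigenvalues
is_symmetric <- isSymmetric(corr_matrix)
min_eigen <- min(eigen(corr_matrix, symmetric = TRUE)$values)
c(symmetric = is_symmetric, positive_definite = min_eigen > 0)
```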
We can define another set of endpoints using a separate call to endpoint(). However, keep in mind that endpoints defined in a separate call are assumed to be independent of those in prior calls (i.e., PSA_baseline, PSA_year1, PFS and OS).
In the following example, we define a biomarker, even though it is not actually an endpoint. In practice, the function endpoint is useful for introducing any variable, including covariates, biomarkers, and subgroup indicators. Ideally, a biomarker should be integrated into the generator rng to capture meaningful correlation with other endpoints.
ep2 <- endpoint(name = 'biomarker',
                type = 'non-tte',
                readout = c(biomarker = 0),
                generator = rbinom,
                size = 1,
                prob = .3)
ep2
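As a contrast to the independent biomarker above, the following base-R sketch shows one way a biomarker could be built into a generator so that it is associated with a survival endpoint. Everything here is an assumption made for illustration: rng_with_biomarker and its hr argument are hypothetical, and the prognostic effect (biomarker-positive patients having half the hazard) is invented rather than taken from the package.

```r
# Hypothetical generator: the biomarker scales the OS hazard by hr
rng_with_biomarker <- function(n, prob, os_rate, hr) {
  biomarker <- rbinom(n, size = 1, prob = prob)
  # biomarker-positive patients get hazard os_rate * hr
  os <- rexp(n, rate = os_rate * ifelse(biomarker == 1, hr, 1))
  data.frame(biomarker = biomarker, OS = os, OS_event = 1)
}

set.seed(2)
dat <- rng_with_biomarker(5000, prob = .3, os_rate = log(2) / 4.5, hr = .5)

# median OS by biomarker status
os_medians <- tapply(dat$OS, dat$biomarker, median)
os_medians
```

Because both variables come from a single generator, they would be defined in a single call to endpoint, which is what allows the dependence to be carried into the arm.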
We now create a treatment arm by combining ep1 and ep2. The print method automatically summarizes the marginal distributions of all endpoints. As shown, the summary report of the arm simply concatenates the reports of ep1 and ep2.
trt <- arm(name = 'treated')
trt$add_endpoints(ep1, ep2)
trt
We can define inclusion criteria for the arm by passing logical filter expressions via the ... argument in arm(). These filters are applied to the generated trial data. For example, the following code restricts enrollment to patients with PSA_baseline greater than 10 and PSA_year1 greater than 0.
The summary report will reflect the effect of these inclusion criteria on the simulated population.
trt <- arm(name = 'treated', PSA_baseline > 10 & PSA_year1 > 0)
trt$add_endpoints(ep1, ep2)
trt
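The effect of such a filter on a simulated population can be previewed outside the package with a few lines of base R. This sketch is a simplification under stated assumptions: it draws the two PSA variables independently (ignoring the correlation used in rng), uses the means and standard deviation from the example above, and says nothing about how TrialSimulator applies the filter internally.

```r
set.seed(3)
n <- 10000

# Independent draws for illustration only; rng above correlates these variables
psa <- data.frame(PSA_baseline = rnorm(n, mean = 20, sd = 4),
                  PSA_year1 = rnorm(n, mean = 20 - 12, sd = 4))

# Same logical expression as the inclusion criterion passed to arm()
kept <- subset(psa, PSA_baseline > 10 & PSA_year1 > 0)

# Share of simulated patients who meet the criterion
prop_kept <- nrow(kept) / nrow(psa)
prop_kept

# Marginal summaries before and after filtering
rbind(all = colMeans(psa), included = colMeans(kept))
```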
Although TrialSimulator allows direct data generation using the generate_data() method, doing so is generally discouraged. One of the core principles of TrialSimulator is to separate trial simulation logic from data generation, allowing the framework to manage data generation and truncation (and/or censoring) dynamically based on trial milestones.
Nevertheless, for inspection or debugging, one can call
## not recommended
tmp <- trt$generate_data(100)
head(tmp)
This gives a preview of the patient-level data produced by the treatment arm configuration (generator and inclusion filters). However, the enrollment schedule and dropout are not taken into account, which is another reason why we strongly discourage using TrialSimulator this way.
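To illustrate why raw generator output differs from what a trial actually observes, the base-R sketch below applies administrative censoring at a data cut to uncensored survival times. All specifics are hypothetical and chosen only for illustration (uniform accrual over 2 time units, a data cut at 3 time units, no dropout); this is not the mechanism TrialSimulator uses internally.

```r
set.seed(4)
n <- 200

# Hypothetical accrual: patients enroll uniformly over the first 2 time units
enroll_time <- runif(n, min = 0, max = 2)

# Uncensored OS, as a generator would return it (every event observed)
os <- rexp(n, rate = log(2) / 4.5)

# Hypothetical data cut at 3 time units after trial start
cut_time <- 3
follow_up <- cut_time - enroll_time

# What is observable at the cut: time truncated at follow-up, event flag updated
observed <- data.frame(OS = pmin(os, follow_up),
                       OS_event = as.integer(os <= follow_up))

# Proportion of patients with an observed event at the data cut
event_rate <- mean(observed$OS_event)
event_rate
```

In the raw generator output every OS_event equals 1, whereas at the data cut only a fraction of events has occurred, which is the kind of adjustment the framework is designed to handle at each milestone.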