twostage_simulator: This function simulates a specified number of seamless trials...


twostage_simulator R Documentation

This function simulates a specified number of seamless trials for each design configuration provided.

Description

twostage_simulator is the primary workhorse of seamlesssim. It simulates complex seamless Phase I/II oncology trials as discussed in the article by Boonstra et al. (2021). It allows clinical trialists to determine operating characteristics of trials that assess both toxicity and efficacy with a range of different design and analytic approaches. For more detailed information, see Boonstra et al. and the vignette.

Usage

twostage_simulator(
  array_id = 1,
  n_sim,
  primary_objectives,
  dose_outcome_curves,
  design_list,
  stan_args = NA,
  sim_labels = NULL,
  design_labels = NULL,
  do_efficient_simulation = TRUE,
  verbose = FALSE,
  random_seed = 1,
  stan_seed = 1
)

Arguments

array_id

A positive integer identifier that will be appended as a column, without modification, to all results. This is meant to be helpful to the user when calling this function multiple times, e.g. in parallel.

n_sim

A positive integer indicating how many simulated trials to conduct for each design configuration.

primary_objectives

A vector containing three named elements: tox_target, tox_delta_no_exceed, and eff_target, such that tox_target is between 0 and 1, tox_delta_no_exceed is between 0 and (1 - tox_target), and eff_target is between 0 and 1. These choices delineate the primary objectives of all designs to be simulated. The true MTD is defined as the dose level with true probability of DLT closest to tox_target but not exceeding tox_target + tox_delta_no_exceed, and true acceptable dose level(s) are defined as any dose level that is less than or equal to the MTD with a true probability of response at least as large as eff_target. Each design will recommend the estimated MTD if its estimated efficacy probability is at least eff_target, and otherwise recommend no dose level.

dose_outcome_curves

A list containing three named elements and an optional fourth element: tox_curve, eff_curve, scenario, and, optionally, eff_curve_stage2. tox_curve is the true toxicity curve of the doses; eff_curve is the true efficacy curve of the doses; scenario is an identifier of which true data-generating scenario is being run (meant to be helpful to the user when calling this function multiple times for different data-generating scenarios).
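The definitions of the true MTD and the true acceptable dose level(s) given under primary_objectives can be applied directly to a pair of curves. The sketch below is only an illustration of those written definitions: find_true_mtd and find_acceptable_doses are hypothetical helpers, not functions exported by seamlesssim, and the tie-breaking rule (the lowest dose level wins) is an assumption.

```r
# Hypothetical helpers (not part of seamlesssim) that apply the written
# definitions of the true MTD and the true acceptable dose levels.
find_true_mtd <- function(tox_curve, primary_objectives) {
  tox_target <- primary_objectives[["tox_target"]]
  tox_ceiling <- tox_target + primary_objectives[["tox_delta_no_exceed"]]
  # Dose levels whose true DLT probability does not exceed the ceiling
  eligible <- which(tox_curve <= tox_ceiling)
  if (length(eligible) == 0) return(NA_integer_)
  # Among those, the level closest to tox_target
  # (assumption: ties go to the lowest dose level)
  eligible[which.min(abs(tox_curve[eligible] - tox_target))]
}

find_acceptable_doses <- function(tox_curve, eff_curve, primary_objectives) {
  mtd <- find_true_mtd(tox_curve, primary_objectives)
  if (is.na(mtd)) return(integer(0))
  # Dose levels at or below the MTD with response probability >= eff_target
  at_or_below_mtd <- seq_len(mtd)
  keep <- eff_curve[at_or_below_mtd] >= primary_objectives[["eff_target"]]
  at_or_below_mtd[keep]
}

# Same objectives and curves as in the Examples section
objectives <- c(tox_target = 0.25, tox_delta_no_exceed = 0.05, eff_target = 0.70)
curves <- list(tox_curve = c(0.10, 0.15, 0.25),
               eff_curve = c(0.45, 0.55, 0.65),
               scenario = 1)

true_mtd <- find_true_mtd(curves$tox_curve, objectives)
acceptable <- find_acceptable_doses(curves$tox_curve, curves$eff_curve, objectives)
```

Under these definitions, dose level 3 is the true MTD in this scenario, and no dose level is acceptable because the highest true response probability (0.65) is below eff_target (0.70).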

design_list

A list specifying all module choices. It will be a list of lists of lists. The highest level of the list corresponds to each overall design to be evaluated; it should be as long as the number of designs that the user wants to compare. The next level of the list gives the list of module choices for each design. It must have a named component module1 and will optionally have named components module2 through module4, taken from the bolded values of Figure 1 in the manuscript referenced above. Any of module2 to module4 that is not provided is assumed to be list(name = "none"). Finally, the lowest level of the list gives the list of choices for each particular module. Each list must have one entry named "name" to indicate the choice of module, and also a value for every argument that is specific to that module. See the vignette for examples.

If a user has just one design in mind, they may instead provide a list of lists.
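As a rough illustration of the nesting described above, the sketch below builds a two-design design_list and fills in the documented defaults. normalize_design_list is a hypothetical helper written for this illustration; it is not part of seamlesssim and does not reproduce the package's internal handling. The module arguments are copied from the Examples section.

```r
# Hypothetical helper (not part of seamlesssim): wraps a single design in an
# outer list and fills any missing module2..module4 with list(name = "none").
normalize_design_list <- function(design_list) {
  # A single design supplied as a list of lists has module1 at the top level
  if ("module1" %in% names(design_list)) {
    design_list <- list(design_list)
  }
  lapply(design_list, function(design) {
    stopifnot("module1" %in% names(design))
    for (module in paste0("module", 2:4)) {
      if (is.null(design[[module]])) {
        design[[module]] <- list(name = "none")
      }
    }
    # Return the modules in a fixed order
    design[paste0("module", 1:4)]
  })
}

# Lowest level: the choices for one module
crm_stage1 <- list(name = "crm", n = 25, starting_dose = 3,
                   skeleton = c(0.10, 0.15, 0.25), beta_scale = 0.1,
                   dose_cohort_size = 3, dose_cohort_size_first_only = TRUE,
                   earliest_stop = 6)
isoreg_stage2 <- list(name = "bayes_isoreg", prob_threshold = 0.87,
                      alpha_scale = 1e-7, include_stage1_data = TRUE)

# Highest level: one entry per design; middle level: modules of that design
two_designs <- list(
  list(module1 = crm_stage1),
  list(module1 = crm_stage1, module4 = isoreg_stage2)
)
normalized <- normalize_design_list(two_designs)

# A single design given as a list of lists is wrapped into a length-1 list
single <- normalize_design_list(list(module1 = crm_stage1))
```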

stan_args

A list containing eight named elements for the Bayesian isotonic regression. For users unfamiliar with Stan, stan_args can be left as NA (the default), and all of the defaults will be used. Alternatively, users can modify any or all of these arguments, leaving the others as defaults or NA:

n_mc_warmup

A positive integer giving the number of desired warmup runs; the default is 1000

n_mc_samps

A positive integer giving the number of additional samples to run after warmup is completed; the default is 2000

mc_chains

A positive integer indicating the number of chains to run in parallel, which will multiply the final number of samples; the default is 4

mc_thins

A positive integer indicating the number of iterations to thin by (increasing thinning will decrease the final number of samples); the default is 1

mc_stepsize

A numeric value between 0 and 1 that is passed to control in the call to stan() as the stepsize argument; the default is 0.1

mc_adapt_delta

A numeric value between 0 and 1 that is passed to control in the call to stan() as the adapt_delta argument; the default is 0.8

mc_max_treedepth

A positive integer passed to control in the call to stan() as the max_treedepth argument; the default is 15

ntries

A positive integer. The Stan algorithm throws warnings about divergent transitions, which indicate an inability to fully explore the posterior space. Sometimes the number of divergent transitions is extremely large, which suggests that the fitted model needs to be reparametrized. In this case, however, divergent transitions seem to be sporadic. ntries indicates how many reruns of the algorithm should be tried when more than 0 divergent transitions are encountered. The run with the fewest such transitions is kept. The default is 2.

For users unfamiliar with Stan who still wish to use Bayesian isotonic regression, some or all of these arguments may be left as NA, and the default specifications will be used.
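The way defaults are filled in, and the keep-the-best-run behaviour described for ntries, can be sketched as follows. This is an illustration only: resolve_stan_args and fit_with_retries are hypothetical names, the default values are copied from the argument descriptions above, and treating ntries as the maximum total number of attempts is an assumption (the package may count reruns differently). No Stan model is fitted; fit_once stands in for a sampler call and only has to return a list with an n_divergent element.

```r
# Defaults as documented above
default_stan_args <- list(
  n_mc_warmup = 1000, n_mc_samps = 2000, mc_chains = 4, mc_thins = 1,
  mc_stepsize = 0.1, mc_adapt_delta = 0.8, mc_max_treedepth = 15, ntries = 2
)

# Hypothetical helper: elements that are missing or NA fall back to the default
resolve_stan_args <- function(stan_args = NA) {
  resolved <- default_stan_args
  if (!is.list(stan_args)) return(resolved)
  for (arg in names(resolved)) {
    value <- stan_args[[arg]]
    if (!is.null(value) && !anyNA(value)) resolved[[arg]] <- value
  }
  resolved
}

# Hypothetical helper: try up to ntries fits, stop early at 0 divergent
# transitions, and keep the fit with the fewest divergent transitions.
fit_with_retries <- function(fit_once, ntries) {
  best <- NULL
  for (attempt in seq_len(ntries)) {
    fit <- fit_once(attempt)
    if (is.null(best) || fit$n_divergent < best$n_divergent) best <- fit
    if (best$n_divergent == 0) break
  }
  best
}

custom <- resolve_stan_args(list(mc_adapt_delta = 0.95, mc_chains = NA))

# Stand-in for a sampler: made-up divergence counts for attempts 1 and 2
fake_divergences <- c(7, 3)
best_fit <- fit_with_retries(
  function(attempt) list(attempt = attempt, n_divergent = fake_divergences[attempt]),
  ntries = custom$ntries
)
```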

sim_labels

A vector of any type whose length must equal n_sim. It will be included in the final data.frame of results under the column name sim_id. It allows the user to uniquely identify simulations and is useful when this function is run in parallel.

design_labels

A vector of any type whose length must equal length(design_list). It will be included in the final data.frame of results under the column name design. It allows the user to uniquely identify designs and is useful when this function is run in parallel.
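When the simulator is called several times in parallel, the labels can be constructed so that results remain distinguishable after they are combined. The sketch below shows one possible scheme; make_job_labels is a hypothetical helper, the design names are invented, and using array_id as the random seed is an assumption made so that each job draws different simulated trials.

```r
# Hypothetical helper (not part of seamlesssim): builds label arguments for
# one parallel job so that sim_id values do not collide across jobs.
make_job_labels <- function(array_id, n_sim, design_names) {
  list(
    array_id = array_id,
    # Job 1 gets 1..n_sim, job 2 gets (n_sim + 1)..(2 * n_sim), and so on
    sim_labels = (array_id - 1) * n_sim + seq_len(n_sim),
    design_labels = design_names,
    # Assumption: a distinct seed per job
    random_seed = array_id
  )
}

design_names <- c("crm_only", "crm_plus_isoreg")  # invented names
job1 <- make_job_labels(array_id = 1, n_sim = 10, design_names = design_names)
job2 <- make_job_labels(array_id = 2, n_sim = 10, design_names = design_names)
```

Each element of the returned list would then be passed to the argument of the same name, with design_labels matching a design_list of length 2.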

do_efficient_simulation

If TRUE, the simulator will run in such a way that, to the maximum possible extent, simulated data will be reused between consecutive designs. So, for example, design 1 may be identical to design 2 up to module 3, in which case the data from modules 1 and 2 can be reused from design 1 to design 2. If FALSE, each design will be simulated independently of each other design, but the whole simulator will take longer to run.
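The idea of reusing data between consecutive designs can be sketched by checking, module by module, how far each design agrees with the one before it. This is only an illustration of the concept: shared_with_previous is a hypothetical helper, and the logical matrix it returns is not claimed to match the encoding of the shared_design_elements matrix returned by the package.

```r
# Hypothetical helper (not part of seamlesssim): for each design, flag the
# leading modules that are identical to those of the preceding design.
shared_with_previous <- function(design_list) {
  modules <- paste0("module", 1:4)
  shared <- matrix(FALSE, nrow = length(design_list), ncol = 4,
                   dimnames = list(NULL, modules))
  for (i in seq_along(design_list)[-1]) {
    for (j in 1:4) {
      same <- identical(design_list[[i]][[modules[j]]],
                        design_list[[i - 1]][[modules[j]]])
      # Reuse is only possible up to the first module that differs
      if (!same) break
      shared[i, j] <- TRUE
    }
  }
  shared
}

crm_stage1 <- list(name = "crm", n = 25, starting_dose = 3,
                   skeleton = c(0.10, 0.15, 0.25), beta_scale = 0.1,
                   dose_cohort_size = 3, dose_cohort_size_first_only = TRUE,
                   earliest_stop = 6)
designs <- list(
  list(module1 = crm_stage1,
       module4 = list(name = "bayes_isoreg", prob_threshold = 0.87,
                      alpha_scale = 1e-7, include_stage1_data = TRUE)),
  list(module1 = crm_stage1,
       module4 = list(name = "bayes_isoreg", prob_threshold = 0.90,
                      alpha_scale = 1e-7, include_stage1_data = TRUE))
)
shared <- shared_with_previous(designs)
```

Here the two designs differ only in module 4, so in this sketch modules 1 to 3 of the second design are flagged as reusable from the first.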

verbose

If TRUE, the simulator will print all Stan and other function output; if FALSE, it will not. The default is FALSE.

random_seed

A positive integer seed set prior to starting the simulations.

stan_seed

A positive integer used to randomly select the initial values for Stan sampling. The default is 1.

Value

The function returns a named list with entries:

patient_data

A data.frame with number of rows equal to number of individual patients simulated across all simulations of all designs, i.e. if every single design were to enroll the maximum possible number of patients, say, n, the number of rows would be n * length(design_list) * n_sim.

sim_data_stage1

A data.frame with number of rows equal to length(design_list) * n_sim, i.e. one per design per simulation. It gives trial-level summary information about the status of the trial at the end of module 2, which is also the end of stage 1, of each design. Some key columns to note are (i) estMTD: the dose level estimated to be the MTD at the end of stage 1; (ii) estMTDCode: either '1Y' (where the '1' refers to the stage and the 'Y' refers to the module not stopping for toxicity) or '1N' (where the 'N' refers to the module stopping for toxicity); (iii) RP2D: the dose level that would be recommended right now; (iv) RP2DCode: either '1N' (where the '1' refers to the stage and the 'N' refers to the module stopping for futility) or '1Y' (where the 'Y' refers to stage 1 completing); (v) bestP2D: is the RP2D the dose with the highest efficacy that is still safe?

sim_data_stage2

A data.frame with number of rows equal to length(design_list) * n_sim, i.e. one per design per simulation. It gives trial-level summary information about the status of the trial at the end of module 4 (end of stage 2) of each design. Some key columns to note are (i) estMTD: the dose level estimated to be the MTD at the end of stage 2; (ii) estMTDCode: '1TN' (the trial stopped for toxicity during stage 1, i.e. didn't even proceed to stage 2), '1EN' (the trial stopped for futility at the end of stage 1), '2TN' (the trial stopped for toxicity during stage 2), or '2Y' (the trial completed); (iii) RP2D: the dose level that would be recommended right now; (iv) RP2DCode: '1TN' (the trial stopped for toxicity during stage 1, i.e. didn't even proceed to stage 2), '1EN' (the trial stopped for futility at the end of stage 1), '2TN' (the trial stopped for toxicity during stage 2), '2EN' (the trial stopped for futility at the end of stage 2), or '2Y' (the trial completed); (v) bestP2D: is the RP2D the dose with the highest efficacy that is still safe?
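Operating characteristics can be obtained by tabulating these columns per design. The sketch below does this on a small, entirely made-up data.frame that only mimics the documented column names; the values are not simulation results, and storing bestP2D as a logical is an assumption.

```r
# Toy stand-in for sim_data_stage2: made-up values, documented column names
toy_stage2 <- data.frame(
  design = rep(c("design_A", "design_B"), each = 4),
  sim_id = rep(1:4, times = 2),
  RP2DCode = c("2Y", "2Y", "1TN", "2EN", "2Y", "1EN", "2Y", "2Y"),
  bestP2D = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Proportion of simulated trials ending with each RP2DCode, by design
code_props <- prop.table(table(toy_stage2$design, toy_stage2$RP2DCode), margin = 1)

# Proportion of simulated trials that recommended the best dose, by design
best_props <- tapply(toy_stage2$bestP2D, toy_stage2$design, mean)
```

The same two calls should apply to the sim_data_stage2 element of an actual result, provided the column types match the assumptions stated above.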

dose_outcome_curves

The user-inputted argument to this function having the same name.

titecrm_args

The list of common arguments that were used for the crm simulator.

design_list

The user-inputted argument to this function having the same name.

design_description

A character matrix with number of rows equal to length(design_list) and number of columns equal to the total number of modules used in the trial, presumably 4. It is meant to give a concise, simple summary and comparison of each design, without going into the details of each design.

shared_design_elements

An integer matrix with number of rows equal to length(design_list) and number of columns equal to the total number of modules used in the trial, presumably 4. It gives the simulator's assessment of which design elements could be recycled (therefore saving time if do_efficient_simulation is TRUE).

random_seed

The user-inputted argument to this function having the same name.

References

Boonstra et al. (2021), as cited in the Description (bibliography key: boonstra2020seamlesssim).

Examples

twostage_simulator(
 n_sim = 10,
 primary_objectives =
   c(tox_target = 0.25,
     tox_delta_no_exceed = 0.05,
     eff_target = 0.70),
 dose_outcome_curves =
   list(tox_curve = c(0.10,0.15,0.25),
        eff_curve = c(0.45,0.55,0.65),
        scenario = 1),
 design_list =
   list(
     list(
       module1 = list(
         name = "crm",
         n = 25,
         starting_dose = 3,
         skeleton = c(0.10,0.15,0.25),
         beta_scale = 0.1,
         dose_cohort_size = 3,
         dose_cohort_size_first_only = TRUE,
         earliest_stop = 6),
       module4 = list(
         name = "bayes_isoreg",
         prob_threshold = 0.87,
         alpha_scale = 1e-7,
         include_stage1_data = TRUE)
     )
   )
)


elizabethchase/seamlesssim documentation built on Aug. 10, 2022, 2:55 a.m.