In cancerpolicy/bcimodel: Model the impact of early detection and treatment interventions for breast cancer

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
knitr::opts_chunk$set(echo = TRUE)

Overview

The bcimodel package is an extension of the screentreat package, detailed in The Effect of Treatment Advances on the Mortality Results of Breast Cancer Screening Trials: A Microsimulation Model. The packages contain the same breast cancer microsimulation model allows the investigation of the impact of various breast cancer treatment and screening interventions on a virtual cohort of women.

bcimodel has an improved code structure that allows for relatively fast computation of (in theory) an unlimited number of intervention scenarios, whereas screentreat can accommodate only one intervention scenario. bcimodel also has embedded data sets of cancer incidence and all-cause mortality from global agencies as well as the United States.

This vignette will walk through an example analysis of breast cancer in Uganda.

Input data

The input data for the example are pre-loaded into the package in an object called ex1, a list with four elements: $pol, $nh, map, and $tx.

library(bcimodel)
data(ex1)

Policies to model

ex1$pol

"pol" stands for "policy", i.e. the scenario being modeled. This element of ex1 shows how to specify a series of policies to investigate. It is a data frame with the following columns:

num: The policy number
id: A short string ID'ing the policy
name: Longer description of the policy
pairnum: For policiess with no early detection, NA. For policies with early detection, the policy number of the "paired" policy that has the same treatments but does not have early detection. In the example, policies 1 & 2 do not have early detection. Policy 3 has early detection and its policy pair is #2.
earlydetHR: For policies with no early detection, 1.0. For policies with early detection, the hazard ratio (HR) on advanced-stage incidence. In policy 3, the HR of 0.7 indicates a 30% reduction in advanced-stage incidence, or a 30% stage shift.

Note that policy 1 should always be the "base case", the scenario that most closely matches the status quo/standard of care.

Policies can differ from each other on the degree of early detection and/or the treatment distribution. Treatment for each policy is specified later, in the ex1$tx object.

Natural history parameters

ex1$nh

"nh" stands for "natural history." This data frame of class naturalhist specifies the key natural history parameters of the model: population proportion and the mortality rate of each "stage-subgroup".

Two stages are allowed: "Early" or "Advanced." There is no limit on the number of subgroups or expectations for their names. They just need to be specified for each stage. In the example, there are two subgroups: "ER+" and "ER-", for a total of four stage-subgroups.

If subgroup and stage are uncorrelated and mortality rates vary only by stage, use the compile_naturalhist function to create this data frame:

# 85% are advanced stage
# Mortality rates are .05 for Early and 0.21 for Advanced
# 50% are ER+
(nh1 <- compile_naturalhist(prop_adv=0.85, 
                            mortrates=c(Early=0.05, Advanced=0.21), 
                            subgroup_probs=c(`ER+`=0.5, `ER-`=0.5)))

If you have about 15% HER2+, you could similarly specify that:

# 85% are advanced stage
# Mortality rates are .05 for Early and 0.21 for Advanced
# 50% are ER+, 15% are HER2+, so uncorrelated gives 0.5*0.15 = .075 ER+HER2+
(nh2 <- compile_naturalhist(prop_adv=0.85, 
                            mortrates=c(Early=0.05, Advanced=0.21),
                            subgroup_probs=c(`ER+HER2+`=0.075,
                                             `ER-HER2+`=0.075,
                                             `ER+HER2-`=0.425,
                                             `ER-HER2-`=0.425)))

If you do not use compile_naturalhist, you'll need to append the class naturalhist to the natural history data frame:

class(your_nh_data_frame) <- append(class(your_nh_data_frame), "naturalhist")

I probably had good intentions for making a separate class for these data, but I can't remember them now!

Stage-shift map

The purpose of the stage-shift map, e.g. ex1$map, is to keep the subgroup constant when stage is shifted due to early detection. It arranges the numeric IDs of the stage-subgroups, i.e. their row numbers in the natural history data frame (ex1$nh), in a matrix that constrains the stage-shifting.

ex1$map

In the example, advanced ER+ is stage-group 3. The shift_stages function in bcimodel (see ?bcimodel::shift_stages) will use the map to ensure that stage-shifted advanced ER+ get shifted to group 1, e.g. early ER+. Advanced ER-, group 4, will get shifted to early ER-, group 2.

The function create_stageshift_map can create a stageshift map from a natural history data frame of class naturalhist. For example, using the natural history example above where there is 15% HER2+:

create_stageshift_map(nh2)

Treatment

ex1$tx

The treatment data frame is the most complicated input, as you can see in ex1$tx. Each row represents a different stage-subgroup-specific treatment. Columns are

SSno: Stage-subgroup number. Should correspond to the stage-subgroups as they are listed in the natural history data frame, e.g. ex1$nh.
SSid: Text ID of the stage-subgroup
txSSno: Numeric ID for the treatment. Every treatment within a stage-subgroup should have a unique row and ID, e.g. "None" for early-stage ER+ is different from "None" for early-stage ER- or advanced-stage ER+.
txSSid: Text ID for the treatment.
txHR: Hazard ratio on baseline survival for the treatment. This should take into account the stage-subgroup, e.g Tamoxifen in ER- has a hazard ratio of 1.0, but in ER+ it has a hazard ratio of 0.7.
Subsequent columns: One for each of the scenarios in the policy data frame, e.g. ex1$pol, with column names corresponding to the policy "id" column (ex1$pol$id). For each policy/column, specify the proportion receiving each treatment within each stage-subgroup. So proportions sum to 1 within stage-subgroups. For example, in ex1$tx, for all stage-subgroups in the "base" scenario, 20% receive Tamoxifen and 80% receive none.

A note about "baseline" survival and treatment hazard ratios

In the "tam" scenario in ex1$tx, tamoxifen is exclusively targeted to the ER+ (100%), whereas the ER- receive no treatment. This intervention could represent two different scenarios, depending on the baseline survival parameter entered in the natural history (ex1$nh):

Baseline survival = no adjuvant treatment at all. The "tam" scenario will thus reflect a shift to useful treatment for ER+ and a shift back to no treatment at all for ER-.
Baseline survival = fully treated with chemo. Because the EBCTCG's meta-analyses of adjuvant treatment indicate that adjuvant treatment hazard ratios are multiplicative (see references in the Annals paper), your baseline survival could represent survival in the presence of chemo for all. In this case, the "tam" scenario would reflect a shift to chemo+Tamoxifen for ER+ and a shift back to chemo only for all ER-.

Note that you CANNOT have baseline survival represent survival when only some fraction of cases are treated. If those are the data available, you must use an estimate of the proportion treated to back-out untreated survival. Then, explicitly represent the % treated with chemo in the base scenario, and add tamoxifen and/or tamoxifen+chemo in the intervention.

Running the model

To run the example, see ?bcimodel::simpolicies. The default number of simulations is 5, for speed, and let's decrease the default pop size of 100,000 to 10,000 to make it even faster:

set.seed(98103)
uganda_stdpop <- bcimodel::simpolicies(ex1$pol, ex1$nh, ex1$tx, popsize=10000)

View 5-year outcomes

Default follow-up times are 5 and 10 years. Using knitr::kable for nice HTML viewing,

knitr::kable(uganda_stdpop[['5']]) # could also specify using uganda_stdpop$`5`

Parallelized version

There is a parallelized version of simpolicies() called parsimpolicies in which you specify the number of cores available. It's not worth using for a small number of sims, but it makes running 100 sims considerably faster.

However, starting the worker processes uses memory, and I think that's why I wasn't able to get parsimpolicies to work on the web app.

# Not run in vignette - three policies, 100 sims, with pop size 100,000
set.seed(98103)
uganda_stdpop <- bcimodel::simpolicies(ex1$pol, ex1$nh, ex1$tx, ncores=4)