knitr::opts_chunk$set(echo = TRUE)
library(knitr)
setwd("C:/Users/Si Cheng/OneDrive - UW/19winter/biomarker/SurvPET/")
#setwd("C:/Users/Si Cheng/OneDrive - UW/kidney_biomarker/suppl/")

Update in the newest version (v4, 9/2019): Added the ability to simulate datasets containing biomarker and survival observations. The simulation function currently supports only a constant baseline hazard.

Update history: v3, 5/21/2019: Incorporated an alternative method for calculating event rates. The method comes from Heagerty et al. (2013) and uses a kernel-smoothed version of the Kaplan-Meier survival estimator. It allows the censoring process to depend on the biomarker, and it guarantees monotone ROC curves (relevant if the user is interested in the prognostic capacity of a biomarker as represented by time-dependent ROC curves).

Prognostic enrichment is a clinical trial strategy of evaluating an intervention in a patient population with a higher rate of the unwanted event than the broader patient population (R. Temple, 2010). A higher event rate translates to a smaller required sample size for the clinical trial, which can have both practical and ethical advantages. The package BioPET-Surv provides tools to evaluate biomarkers for prognostic enrichment of clinical trials with survival or time-to-event outcomes. (Most of this paragraph is taken from the documentation of BioPET.)

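Key functions of this package are sim_data, surv_enrichment and surv_plot_enrichment, each of which is described in its own section of this document.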

Loading the functions and data

### The first two lines are from the legacy GitHub repository "SurvPET"
#source("https://raw.githubusercontent.com/chengs94/SurvPET/master/surv_enrichment.R")
#source("https://raw.githubusercontent.com/chengs94/SurvPET/master/surv_plot_enrichment.R")
source("https://raw.githubusercontent.com/chengs94/BioPETsurv/master/R/surv_enrichment.R")
source("https://raw.githubusercontent.com/chengs94/BioPETsurv/master/R/surv_plot_enrichment.R")
source("https://raw.githubusercontent.com/chengs94/BioPETsurv/master/R/sim_data.R")
load("sim_data.RData")

Simulating a dataset

To call sim_data:

sim_data(n = 500, covariates = NULL, beta = NULL, biomarker = "normal", effect.size = 0.25,
         baseline.hazard = "constant", end.time = 10, end.survival = 0.5, prob.censor = 0.2,
         seed = 2333)

This function simulates a dataset given distributions of the biomarker and the relation between survival and biomarker values.

Explanation for function arguments:

Example: we simulate a dataset with two covariates and a normally-distributed biomarker, with effect size 0.25 (HR=exp(0.25)=1.28) and a constant baseline hazard.

library(survival)  # provides Surv() and coxph()
set.seed(2333)     # make the simulated covariates reproducible
covariates <- matrix(rnorm(500*2), ncol=2)
beta <- c(0.2, 0.3)
dat <- sim_data(n=500, covariates = covariates, beta = beta, biomarker = "normal", effect.size = 0.25,
                baseline.hazard = "constant", end.time = 10, end.survival = 0.5, prob.censor = 0.2)
head(dat)
mean(dat$event)  # proportion of observed events
dat$surv <- Surv(dat$time.observed, dat$event)
fit <- coxph(surv~biomarker+x1+x2, data=dat)
summary(fit)

The proportion of censoring and the effect sizes estimated by coxph agree with the arguments specified in the function call, up to sampling variability.
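To make the roles of effect.size, end.time and end.survival more concrete, here is a minimal sketch of how survival times with a constant baseline hazard can be simulated. It is not the implementation of sim_data: the variable names are chosen for illustration, covariates are left out, and censoring is assumed to be purely administrative at end.time, so prob.censor is not modeled.

```r
# Sketch only: NOT the sim_data() implementation.
set.seed(1)
n            <- 500
effect.size  <- 0.25   # log hazard ratio per unit increase of the biomarker
end.time     <- 10
end.survival <- 0.5

biomarker <- rnorm(n)

# Constant baseline hazard chosen so that S0(end.time) = end.survival
lambda0 <- -log(end.survival) / end.time

# Under a Cox model with constant baseline hazard, the event time is
# exponential with rate lambda0 * exp(effect.size * biomarker);
# draw it by inverse-transform sampling.
hazard     <- lambda0 * exp(effect.size * biomarker)
time.event <- -log(runif(n)) / hazard

# Assumption: administrative censoring at end.time only
time.observed <- pmin(time.event, end.time)
event         <- as.integer(time.event <= end.time)

toy <- data.frame(biomarker, time.observed, event)
head(toy)
```

Fitting coxph(Surv(time.observed, event) ~ biomarker, data = toy) to such data should give a coefficient near 0.25, which is the same kind of check as the one performed on dat above.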

As an alternative to a simulated dataset, the user can use the built-in dataset, which contains 1533 observations of three variables: two biomarkers ($X_1$ and $X_2$) and the survival outcome (time to event and indicator of event).

head(sim.data)

Prognostic enrichment with real data

To call surv_enrichment:

surv_enrichment(formula, data, hr = 0.8, end.of.trial=NULL, a=NULL, f=NULL,
               method = "KM", lambda = 0.05,
               cost.screening = NULL, cost.keeping = NULL, cost.unit.keeping = NULL,
               power = 0.9, alpha = 0.05, one.sided = F,
               selected.biomarker.quantiles = seq(from = 0, to = 0.95, by = 0.05),
               do.bootstrap = FALSE, n.bootstrap = 1000, seed = 2333,
               print.summary.tables = FALSE)

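This function applies to two types of trials: fixed-duration trials, in which all patients are followed for the same length of time (specified through end.of.trial), and "accrual + follow-up" trials, which have an accrual period a followed by a follow-up period f.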

Explanation for arguments (all are required if not specified otherwise):

Example 1: Use biomarker x1 to enrich the trial, and consider trials lasting 36 and 48 months. All patients are followed for the same duration. The cost of measuring one patient's biomarker level is 50, and the costs of keeping one patient in a 36-month and a 48-month trial are 800 and 1000, respectively.

result1 <- surv_enrichment(formula = surv~x1, data = sim.data, hr = 0.8, end.of.trial = c(36,48),
                           cost.screening = 50, cost.keeping = c(800,1000),
                           power = 0.9, alpha = 0.05, one.sided = FALSE,
                           selected.biomarker.quantiles = seq(from = 0, to = 0.98, by = 0.02),
                           print.summary.tables = FALSE)
names(result1)

The output is a list with the elements printed above. The first element is the table of summary statistics. The 2nd to 11th elements are estimates and standard errors for: 1) event probability at the end of the trial; 2) sample size required; 3) total number of patients whose biomarker levels must be measured to achieve that sample size; 4) total cost (screening + trial); 5) reduction (%) in cost compared with no enrichment. These are all matrices in which each row corresponds to one enrichment level and each column to one trial duration.

The last 8 elements store the user's inputs. acc.fu indicates whether the trial is of the "accrual + follow-up" type.
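As a rough illustration of how the event probability drives the sample size, the number of patients to screen, the total cost and the cost reduction, the following sketch uses Schoenfeld's approximation for the number of events required by a two-sided log-rank test with 1:1 allocation. This is one common approximation and not necessarily the calculation that surv_enrichment performs. The helper enrichment_sketch, its arguments and the event probabilities passed to it are hypothetical.

```r
# Hypothetical helper for illustration; not part of the package.
# event.prob: control-arm event probability by the end of the trial,
#             one value per enrichment level
# quantiles:  biomarker quantile used as the enrichment threshold
enrichment_sketch <- function(event.prob, quantiles, hr = 0.8, power = 0.9,
                              alpha = 0.05, cost.screening = 50,
                              cost.keeping = 800) {
  # Schoenfeld approximation: number of events needed (two-sided, 1:1)
  n.events <- 4 * (qnorm(1 - alpha / 2) + qnorm(power))^2 / log(hr)^2
  # Under proportional hazards, S_treated(t) = S_control(t)^hr
  event.prob.treated <- 1 - (1 - event.prob)^hr
  avg.event.prob <- (event.prob + event.prob.treated) / 2
  n.trial <- ceiling(n.events / avg.event.prob)
  # Only a fraction (1 - quantile) of screened patients is eligible
  n.screen <- ceiling(n.trial / (1 - quantiles))
  # Assumption: no screening cost when there is no enrichment (quantile 0)
  cost <- cost.keeping * n.trial + cost.screening * n.screen * (quantiles > 0)
  # Reduction is relative to the first row, assumed to be "no enrichment"
  data.frame(quantile = quantiles, n.trial, n.screen, cost,
             reduction.pct = 100 * (1 - cost / cost[1]))
}

# Made-up event probabilities that increase with the enrichment level
out <- enrichment_sketch(event.prob = c(0.20, 0.30, 0.40),
                         quantiles  = c(0, 0.25, 0.50))
print(out)
```

The trade-off is visible in the output: a stricter threshold lowers the number of trial participants but raises the number of patients who must be screened, so the cost reduction depends on the ratio of screening cost to per-patient trial cost.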

Example 2: Use biomarker x2 to enrich the trial, which has an accrual period of 24 months and a follow-up period of 24 months. A patient leaves the trial (i.e. no longer incurs cost) after experiencing an event. The cost per month for one patient is 20. Bootstrap standard errors are computed with 200 bootstrap samples (a small number, chosen only for illustration).

result2 <- surv_enrichment(formula = surv~x2, data = sim.data, hr = 0.8, a=24, f=24,
                           cost.screening = 50, cost.unit.keeping = 20,
                           power = 0.9, alpha = 0.05, one.sided = FALSE,
                           selected.biomarker.quantiles = seq(from = 0, to = 0.98, by = 0.02),
                           do.bootstrap = TRUE, n.bootstrap = 200, seed = 233,
                           print.summary.tables = TRUE)

Example 3: Use biomarker x1 to enrich the trial, and consider a fixed-length trial lasting 48 months. The cost of measuring one patient's biomarker level is 50, and the cost of keeping one patient in the trial is 1000. NNE estimates are used for the survival probabilities.

result3 <- surv_enrichment(formula = surv~x1, data = sim.data, hr = 0.8, end.of.trial = 48,
                           method = "NNE", lambda=0.05,
                           cost.screening = 50, cost.keeping = 1000,
                           power = 0.9, alpha = 0.05, one.sided = FALSE,
                           selected.biomarker.quantiles = seq(from = 0, to = 0.9, by = 0.1),
                           print.summary.tables = TRUE)

Visualizing results from an enrichment analysis

The function surv_plot_enrichment makes a set of plots based on an object x returned by surv_enrichment. To call this function:

surv_plot_enrichment(x, km.quantiles = c(0,0.25,0.5,0.75),
                    km.range = NULL, alt.color = NULL)

km.quantiles are the levels of enrichment that the user wants to compare in the Kaplan-Meier survival curves. km.range is the range of time shown in the Kaplan-Meier survival plot (by default, up to the last observed time point). alt.color allows the user to specify the colors of the curves (the default ggplot2 colors are used if it is not specified).

If x was computed for multiple trial durations, they will be plotted together for comparison. In this situation, the length of alt.color must match that of end.of.trial. If the user wishes to compare multiple biomarkers or multiple levels of power in the same set of plots, they can take the outputs from surv_enrichment and construct the plots manually.

If x contains standard error estimates, error bars will be added to the plots.

Example 1 cont'd: plots with customized color

plots1 <- surv_plot_enrichment(result1, alt.color = c("salmon","royalblue"))
names(plots1)

This function returns six plots: 1) Kaplan-Meier survival curves of patients with biomarker levels above certain quantiles; 2) event rate at the end of trial; 3) sample size required; 4) number of patients that need to be screened; 5) screening + trial cost; 6) reduction in cost. A combination of these six plots will automatically be printed (as above).
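The quantity behind plots 1) and 2), the event probability among patients whose biomarker exceeds a given quantile, can be sketched directly with the survival package. This is an illustrative sketch on toy data, not the code used by surv_enrichment or surv_plot_enrichment. The helper km_event_prob and the toy data frame are hypothetical.

```r
library(survival)

# Hypothetical helper (not part of the package): Kaplan-Meier event
# probability at time t among patients whose biomarker is at or above
# its q-th quantile.
km_event_prob <- function(time, event, biomarker, q, t) {
  keep <- biomarker >= quantile(biomarker, probs = q)
  d    <- data.frame(time = time[keep], event = event[keep])
  fit  <- survfit(Surv(time, event) ~ 1, data = d)
  1 - summary(fit, times = t, extend = TRUE)$surv
}

# Toy data in which a larger biomarker value means a higher hazard
set.seed(1)
x       <- rnorm(2000)
t.event <- rexp(2000, rate = 0.02 * exp(0.5 * x))
t.cens  <- rexp(2000, rate = 0.005)
toy <- data.frame(x     = x,
                  time  = pmin(t.event, t.cens),
                  event = as.integer(t.event <= t.cens))

# Event probability at month 36 for four enrichment levels
probs <- sapply(c(0, 0.25, 0.5, 0.75), function(q)
  km_event_prob(toy$time, toy$event, toy$x, q, t = 36))
probs
```

With a prognostic biomarker, the event probability rises with the enrichment level, which is what lowers the required sample size.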

Example 2 cont'd: use range $t=0$ to $t=60$ in Kaplan-Meier plot

plots2 <- surv_plot_enrichment(result2, km.range = 60)

We then make a call similar to Example 2, but with biomarker x1 in place of x2. Manually plotting result2 and this result together gives:

include_graphics("C:/Users/Si Cheng/OneDrive - UW/19winter/biomarker/SurvPET/temp_plots/SurvPET_sim_example.jpeg")

Example 3 cont'd: plots of trial statistics using NNE estimates.

plots3 <- surv_plot_enrichment(result3)


chengs94/BioPETsurv documentation built on Nov. 25, 2019, 9:45 a.m.