About This Report

pxtablek <- function(x, ...) { 
    if(class(x)=='list'){
        print(xtable::xtable(x[[1]], digits=x$digits, caption=paste('</br>',x$caption)  ),type='html', ...) 
    }else{
        print(xtable::xtable(x),type='html', ...) 
    }
}
opts_chunk$set(echo=FALSE, results='asis',fig.path='tempFiguresForKnitrReport/', fig.width=9, fig.height=7)
#BUGS? hrefs aren't working, ask leo? some of the Eqs work in RStudio, but not via knit()?
#This is just a chunk for testing it in RStudio
#source('/Users/aaronfisher/Documents/JH/Michael - Shiny App /new_gui/EAGLE Repo/eagle_gui/shinyApp/Adaptive_Group_Sequential_Design.R')
#-table1 <- table_constructor()
#Citations
library(knitcitations)
bibFile<-read.bibtex('interAdapt_noJSScode.bib') #file without \pkg{} and \progLang{} markup
cite_options(linked=FALSE) #You don't have the full links in your bib file, so the links aren't working.

This report was created using the interAdapt software for generating and analyzing trial designs with adaptive enrollment criteria. interAdapt can be accessed online at

http://spark.rstudio.com/mrosenblum/interAdapt

Additional documentation for the interAdapt, including instructions on how to download the application for offline use, can be found at

https://rawgithub.com/aaronjfisher/interAdapt/master/About_interAdapt.pdf

Table of Contents:


Introduction

In this report, we consider the scenario where we have prior evidence that the treatment might work better in a one subpopulation than another. We use the term "adaptive design" to refer to a group sequential design that starts by enrolling from both subpopulations, and then decides whether or not to continue enrolling from each subpopulation based on interim analyses. We use the term "standard designs" to refer to group sequential designs where the enrollment criteria are fixed.

Below, we describe an adaptive design in more detail, and compare the performance of this design to the performance of standard designs. Performance is compared in terms of expected sample size, expected trial duration, and power, with family-wise type I error rate set to be constant (r alpha_FWER_user_defined) for all trials.


Full List of Inputs

inputVec<-matrix(rep(NA,length=length(allVarNames)),ncol=1)
for(i in 1:length(allVarNames)) inputVec[i]<- input[[ allVarNames[i] ]]
rownames(inputVec)<-allVarLabels
pxtablek(inputVec,include.colnames=FALSE)

Decision Boundaries

boundary_adapt_plot()
pxtablek(adaptive_design_sample_sizes_and_boundaries_table())

boundary_standard_H0C_plot()
pxtablek(standard_H0C_design_sample_sizes_and_boundaries_table())

boundary_standard_H01_plot()
pxtablek(standard_H01_design_sample_sizes_and_boundaries_table())

Performance Comparison Plots

power_curve_plot()

expected_sample_size_plot()

expected_duration_plot()

Performance Comparison Table

ptab<-transpose_performance_table(performance_table())
pxtablek(ptab,include.colnames=FALSE)

Problem Description

We consider the problem of designing a randomized trial to test whether a new treatment is superior to control, for a given population (e.g., those with intracerebral hemorrhage in the MISTIE example). Consider the case where we have two subpopulations, referred to as subpopulation $1$ and subpopulation $2$, which partition the overall population of interest. These must be specified before the trial starts, and be defined in terms of participant attributes measured at baseline (e.g., having a high initial severity of disease or a certain biomarker value). We focus on situations where there is suggestive, prior evidence that the treatment may be more likely to benefit subpopulation $1$. In the MISTIE trial example, subpopulation 1 refers to small IVH participants, and subpopulation 2 refers to large IVH participants. Let $π_1$ and $π_2$ denote the proportion of the population in subpopulations 1 and 2, respectively.

Both the adaptive and standard designs discussed here involve enrollment over time, and include predetermined rules for stopping the trial early based on interim analyses. Each trial consists of $K$ stages, indexed by $k$. In stages where both subpopulations are enrolled, we assume that the proportion of newly recruited participants in each subpopulation $s \in {1,2}$ is equal to the corresponding population proportion $\pi_s$.

For a given design, let $n_k$ denote the maximum number of participants to be enrolled during stage $k$. The number enrolled during stage $k$ will be less than $n_k$ if the trial is entirely stopped before stage $k$ (so that no participants are enrolled in stage $k$) or if in the adaptive design enrollment is restricted to only subpopulation 1 before stage $k$ (as described in the Decision Rules section). For each subpopulation $s \in {1,2}$ and stage $k$, let $N_{s,k}$ denote the maximum cumulative number of subpopulation $s$ participants who have enrolled by the end of stage $k$. Let $N_{C,k}$ denote the maximum cumulative number of enrolled participants from the combined population by the end of stage $k$, i.e., $N_{C,k}=N_{1,k}+N_{2,k}$. The sample sizes will generally differ for different designs.

Let $Y_{i,k}$ be a binary outcome variable for the $i^{th}$ participant recruited in stage $k$, where $Y_{i,k}=1$ indicates a successful outcome. Let $T_{i,k}$ be an indicator of the $i^{th}$ participant recruited in stage $k$ being assigned to the treatment. We assume for each participant that there is an equal probability of being assigned to treatment ($T_{i,k}=1$) or control $(T_{i,k}=0$), independent of the participant's subpopulation. We also assume outcomes are observed very soon after enrollment, so that all outcome data is available from currently enrolled participants at each interim analysis.

For subpopulation $1$, denote the probability of a successful outcome under treatment as $p_{1t}$, and the probability of a successful outcome under control as $p_{1c}$. Similarly, for subpopulation $2$, let $p_{2t}$ denote the probability of a success under treatment, and $p_{2c}$ denote the probability of a success under control. We assume each of $p_{1c},p_{1t},p_{2c},p_{2t}$ is in the interval $(0,1)$. We define the true average treatment effect for a given population to be the difference in the probability of a successful outcome comparing treatment versus control.

In the remainder of this section we give an overview of the relevant concepts needed to understand and use interAdapt. A more detailed discussion of the theoretical context, and of the efficacy boundary calculation procedure, is provided by r citep(bibFile[["Rosenblum2013AdaptMISTIE"]]).

Hypotheses

We focus on testing the null hypothesis that, on average, the treatment is no better than control for subpopulation $1$, and the analogous null hypothesis for the combined population. Simultaneous testing of null hypotheses for these two populations was also the goal for the two-stage, adaptive enrichment designs of r citep(bibFile[["wangetal2007"]]). We define our two null hypotheses, respectively, as

interAdapt compares different designs for testing these null hypotheses. An adaptive design testing both null hypotheses (denoted $AD$) is compared to two standard designs. The first standard design, denoted $SC$, enrolls the combined population and only tests $H_{0C}$. The second standard design, denoted $SS$, only enrolls subpopulation 1 and tests $H_{01}$. All three trial designs consist of $K$ stages; the decision to entirely stop the trial early can be made at the end of any stage, based on a preplanned rule. The trials differ in that $SC$ and $SS$ never change their enrollment criteria, while $AD$ may switch from enrolling the combined population to enrolling only participants from subpopulation $1$.

The standard designs discussed here are not identical to those discussed in section 6.1 of r citep(bibFile[["Rosenblum2013AdaptMISTIE"]]), which test both hypotheses simultaneously. Implementing standard designs such as those discussed in r citep(bibFile[["Rosenblum2013AdaptMISTIE"]]) into the interAdapt software is an area of future research.

Though it is not of primary interest, we occasionally refer below to the global null hypothesis, defined to be that $p_{1t}-p_{1c}=p_{2t}-p_{2c}=0$, i.e., zero mean treatment effect in both subpopulations.

Test Statistics

Three (cumulative) z-statistics are computed at the end of each stage $k$. The first is based on all enrolled participants in the combined population, the second is based on all enrolled participants in subpopulation 1, and the third is based on all enrolled participants in subpopulation 2. Each z-statistic is a standardized difference in sample means, comparing outcomes in the treatment arm versus the control arm. Let $Z_{C,k}$ denote the z-statistic for the combined population at the end of stage $k$, which takes the following form:

[ Z_{C,k}=\left[ \frac{\sum_{k'=1}^k \sum_{i=1}^{n_{k'}}Y_{i,k'}T_{i,k'} } {\sum_{k'=1}^k \sum_{i=1}^{n_{k'}}T_{i,k'}} - \frac{\sum_{k'=1}^k \sum_{i=1}^{n_{k'}} Y_{i,k'}(1-T_{i,k'})} {\sum_{k'=1}^k \sum_{i=1}^{n_{k'}}(1-T_{i,k'})} \right] V_{C,k}^{-1/2} ]

The term in square brackets is the difference in sample means between the treatment and control groups, and $V_{C,k}$ is the variance of this difference in sample means:

[ V_{C,k}= \left( \frac{2}{ N_{C,k} } \right) \left( \sum_{s ∈ { 1,2}} π_s[p_{sc}(1-p_{sc}) + p_{st}(1-p_{st})] \right) ]

The term in square brackets is the difference in sample means between the treatment and control groups. The term in curly braces is the variance of this difference in sample means. $Z_{C,k}$ is only computed at stage $k$ if the combined population has been enrolled up through the end of stage $k$ (otherwise it is undefined). Our designs never use $Z_{C,k}$ after stages where the combined population has stopped being enrolled. Let $Z_{1,k}$ and $Z_{2,k}$ denote analogous z-statistics restricted to participants in subpopulation $1$ and subpopulation $2$, respectively. These are formally defined in r citep(bibFile[["Rosenblum2013AdaptMISTIE"]]).

Type I Error Control

The familywise (also called study-wide) Type I error rate is the probability of rejecting one or more true null hypotheses. For a given design, we say that the familywise Type I error rate is strongly controlled at level $α$ if for any values of $p_{1c},p_{1t},p_{2c},p_{2t}$ (assuming each is in the interval $(0,1)$), the probability of rejecting at least one true null hypothesis (among $H_{0C}, H_{01}$) is at most $α$. To be precise, we mean such strong control holds asymptotically, as sample sizes in all stages go to infinity, as formally defined by r citep(bibFile[["Rosenblum2013AdaptMISTIE"]]). For all three designs, $AD$, $SC$, and $SS$, we require the familywise Type I error rate to be strongly controlled at level $α$. Since the two standard designs $SS$ and $SC$ each only test a single null hypothesis, the familywise Type I error rate for each design is equal to the Type I error rate for the corresponding, single hypothesis test.

Decision rules for stopping the trial early and for modifying enrollment criteria

The decision rules for the standard design $SC$ consist of efficacy and futility boundaries for $H_{0C}$, based on the statistics $Z_{C,k}$. At the end of each stage $k$, the test statistic $Z_{C,k}$ is calculated. If $Z_{C,k}$ is above the efficacy boundary for stage $k$, the design $SC$ rejects $H_{0C}$ and stops the trial. If $Z_{C,k}$ is between the efficacy and futility boundaries for stage $k$, the trial is continued through the next stage (unless the last stage $k=K$ has been completed). If $Z_{C,k}$ is below the futility boundary for stage $k$, the design $SC$ stops the trial and fails to reject $H_{0C}$. interAdapt makes the simplification that the number of participants $n_k$ enrolled in each stage of $SC$ is a constant, denoted $n_{SC}$, that the user can set.

The efficacy boundaries for $SC$ are set to be proportional to those described by Wang and Tsiatis (1987). Specifically, the efficacy boundary for the $k^{th}$ stage is set to $e_{SC}(N_{C,k}/N_{C,K})^{\delta}$, where $K$ is the total number of stages, $δ$ is a constant in the range $[-.5,.5]$, and $e_{SC}$ is the constant computed by interAdapt to ensure the familywise Type I error rate is at most $\alpha$. Since $n_{k}$ is set equal to $n_{SC}$ for all values of $k$, the maximum cumulative sample size $N_{C,k}$ reduces to $\sum_{k'=1}^k n_{SC}=k n_{SC}$, and the boundary at stage $k$ reduces to the simpler form $e_{SC}(k/K)^\delta$. By default, interAdapt sets $\delta$ to be $-0.5$, which corresponds to the efficacy boundaries of r citep(bibFile[["obrienfleming"]]).

In order to calculate $e_{SC}$, interAdapt makes use of the fact that the random vector of test statistics ($Z_{C,1},Z_{C,2},…Z_{C,K}$) converges asymptotically to a multivariate normal distribution with a known covariance structure r citep(bibFile[["JennisonTurnbullBook"]]). Using the mvtnorm package r citep(bibFile[["mvtnorm"]]) in R to evaluate the multivariate normal distribution function, interAdapt computes the proportionality constant $e_{SC}$ to ensure the probability of $Z_{C,k}$ exceeding $e_{SC}(N_{C,k}/N_{C,K})^{\delta}$ at one or more stages $k$ is less than or equal to $α$ at the global null hypothesis defined in the Hypotheses section.

In $SC$, as well as in $SS$ and $AD$, interAdapt uses non-binding futility boundaries. That is, the familywise Type I error rate is controlled at level α regardless of whether the futility boundaries are adhered to or ignored. The motivation is that regulatory agencies may prefer non-binding futility boundaries to ensure Type I error control even if a decision is made to continue the trial despite a futility boundary being crossed.

In calculations of power, expected sample size, and expected trial duration, interAdapt assumes futility boundaries are adhered to.

Futility boundaries for the first $K-1$ stages of $SC$ are set equal to $f_{SC}(N_{C,k}/N_{C,K})^{\delta}$, where $f_{SC}$ is a proportionality constant set by the user. By default, the constant $f_{SC}$ is set to be negative (so the trial is only stopped for futility if the z-statistic is below the corresponding negative threshold), although this is not required. In the $K ^{th}$ stage of the trial, interAdapt sets the futility boundary to be equal to the efficacy boundary. This ensures that the final z-statistic $Z_{C,K}$ crosses either the efficacy boundary or the futility boundary.

The decision boundaries for the design $SS$ are defined analogously as for the design $SC$, except using z-statistics $Z_{1,k}$. interAdapt makes the simplification that the number of participants $n_k$ enrolled in each stage $k$ of $SS$ is constant, denoted by $n_{SS}$, and set by the user. The efficacy boundary for the $k^{th}$ stage is set equal to $e_{SS}(N_{1,k}/N_{1,K})^{\delta}$, where $e_{SS}$ is the constant computed by interAdapt to ensure the Type I error rate is at most $\alpha$. The first $K-1$ futility boundaries for $H_{01}$ are set equal to $f_{SS}(N_{1,k}/N_{1,K})^{\delta}$, where $f_{SS}$ is a constant that can be set by the user. The futility boundary in stage $K$ is set equal to the final efficacy boundary in stage $K$.

Consider the adaptive design $AD$. interAdapt allows the user to a priori specify a final stage at which there will be a test of the null hypothesis for the combined population, denoted by stage $k^\star$. Regardless of the results at stage $k^\star$, $AD$ always stops enrolling from subpopulation $2$ at the end stage $k^\star$. This reduces the maximum sample size of $AD$ compared to allowing enrollment from both subpopulations through the end of the trial. The futility boundaries $l_{2,k}$ are not defined for $k>k^\star$, since subpopulation 2 is not enrolled after stage $k^\star$. The user may effectively turn off the option described in this paragraph by setting $k^\star=K$, the total number of stages; then the combined population may be enrolled throughout the trial.

For the $AD$ design, the user can specify the following two types of per-stage sample sizes: one for stages where both subpopulations are enrolled $(k \leq k^\star)$, and one for stages where only participants in subpopulation 1 are enrolled $(k > k^\star)$. We refer to these two sample sizes as $n^{(1)}$ and $n^{(2)}$, respectively.

Because $AD$ simultaneously tests $H_{0C}$ and $H_{01}$ it has two sets of decision boundaries. For the $k^{th}$ stage of $AD$, let $u_{C,k}$ and $u_{1,k}$ denote the efficacy boundaries for $H_{0C}$ and $H_{01}$, respectively. The boundaries $u_{C,k}$ are set equal to $e_{AD,C}(N_{C,k}/N_{C,K})^{\delta}$ for each $k\leq k^\star$; the boundaries $u_{1,k}$ are set equal to $e_{AD,1}(N_{1,k}/N_{1,K})^{\delta}$ for each $k \leq K$. The constants $e_{AD,C}$ and $e_{AD,1}$ are set such that the probability of rejecting one or more null hypotheses under the global null hypothesis is $\alpha$ (ignoring futility boundaries). It is proved by r citep(bibFile[["Rosenblum2013AdaptMISTIE"]]) that this strongly controls the familywise Type I error rate at level $\alpha$. The algorithm for computing the proportionality constants $e_{AD,C}, e_{AD,1}$ is described later in this section.

The boundaries for futility stopping of enrollment from certain population in the $AD$ design, at the end of stage $k$, are denoted by $l_{1,k}$ and $l_{2,k}$. These stopping boundaries are defined relative to the test statistics $Z_{1,k}$ and $Z_{2,k}$, respectively. The boundaries $l_{1,k}$ and $l_{2,k}$ are set equal to $f_{AD,1}(N_{1,k}/N_{1,K})^{\delta}$ (for $k\leq K$) and $f_{AD,2}(N_{2,k}/N_{2,K})^{\delta}$ (for $k < k^\star$), respectively, where $f_{AD,1}$ and $f_{AD,2}$ can be set by the user. In stage $k^\star$, the futility boundary $l_{2,k^\star}$ is set to ''Inf'' (indicating $\infty$), to reflect that we stop enrollment in subpopulation 2. At the end of each stage, $AD$ may decide to continue enrolling from the combined population, enroll only from subpopulation 1 for the remainder of the trial, or stop the trial entirely. Specific decision rules based on these boundaries for the z-statistics are described below.

As described in r citep(bibFile[["Rosenblum2013AdaptMISTIE"]]), the decision rule in $AD$ consists of the following steps carried out at the end of each stage $k$:

The motivation for Step 2 is that there is assumed to be prior evidence that if the treatment works, it will work for subpopulation 1. Therefore, if subpopulation 1 is stopped for futility, the whole trial is stopped. It is an area of future research to consider modifications to this rule, and to incorporate testing of a null hypothesis for only subpopulation 2.

A consequence of the rule in Step 3 is that Steps 1, 2, and 4 are only carried out for stages $k\leq k^\star$. This occurs since Step 3 restricts enrollment to subpopulation 1 if $Z_{2,k} ≤ l_{2,k}$ or $k\geq k^\star$, and if so runs Steps 3a--3c through the remainder of the trial.

We next describe the algorithm used by interAdapt to compute the proportionality constants $e_{AD,C}, e_{AD,1}$ that define the efficacy boundaries $u_{C,k},u_{1,k}$. These are selected to ensure the familywise Type I error rate is strongly controlled at level $\alpha$. By Theorem~5.1 of r citep(bibFile[["Rosenblum2013AdaptMISTIE"]]), to guarantee such strong control of the familywise Type I error rate, it suffices to set $u_{C,k},u_{1,k}$ such that the familywise Type I error rate is at most $\alpha$ at the global null hypothesis defined in the Hypotheses section. The algorithm takes as input the following, which are set by the user as described in the Basic Parameters section: the per-stage sample sizes $n^{(1)},n^{(2)}$, the study-wide (i.e., familywise) Type I error rate $\alpha$, and a value $a_c$ in the interval $[0,1]$. Roughly speaking, $a_c$ represents the fraction of the study-wide Type I error $\alpha$ initially allocated to testing $H_{0C}$, as described next.

The algorithm temporarily sets $e_{AD,1}= \infty$ (effectively ruling out rejection of $H_{01}$) and computes (via binary search) the smallest value $e_{AD,C}$ such the probability of rejecting $H_{0C}$ is $a_c α$ under the global null hypothesis defined in the Hypotheses section. This defines $e_{AD,C}$. Next, interAdapt computes the smallest constant $e_{AD,1}$ such that the probability of rejecting at least one null hypothesis under the global null hypothesis is at most $\alpha$. All of the above computations use the approximation, based on the multivariate central limit theorem, that the joint distribution of the z-statistics is multivariate normal with covariance matrix as given, e.g., by r citep(bibFile[[c("JennisonTurnbullBook","Rosenblum2013AdaptMISTIE")]]).


Inputs

Basic Parameters

Advanced Parameters


References

This report was created using the knitr R package r citep(bibFile[["knitr"]]), with citations created using the knitcitations R package r citep(bibFile[["knitcitations"]]).

## Print bibliography
bibliography()


Try the interAdapt package in your browser

Any scripts or data that you put into this service are public.

interAdapt documentation built on May 2, 2019, 7:31 a.m.