knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The UnplanSimon package serves three purposes:
Providing an Adaptive Threshold Simon Design (ATS Simon) method for Simon's two-stage design in oncology trials when the realized sample sizes in the $1^{st}$ and/or $2^{nd}$ stage(s) differ from the planned sample sizes. The proposed ATS Simon design tries to follow the sample sizes of the original design. To that end, it updates the original thresholds $(r_1, r)$ in the $1^{st}$ and/or $2^{nd}$ stage(s) so that the type I error rate constraint of the original planned design is still satisfied (note: power will decrease if the realized sample size is smaller than the planned one).
Providing an Adaptive Threshold and Sample Size Simon Design (ATSS Simon) method for Simon's two-stage design in oncology trials when the realized sample sizes in the $1^{st}$ and/or $2^{nd}$ stage(s) differ from the planned sample sizes. The proposed ATSS Simon method updates not only the original thresholds to $(r_{1}^{*}, r^{*})$ but also the original sample sizes to $(n_{1}^{*}, n^{*})$, so that both the type I error rate and the power requirements of the original planned design are satisfied (note: unlike the ATS Simon design, the sample size here is also updated to satisfy the original power). In addition, the ATSS Simon design satisfies the other criteria of the originally planned design, such as minimizing the average sample size under the null hypothesis $H_0$.
Providing comprehensive post-trial inference tools at the end of the trial for Simon's two-stage design when under- or over-enrollment occurs. This includes the computation of point estimates, confidence intervals, and p-values for the proposed ATS and ATSS Simon design methods.
This vignette introduces functionalities in {UnplanSimon} tailored for
designing single-arm clinical trials using the proposed method. Examples featuring
Simon's two-stage designs from {clinfun} demonstrate the application of these
functions within our package.
To begin, install and load {UnplanSimon} and {clinfun}:
library(clinfun)
library(UnplanSimon)
Simon's two-stage design is a statistical method used in clinical trials, particularly in Phase II studies, to evaluate the efficacy of a new treatment while minimizing the number of patients exposed to potentially ineffective treatments. It consists of two stages:
Stage 1:
- A predetermined number of patients ($n_1$) are enrolled and treated.
- If the number of responses (positive outcomes) in these patients exceeds a
certain threshold ($r_1$), the trial continues to the second stage.
- If the number of responses is equal to or below this threshold, the trial is stopped early
for futility, concluding that the treatment is ineffective.
Stage 2:
- An additional number of patients ($n_2$) are enrolled and treated.
- The total number of responses from both stages is then evaluated against a
final threshold ($r$).
- If the total number of responses exceeds the final threshold, the
treatment is considered promising for further study.
- If not, the treatment is considered ineffective.
Simon's two-stage design aims to reduce the number of patients receiving ineffective treatment while ensuring that promising treatments are identified efficiently. It balances the need for early stopping in case of futility with the need for sufficient data to make a reliable decision about the treatment's efficacy (Simon, 1989).
From the above description, when designing a trial using Simon's two-stage design, there are four key design parameters that need to be specified:
$r_1$: Threshold in stage 1
$r$: Threshold in stage 2
$n_1$: Number of patients in stage 1
$n$: Total number of patients in stages 1 and 2, i.e., $n = n_1 + n_2$
To determine the design parameters for Simon's two-stage design, the following are required:
$p_0$: Unacceptable efficacy rate
$p_1$: Desirable efficacy rate
$\alpha$: Type I error rate
$\beta$: Type II error rate
That is,
$H_0$: The true treatment response rate is less than or equal to some unacceptable level ($p \leq p_0$)
$H_1$: The true treatment response rate is greater than or equal to some desirable level ($p \geq p_1$)
One trial employed Simon's two-stage design, considering an overall response rate of 25% as unacceptable and 45% as desirable. The trial set the type I error rate at 10% and aimed for a target power of 90%, corresponding to a 10% type II error rate.
From the above, we have:
$p_0$: 25% overall response rate
$p_1$: 45% overall response rate
$\alpha$: 10% type I error rate
$\beta$: 10% type II error rate
Thus, the hypotheses used for testing are:
$H_0$: The true overall response rate is less than or equal to 25% ($p \leq 0.25$)
$H_1$: The true overall response rate is greater than or equal to 45% ($p \geq 0.45$)
Now, using the ph2simon() function from the {clinfun} R package, specify these
parameters and print the resulting object.
library(clinfun)
# Specify the parameters and constraints: p0, p1, type I error rate, type II error rate
trial = ph2simon(0.25, 0.45, 0.1, 0.1)
# Print
trial
This output provides the computed design parameters ($r_1$, $r$, $n_1$, $n$) as well as $EN(p_0)$ and $PET(p_0)$ for three types of design: optimal, minimax, and admissible.
As expected, the optimal design has the smallest expected sample size under the null hypothesis ($EN(p_0)$ = 28.36), whereas the minimax design has the smallest maximum sample size ($n$ = 39).
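The two operating characteristics reported for the optimal design can be checked by hand with base R. The sketch below assumes the optimal design is $(r_1, r, n_1, n) = (3, 14, 14, 44)$, the values passed to ATS_Design() later in this vignette; the variable names are illustrative only.

```r
# Optimal design assumed from the example: (r1, r, n1, n) = (3, 14, 14, 44)
r1 <- 3; n1 <- 14; n <- 44; p0 <- 0.25

# PET(p0): probability of stopping after stage 1, i.e. P(Y1 <= r1 | n1, p0)
pet_p0 <- pbinom(r1, n1, p0)

# EN(p0): stage 1 is always enrolled, stage 2 only when the trial continues
en_p0 <- n1 + (1 - pet_p0) * (n - n1)

round(c(PET = pet_p0, EN = en_p0), 3)
```

The resulting values can be compared with the $PET(p_0)$ and $EN(p_0)$ columns printed by ph2simon().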
In single arm phase II studies, under-enrollment or over-enrollment can be common issues when using Simon’s two-stage design. For rare diseases, it can be particularly challenging to recruit additional patients quickly in cases of under-enrollment. Therefore, we propose an Adaptive Threshold Simon design to assist clinical investigators in making decisions based on the actual sample size instead of the planned one while still adhering to the original design framework when under- or over-enrollment occurs.
A thorough literature review identified numerous clinical trials
utilizing Simon's two-stage designs, where the proportion of patients
classified as inevaluable for response surpassed thresholds of 20%, 30%, and even 40%.
The factors contributing to inevaluability included patients failing to complete
the requisite number of cycles for response evaluation, being deemed ineligible
upon central review, early withdrawal due to noncompliance, among other
considerations (Ji et al., 2022), leading to under-enrollment at the Go/NoGo decision-making
timepoint in studies. Over-enrollment in multi-site trials often occurs due
to variations in patient recruitment rates across different locations.
It is important to note that the deviations our methods address, such as under-enrollment and over-enrollment, are incidental or non-informative: they arise from unforeseen circumstances rather than from deliberate bias or systematic error.
We identify an updated $1^{st}$ stage threshold $r_1^{*}$ based on the actual $1^{st}$ stage sample size $n_{1}^{*}$ such that the updated design's probability of early termination (PET) under the null hypothesis is the closest to the original PET under the null hypothesis. $$ P\left(Y_{1} \leq r_1^{*} \mid n_{1}^{*}, p_{0}\right) \approx P\left(Y_{1} \leq r_1 \mid n_{1}, p_{0}\right)$$ The primary objective of Simon's two-stage design, as well as of our proposed ATS and ATSS Simon designs, in single-arm phase II studies is to identify and eliminate ineffective drugs as early as possible. Therefore, when the sample size deviates from the plan, the updated type I error rate ($\alpha$) of a new method, together with operating characteristics such as the probability of early termination, should still be close to the original one.
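The closest-PET rule can be illustrated with a few lines of base R. This is a minimal sketch, not the code inside ATS_Design(); it assumes the original optimal design $(r_1, n_1) = (3, 14)$ and a realized first-stage sample size of 11, and all variable names are illustrative.

```r
# Planned first stage and realized first-stage sample size (assumed values)
r1 <- 3; n1 <- 14; n1_star <- 11; p0 <- 0.25

# Original PET under H0
pet_orig <- pbinom(r1, n1, p0)

# PET under H0 for every candidate threshold with the realized sample size
candidates <- 0:n1_star
pet_new <- pbinom(candidates, n1_star, p0)

# Pick the candidate whose PET is closest to the original PET
r1_star <- candidates[which.min(abs(pet_new - pet_orig))]
pet_star <- pet_new[candidates == r1_star]

c(r1_star = r1_star, pet_orig = round(pet_orig, 3), pet_star = round(pet_star, 3))
```

For these inputs the selected threshold is 2, with a PET of about 0.455 against the original 0.521.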
In the field of group sequential design, the alpha-spending function is a natural way to reallocate the type I error rate ($\alpha$). We have adopted this method in our proposed ATS Simon's design.
Based on the identified $r_1^{*}$, we then define the alpha-spending function based on the actual total sample size ($n^{*}$). Specifically, we use the Lan-DeMets spending function (Lan and DeMets, 1983).
$$
\alpha(n^{*}) =
\begin{cases}
2-2 \Phi\left(\frac{z_{1-\alpha/2}}{(n^{*}/n)^{1/2}}\right) & \text{if } n^{*} \leq n \\
\alpha & \text{if } n^{*} > n
\end{cases}
$$
where $\alpha$ here is the type I error rate in the original Simon
two-stage design.
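The spending function is straightforward to evaluate with pnorm() and qnorm(). The helper name alpha_spend below is hypothetical and is not part of the package; the sketch only transcribes the piecewise formula above.

```r
# Updated type I error level alpha(n_star) for realized total sample size n_star,
# planned total sample size n, and original type I error rate alpha
alpha_spend <- function(n_star, n, alpha) {
  if (n_star > n) {
    return(alpha)  # over-enrollment: keep the original level
  }
  2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(n_star / n))
}

alpha_spend(41, 44, 0.1)  # under-enrollment: level below 0.1
alpha_spend(44, 44, 0.1)  # as planned: original level
alpha_spend(48, 44, 0.1)  # over-enrollment: original level
```

With a planned total of 44 and a realized total of 41, the updated level is about 0.088.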
We can see that if the actual total sample size $n^{*} > n$, the updated level $\alpha(n^{*})$ equals the original $\alpha$, and if $n^{*} < n$, $\alpha(n^{*})$ is smaller than $\alpha$ because of the smaller sample size. Therefore, the ATS Simon design controls the type I error rate at or below the original level. Based on the above, we identify the smallest integer $r^{*}$ as the threshold at the $2^{nd}$ stage such that $$ P\left(Y_{1} > r_1^{*}, Y > r^{*} \mid n_{1}^{*}, n^{*}, p_{0}\right) \approx \alpha(n^{*})$$
Here, $n_1^{*}$ denotes the actual sample size at the $1^{st}$ stage and $n^{*}$ denotes the actual total sample size of the two stages.
The rationale for identifying the smallest integer $r^{*}$ is that the probability $P\left(Y_{1} > r_1^{*}, Y > r^{*} \mid n_{1}^{*}, n^{*}, p\right)$ is a decreasing function of the stage 2 threshold $r^{*}$. So, if we find such an $r^{*}$ under $p_0$, then because $\alpha(n^{*}) \leq \alpha$, the type I error rate is well controlled. Since alpha and power move in the same direction (i.e., if alpha increases, power also increases), keeping alpha close to the original value keeps the power close to the original value as well. This means we can simultaneously maintain the original power or minimize the power loss. Conversely, if we selected a larger $r^{*}$ instead of the smallest one, we could not minimize the power loss.
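The search for $r^{*}$ can be sketched in base R. This is an illustration rather than the package's implementation: it reads the approximation above as "the smallest $r^{*}$ whose rejection probability under $p_0$ does not exceed $\alpha(n^{*})$", which is an assumption, and the helper names reject_prob and alpha_spend are hypothetical.

```r
# P(Y1 > r1, Y1 + Y2 > r) for first-stage size n1, total size n, response rate p
reject_prob <- function(r1, r, n1, n, p) {
  y1 <- (r1 + 1):n1
  sum(dbinom(y1, n1, p) * pbinom(r - y1, n - n1, p, lower.tail = FALSE))
}

# Lan-DeMets-type level for a realized total sample size n_star
alpha_spend <- function(n_star, n, alpha) {
  if (n_star > n) return(alpha)
  2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(n_star / n))
}

# Assumed setting: planned n = 44, realized (n1_star, n_star) = (11, 41), r1_star = 2
n1_star <- 11; n_star <- 41; r1_star <- 2; p0 <- 0.25
alpha_n <- alpha_spend(n_star, n = 44, alpha = 0.1)

# Rejection probability under H0 for every candidate stage 2 threshold
r_grid <- r1_star:n_star
type1 <- sapply(r_grid, function(r) reject_prob(r1_star, r, n1_star, n_star, p0))

# Smallest threshold whose type I error rate does not exceed alpha(n_star)
r_star <- r_grid[which(type1 <= alpha_n)[1]]
type1_star <- type1[r_grid == r_star]

c(r_star = r_star, type1 = round(type1_star, 3))
```

Because the rejection probability decreases as the threshold grows, scanning the grid from the smallest value and stopping at the first admissible threshold is sufficient.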
First, let's introduce the usage of ATS_Design(). Based on our understanding,
under-enrollment in the first stage is the more critical case, so we focus on this
scenario for illustration. In the optimal Simon's two-stage design
described in Example 1, the planned sample sizes for the $1^{st}$ and $2^{nd}$
stages are 14 and 30, respectively.
Suppose we currently have outcome data for only 11 patients in the $1^{st}$ stage
and assume the $2^{nd}$ stage sample size for evaluable patients remains as
planned. We can use ATS_Design() to obtain updated thresholds for the
interim analysis without waiting for the number of evaluable patients
to reach 14. This means the input n1_star is 11 and n_star is 41 (= 11 + 30,
instead of the original 14 + 30 as the total sample size), as the original optimal
design specifies a sample size of 30 for the second stage.
ATS_Design(n1=14,n=44,n1_star=11,n_star=41,r1=3,r=14,p0=0.25,p1=0.45,alpha=0.1)
The updated design parameters $(r_1^{*},r^{*})$ given by the ATS Simon design method are (2, 14). The output also includes $(n_1^{*},n^{*})$, which are simply the actual stage 1 sample size and the actual total sample size. The type I error rate of this updated design is 0.06, which is below the original type I error constraint of 0.1. As noted above, if alpha decreases, power also decreases. Therefore, the current power of 85.4% is lower than the original design's 90%, as expected, but still close to it. Additionally, the probability of early termination is 0.455, which is close to the original design's 0.521.
Now, suppose the number of patients responding to the new treatment in the first stage is > 2. In that case, we will make a “Go” decision and recruit an additional 30 patients for the second stage of the study.
We now consider an under-enrollment scenario in the $2^{nd}$ stage, in addition to the $1^{st}$ stage's under-enrollment.
For example, suppose the actual sample size in the $2^{nd}$ stage is 28 (the planned
one is 30), indicating under-enrollment. The total sample size is now
39, i.e., the input n_star = 39.
Using the realized sample sizes $(n_1^{*},n^{*})$ = (11, 39) together with the other original design parameters in the following code, we can determine the updated design parameters and operating characteristics.
ATS_Design(n1=14,n=44,n1_star=11,n_star=39,r1=3,r=14,p0=0.25,p1=0.45,alpha=0.1)
Our updated design parameters $(r_1^{*},r^{*})$ are now (2, 13). The output also includes $(n_1^{*},n^{*})$, which are simply the actual stage 1 sample size and the actual total sample size. The updated type I error rate is 0.077, which is below the constraint of 0.1.
As another example, suppose the actual sample size in the $2^{nd}$ stage is now
31, indicating over-enrollment relative to the planned 30. Therefore, the input n_star
would be 42.
ATS_Design(n1=14,n=44,n1_star=11,n_star=42,r1=3,r=14,p0=0.25,p1=0.45,alpha=0.1)
The updated design parameters $(r_1^{*},r^{*})$ are now (2, 14). The output also includes $(n_1^{*},n^{*})$, which are simply the actual stage 1 sample size and the actual total sample size. The type I error rate for our ATS Simon design is 0.071, which is below the constraint of 0.1.
Rather than offering the above ATS design by merely adjusting thresholds based on the realized sample size, we present another option that follows the original Simon's two-stage design algorithm: the Adaptive Threshold and Sample Size Simon (ATSS Simon) method. This approach extends Simon’s two-stage design by simultaneously adjusting both thresholds and sample size.
The overall strategy of this method is as follows:
$\bullet$ Scenario 1: When under-enrollment or over-enrollment occurs at the $1^{st}$ stage, we identify the design parameters $(r_{1}^{*}, r^{*}, n^{*})$ based on the actual sample size $n_{1}^{*}$ at the $1^{st}$ stage to satisfy the significance level $\alpha$ and power $1-\beta$: $$P\left(Y_{1} > r_1^{*}, Y > r^{*} \mid n_{1}^{*}, n^{*}, p_{0}\right) \leq \alpha$$ $$P\left(Y_{1} > r_1^{*}, Y > r^{*} \mid n_{1}^{*}, n^{*}, p_{1}\right) \geq 1-\beta$$ In addition, the design parameters $(r_{1}^{*}, r^{*}, n^{*})$ satisfy the same criterion as the optimal Simon's two-stage design: minimizing the average sample size under the null hypothesis.
$\bullet$ Scenario 2 ($n^{*}$ changes again): After a Go decision has been made, the realized total sample size $n^{**}$ may again differ from $n^{*}$, so a further adjustment of the threshold at the $2^{nd}$ stage is needed. We update the threshold $r^{*}$ again by identifying the smallest integer $r^{*}$, given the design parameters $(r_{1}^{*}, n_{1}^{*}, n^{**})$, that satisfies $$P\left(Y_{1} > r_1^{*}, Y > r^{*} \mid n_{1}^{*}, n^{**}, p_{0}\right) \leq \alpha $$ Here, $n_1^{*}$ denotes the actual sample size at the $1^{st}$ stage; $n^{*}$ denotes the total sample size of the two stages determined after the interim analysis based on $n_1^{*}$; and $n^{**}$ denotes the actual total sample size of the two stages if $n^{*}$ is again updated.
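Scenario 1 can be illustrated with a brute-force search in base R. This is a rough sketch under stated assumptions, not the algorithm implemented in ATSS_Design_Stage1(): the helper names reject_prob and atss_stage1_sketch, the cap nmax = 100, and the search order are all hypothetical choices made for illustration.

```r
# P(Y1 > r1, Y1 + Y2 > r) for first-stage size n1, total size n, response rate p
reject_prob <- function(r1, r, n1, n, p) {
  y1 <- (r1 + 1):n1
  sum(dbinom(y1, n1, p) * pbinom(r - y1, n - n1, p, lower.tail = FALSE))
}

# For a fixed realized n1_star, search (r1, r, n) meeting the alpha and power
# constraints and minimizing the expected sample size under p0
atss_stage1_sketch <- function(p0, p1, n1_star, alpha, beta, nmax = 100) {
  best <- NULL
  for (r1 in 0:(n1_star - 1)) {
    # Power can never exceed P(Y1 > r1 | p1), so skip hopeless thresholds
    if (pbinom(r1, n1_star, p1, lower.tail = FALSE) < 1 - beta) next
    pet0 <- pbinom(r1, n1_star, p0)
    for (n in (n1_star + 1):nmax) {
      r_grid <- r1:n
      a <- sapply(r_grid, function(r) reject_prob(r1, r, n1_star, n, p0))
      idx <- which(a <= alpha)[1]          # smallest r controlling type I error
      r <- r_grid[idx]
      pw <- reject_prob(r1, r, n1_star, n, p1)
      if (pw >= 1 - beta) {
        en0 <- n1_star + (1 - pet0) * (n - n1_star)
        if (is.null(best) || en0 < best$EN0) {
          best <- list(r1 = r1, r = r, n1 = n1_star, n = n,
                       type1 = a[idx], power = pw, EN0 = en0)
        }
        break  # for this r1, any larger n only increases EN0
      }
    }
  }
  best
}

best <- atss_stage1_sketch(p0 = 0.25, p1 = 0.45, n1_star = 11, alpha = 0.1, beta = 0.1)
str(best)
```

For a given $r_1^{*}$, the smallest admissible stage 2 threshold has the highest power, so only that threshold needs to be checked at each candidate total sample size. The result can be compared with the output of ATSS_Design_Stage1() for the same inputs.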
In the optimal Simon's two-stage design described in Example 1, the planned sample sizes for the $1^{st}$ and $2^{nd}$ stages are 14 and 30, respectively.
Suppose we currently have outcome data for only 11 patients in the $1^{st}$
stage and assume the $2^{nd}$ stage sample size remains as planned.
We can use ATSS_Design_Stage1() to implement the ATSS Simon algorithm introduced above.
Here, the parameter n1_star is 11.
Note: Unlike in the previous ATS examples, it is unnecessary to input n_star
as a parameter, since the ATSS method computes an updated total sample size
based on the deviation of the stage 1 sample size from the original design.
ATSS_Design_Stage1(p0=0.25,p1=0.45,n1_star=11,alpha=0.1,beta=0.1)
The updated design parameters $(r_1^{*},r^{*},n_1^{*},n^{*})$ are (2, 15, 11, 47). The type I error rate for our redesign is 0.09, which is controlled below 0.1. The updated power for our redesigned study is 0.901, which meets the constraint of 0.9. This is achieved because the ATSS Simon design method now provides a larger total sample size of 47, compared with 44 in the original design.
If more than 2 patients respond to the new treatment in the first stage of our
redesigned study, we will proceed to recruit 36 additional patients
(47 total minus the 11 from the $1^{st}$ stage). However, if the actual sample
size in the second stage turns out to be 34, indicating under-enrollment, we can use the
function ATSS_Design_Stage2() to update the design parameters based on the
realized total sample size. This adjustment means the parameter n_double_star
is now 45.
Note: In the following code, n1_star and n_double_star are fixed since
the trial is complete. Essentially, the ATSS algorithm only updates the threshold
for stage 2, r_star. In this example, however, r_star remains unchanged
compared to Example 3.1.
ATSS_Design_Stage2(p0=0.25,p1=0.45,r1_star=2,n1_star=11,n_double_star=45,alpha=0.1)
Our updated design parameters $(r_1^{*},r^{*},n_1^{*},n^{**})$ are (2, 15, 11, 45).
The updated type I error rate is 0.066, smaller than the original constraint of 0.1,
and the updated power is 0.878. Both changes are due to the smaller realized total sample
size, although in this example the updated r_star is still 15.
Suppose more than 2 patients respond to the new treatment at the end of the
$1^{st}$ stage of our redesigned study. In that case, we will make a “Go”
decision to recruit 36 more patients into our study. However, if the actual
sample size in the $2^{nd}$ stage is 37, indicating over-enrollment, we can
use the function ATSS_Design_Stage2() to update the design parameters
based on the actual total sample size. In the following code, this means the
parameter n_double_star is now 48.
ATSS_Design_Stage2(p0=0.25,p1=0.45,r1_star=2,n1_star=11,n_double_star=48,alpha=0.1)
Our updated design parameters $(r_1^{*},r^{*},n_1^{*},n^{**})$ are (2, 16, 11, 48). The updated type I error rate is 0.061 and the updated power is 0.884.
Note: Because the current total sample size is larger than in Example 3.2
(48 vs. 45), we would expect both the power and the type I error rate to be larger
than those in Example 3.2. The power behaves as expected (0.884 vs. 0.878), but the type I
error rate does not (0.061 vs. 0.066). This is because the threshold found at stage 2,
r_star, is 16, compared to 15 in Example 3.2. Our algorithm must control
the type I error rate at 0.1; as the search results below show, a threshold of 15 would give
a type I error rate of 0.104, so 16 is the smallest admissible threshold.
## Our algorithm searching process
result <- data.frame(
  r_star = round(c(2.0000000, 3.0000000, 4.0000000, 5.0000000, 6.0000000,
                   7.0000000, 8.0000000, 9.0000000, 10.0000000, 11.0000000,
                   12.0000000, 13.0000000, 14.0000000, 15.0000000, 16.0000000)),
  alpha = round(c(0.5447991, 0.5447929, 0.5447130, 0.5442052, 0.5421068,
                  0.5357601, 0.5207790, 0.4920445, 0.4460005, 0.3831060,
                  0.3087428, 0.2317242, 0.1611787, 0.1035884, 0.0614173), 3),
  power = round(c(0.9347765, 0.9347765, 0.9347765, 0.9347765, 0.9347763,
                  0.9347752, 0.9347684, 0.9347366, 0.9346115, 0.9341919,
                  0.9329744, 0.9298792, 0.9229204, 0.9089765, 0.8839142), 3)
)
print(result)
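The values in the search table above are hard-coded. Assuming the rejection probability is $P(Y_1 > r_1^{*}, Y > r \mid n_1^{*}, n^{**}, p)$ as in Scenario 2, they can be recomputed with base R. The helper name reject_prob is hypothetical, and this sketch is not the code inside ATSS_Design_Stage2().

```r
# P(Y1 > r1, Y1 + Y2 > r) for first-stage size n1, total size n, response rate p
reject_prob <- function(r1, r, n1, n, p) {
  y1 <- (r1 + 1):n1
  sum(dbinom(y1, n1, p) * pbinom(r - y1, n - n1, p, lower.tail = FALSE))
}

# Setting of the over-enrollment example: r1_star = 2, n1_star = 11, n_double_star = 48
r1_star <- 2; n1_star <- 11; n_double_star <- 48
p0 <- 0.25; p1 <- 0.45; alpha <- 0.1

r_grid <- 2:16
recomputed <- data.frame(
  r_star = r_grid,
  alpha = sapply(r_grid, function(r) reject_prob(r1_star, r, n1_star, n_double_star, p0)),
  power = sapply(r_grid, function(r) reject_prob(r1_star, r, n1_star, n_double_star, p1))
)

# Smallest stage 2 threshold whose type I error rate does not exceed alpha
r_pick <- recomputed$r_star[which(recomputed$alpha <= alpha)[1]]

print(round(recomputed, 3))
r_pick
```

A threshold of 15 gives a type I error rate just above 0.1, so the first admissible threshold is 16.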
Both the proposed ATS and ATSS Simon design methods can address under- and over-enrollment. The ATS algorithm updates the thresholds based on the actual sample sizes, maintaining control over the type I error rate $\alpha$ but potentially resulting in lower power than the original design during under-enrollment. In contrast, the ATSS algorithm updates both the thresholds and sample sizes, thereby controlling the $\alpha$ while maintaining power similar to the original design, though it may require a larger sample size.
Key design details of two-stage single-arm trials are frequently left unreported, and their statistical inference is often not conducted in a way that mitigates the bias introduced by interim analyses. This issue is particularly concerning given the increasing reliance on non-randomized trials, which now represent a significant portion of the evidence on treatment effectiveness in rare, biomarker-defined patient subgroups (Grayling and Mander, 2021). Motivated by this problem, we adapt some widely used post-trial inference methods to ATS and ATSS Simon designs with under- and over-enrollment.
When a multistage trial has ended, we also want to estimate the true response
probability $\pi_{p}$ of the new therapy. The most commonly used estimator is
the sample response rate, i.e., the maximum likelihood estimator (MLE):
$$
\hat{\pi}_{p} = \frac{S}{N}
$$
However, in multi-stage designs like Simon's Two-Stage design, we observe only
extreme cases by crossing the threshold in the first stage, and hence the MLE
is biased. In other words, the MLE is biased due to the sequential nature of
the trial. This is known as the optional sampling effect. Here, we adopt Jung's
method for the estimation of the binomial probability in multistage clinical
trials (Jung and Kim, 2004). Based on the Rao-Blackwell theorem, Jung and Kim derived the
uniformly minimum variance unbiased estimator (UMVUE) as the conditional
expectation of an unbiased estimator, which in this case is simply the maximum
likelihood estimator based only on the first stage data, given the sufficient
statistic. Let $M$ denote the stopping stage ($2^{nd}$ stage in our context) and
let $S=S_M$ denote the total number of responders accumulated up to the stopping
stage. For observation $(m,s)$, the UMVUE of the response rate $p$ is given by:
$$
\hat{\pi}_{p} =
\begin{cases}
\frac{S}{n_1} & \text{if } m = 1 \\
\frac{\sum_{x_{1}=\left(r_{1}+1\right) \vee\left(S-n_{2}\right)}^{S \wedge n_{1}}
{n_1-1 \choose x_1-1}
{n_2 \choose S-x_1}}{
\sum_{x_{1}=\left(r_{1}+1\right) \vee\left(S-n_{2}\right)}^{S \wedge n_{1}}
{n_1 \choose x_1}
{n_2 \choose S-x_1}}
& \text{if } m = 2
\end{cases}
$$
At stage $m$, we may accrue slightly more (or possibly fewer) patients than
the planned sample size, as discussed previously, especially in multicenter
trials. The UMVUE still provides an unbiased estimator when the conditional
expectation is computed using the realized sample sizes.
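The UMVUE can be computed directly with choose(). The sketch below follows the description above, taking the conditional expectation of the first-stage response rate $X_1/n_1$ given the total number of responders, as in Jung and Kim (2004). The function name umvue_sketch is hypothetical and this is not the code inside SimonAnalysis().

```r
# UMVUE of the response rate for stopping stage m and total responders s,
# with realized stage sizes n1, n2 and first-stage threshold r1
umvue_sketch <- function(m, s, n1, n2, r1) {
  if (m == 1) {
    return(s / n1)  # stopped after stage 1: first-stage sample proportion
  }
  # First-stage counts compatible with continuing and with s total responders
  x1 <- max(r1 + 1, s - n2):min(s, n1)
  num <- sum(choose(n1 - 1, x1 - 1) * choose(n2, s - x1))
  den <- sum(choose(n1, x1) * choose(n2, s - x1))
  num / den
}

# Setting used later in this vignette: (r1, n1, n2) = (2, 11, 30), 20 responders in total
est <- umvue_sketch(m = 2, s = 20, n1 = 11, n2 = 30, r1 = 2)
c(MLE = 20 / 41, UMVUE = est)
```

For this setting the estimate is about 0.494, slightly above the sample proportion 20/41.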
A conventional approach for constructing confidence intervals is to use the Clopper-Pearson exact confidence interval, disregarding the group sequential nature of the trial (Clopper and Pearson, 1934). $$P(Y \geq y \mid p_{L}) = \sum_{k=y}^{n}{n \choose k}p_{L}^{k}(1-p_{L})^{n-k}=\frac{\alpha}{2}$$ $$P(Y \leq y \mid p_{U}) = \sum_{k=0}^{y}{n \choose k}p_{U}^{k}(1-p_{U})^{n-k}=\frac{\alpha}{2}$$ We now focus on constructing confidence intervals that account for deviations of the sample sizes from the planned ones in the $1^{st}$ and/or $2^{nd}$ stages. Jung and Kim (2004) proposed a method that applies in particular when the treatment successfully proceeds to the second stage. Let $M$ denote the stage at which a trial is terminated, and $S$ denote the number of responders at stage $M$. This method constructs confidence intervals based on the stochastic ordering of the distribution of $(M,S)$ with respect to the response rate $p$.
$$P(\hat{p_{u}}(M,S) \geq \hat{p_{u}}(m,s) \mid p_{L}) = \frac{\alpha}{2}$$ $$P(\hat{p_{u}}(M,S) \leq \hat{p_{u}}(m,s) \mid p_{U}) = \frac{\alpha}{2}$$ However, the Clopper-Pearson confidence interval is known to be conservative, with the actual confidence level being bounded below by $(1-\alpha)$. Jung’s method also inherits this conservatism (Porcher et al., 2012).
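For reference, the Clopper-Pearson limits defined by the two tail equations above have a closed form in terms of beta quantiles. The sketch below is a generic illustration with a hypothetical function name, cp_interval; how SimonAnalysis() maps its alpha and quantile arguments onto these limits is not covered here.

```r
# Two-sided Clopper-Pearson interval with alpha / 2 in each tail,
# for y responders out of n patients
cp_interval <- function(y, n, alpha) {
  lower <- if (y == 0) 0 else qbeta(alpha / 2, y, n - y + 1)
  upper <- if (y == n) 1 else qbeta(1 - alpha / 2, y + 1, n - y)
  c(lower = lower, upper = upper)
}

ci <- cp_interval(y = 20, n = 41, alpha = 0.05)
ci

# The limits solve the two tail equations:
pbinom(19, 41, ci["lower"], lower.tail = FALSE)  # P(Y >= 20 | p_L) = alpha / 2
pbinom(20, 41, ci["upper"])                      # P(Y <= 20 | p_U) = alpha / 2
```

The same interval is returned by binom.test(20, 41)$conf.int in base R.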
To correct for this conservative nature, Porcher and Desseaux extended Jung's confidence
interval with a mid-$p$ approach (Porcher and Desseaux, 2012). This method is our
recommended approach, although all of the above methods are also available
in the {UnplanSimon} R package.
$$P(\hat{p_{u}}(M,S) > \hat{p_{u}}(m,s) \mid p_{L}) + \frac{1}{2}P(\hat{p_{u}}(M,S) = \hat{p_{u}}(m,s) \mid p_{L})
= \frac{\alpha}{2}$$
$$P(\hat{p_{u}}(M,S) < \hat{p_{u}}(m,s) \mid p_{U}) + \frac{1}{2}P(\hat{p_{u}}(M,S) = \hat{p_{u}}(m,s) \mid p_{U})
= \frac{\alpha}{2}$$
Kunzmann (2024) raised the question of whether a frequentist inferential framework for two-stage designs yields a consistent test decision across $p$-values, point estimates, and confidence intervals. In practice, consistency among these test decisions is often presumed by the non-statistical readership of published trial results, and ambiguous situations can be avoided by using a consistent framework.
So here, we use the $p$-value defined as the probability of obtaining estimates
more extreme toward $H_1$ than the observed one when $H_0$ is true, based
on the UMVUE ordering (Jung et al., 2006). Hence, for testing $H_0:p=p_0$ against
$H_1:p=p_1$ $(p_0<p_1)$, the $p$-value for an estimate $\hat{p_u}(m,s)$ is
given as
$$
p_s =
\begin{cases}
1-\sum_{(i,j):\hat{p_u}(i,j) < \hat{p_u}(m,s)}f_{p_0}(i,j) & \text{if } m = 1 \\
\sum_{(i,j):\hat{p_u}(i,j)\geq \hat{p_u}(m,s)}f_{p_0}(i,j) & \text{if } m = 2
\end{cases}
$$
It can be rewritten as
$$
p_s =
\begin{cases}
P(Y_1 \geq s \mid p_0) & \text{if } m = 1 \\
\sum_{y_1=r_1+1}^{n_1} P(Y_1 =y_1 \mid p_0) P(Y_2 \geq s-y_1 \mid p_0)
& \text{if } m = 2
\end{cases}
$$
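The rewritten form of the $p$-value maps directly onto dbinom() and pbinom(). The sketch below treats the second-stage tail as $P(Y_2 \geq s - y_1 \mid p_0)$, i.e. outcomes at least as extreme toward $H_1$ as the observed one; the function name simon_pvalue_sketch is hypothetical and this is not the code inside SimonAnalysis().

```r
# p-value for stopping stage m and total responders s,
# with realized stage sizes n1, n2, first-stage threshold r1, and null rate p0
simon_pvalue_sketch <- function(m, s, n1, n2, r1, p0) {
  if (m == 1) {
    # P(Y1 >= s | p0)
    return(pbinom(s - 1, n1, p0, lower.tail = FALSE))
  }
  # Sum over first-stage outcomes that continue to stage 2
  y1 <- (r1 + 1):n1
  # P(Y2 >= s - y1 | p0) equals P(Y2 > s - y1 - 1 | p0)
  sum(dbinom(y1, n1, p0) * pbinom(s - y1 - 1, n2, p0, lower.tail = FALSE))
}

# Setting used later in this vignette: (r1, n1, n2) = (2, 11, 30), 20 responders, p0 = 0.25
p_val <- simon_pvalue_sketch(m = 2, s = 20, n1 = 11, n2 = 30, r1 = 2, p0 = 0.25)
round(p_val, 3)
```

For this setting the $p$-value rounds to 0.001.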
Suppose we use the ATS Simon design method to address the problem
of under-enrollment. The original Simon's two-stage design is $(r_1,r,n_1,n) = (3,14,14,44)$.
Considering under-enrollment only at the $1^{st}$ stage, as in Example 2.1,
the new design is $(r_1^{*},r^{*},n_1^{*},n^{*}) = (2,14,11,41)$. Note in particular
that the input alpha here is the updated type I error
constraint $\alpha(n^{*})$ when the ATS Simon design is used. In Example 2.1, the
updated type I error constraint $\alpha(n^{*})$ is 0.088, so the input
alpha here is 0.088. If the new treatment successfully enters the second stage
and 20 patients respond to it, the parameter m is 2 and s is 20
in this context.
We now compute the point estimate (UMVUE), confidence interval, and $p$-value
based on the above setting. Note that the option "CP" means the Clopper-Pearson
exact confidence interval, the option "Jung" means Jung’s confidence interval, and
the option "MIDp" means the mid-$p$ approach confidence interval.
SimonAnalysis(m=2, s=20, n1=11, n2=30, r1=2, r=14, alpha=0.088, quantile=c(0.025,0.975), CI_option = "CP", p0=0.25)
SimonAnalysis(m=2, s=20, n1=11, n2=30, r1=2, r=14, alpha=0.088, quantile=c(0.025,0.975), CI_option = "Jung", p0=0.25)
SimonAnalysis(m=2, s=20, n1=11, n2=30, r1=2, r=14, alpha=0.088, quantile=c(0.025,0.975), CI_option = "MIDp", p0=0.25)
Note: the computed UMVUEs and $p$-values are the same across the three calls, since the
same UMVUE method and the same $p$-value approach introduced above are used regardless
of how the CI is computed. Again, the input alpha here is the updated type I
error constraint $\alpha(n^{*})$ because we used the ATS Simon design.
We can see that the point estimate (UMVUE) for the response rate is 0.494; the Clopper-Pearson exact confidence interval is (0.347, 0.630), Jung’s confidence interval is (0.329, 0.629), and the mid-$p$ approach confidence interval is (0.339, 0.641). The $p$-value is 0.001, which is smaller than the updated type I error constraint $\alpha(n^{*})$ of 0.088. So we can reject the null hypothesis $H_0$: The true treatment response rate is less than or equal to the unacceptable level ($p \leq p_0=0.25$).
Suppose we use the ATSS Simon design method to address the problem
of under-enrollment. The original Simon's two-stage design is $(r_1,r,n_1,n) = (3,14,14,44)$.
Considering under-enrollment only at the $1^{st}$ stage, as in
Example 3.1, the new design is $(r_1^{*},r^{*},n_1^{*},n^{*}) = (2,15,11,47)$.
If the new treatment successfully enters the second stage and 22 patients respond
to it, the parameter m is 2 and s is 22 in this context.
We now compute the point estimate (UMVUE), confidence interval, and $p$-value
based on the above setting. Note that the option "CP" means the Clopper-Pearson
exact confidence interval, the option "Jung" means Jung’s confidence interval, and
the option "MIDp" means the mid-$p$ approach confidence interval.
SimonAnalysis(m=2, s=22, n1=11, n2=36, r1=2, r=15, alpha=0.1, quantile=c(0.025,0.975), CI_option = "CP", p0=0.25)
SimonAnalysis(m=2, s=22, n1=11, n2=36, r1=2, r=15, alpha=0.1, quantile=c(0.025,0.975), CI_option = "Jung", p0=0.25)
SimonAnalysis(m=2, s=22, n1=11, n2=36, r1=2, r=15, alpha=0.1, quantile=c(0.025,0.975), CI_option = "MIDp", p0=0.25)
We can see that the point estimate (UMVUE) for the response rate is 0.478; the Clopper-Pearson exact confidence interval is (0.342, 0.597), Jung’s confidence interval is (0.322, 0.604), and the mid-$p$ approach confidence interval is (0.330, 0.615). The $p$-value is 0.001, which is smaller than the type I error constraint $\alpha$ of 0.1. So we can reject the null hypothesis $H_0$: The true treatment response rate is less than or equal to the unacceptable level ($p \leq p_0=0.25$).
Simon, R. (1989). Optimal two-stage designs for phase II clinical trials. Controlled clinical trials, 10(1), 1-10.
Ji, L., Whangbo, J., Levine, J. E., & Alonzo, T. A. (2022). Inefficiency of two-stage designs in phase II oncology clinical trials with high proportion of inevaluable patients. Contemporary clinical trials, 120, 106849.
Lan, K. K. G., & DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70(3), 659-663.
Grayling, M. J., & Mander, A. P. (2021). Two-stage single-arm trials are rarely analyzed effectively or reported adequately. JCO Precision Oncology, 5, 1813-1820. https://doi.org/10.1200/PO.21.00276
Jung, S. H., & Kim, K. M. (2004). On the estimation of the binomial probability in multistage clinical trials. Statistics in medicine, 23(6), 881-896. https://doi.org/10.1002/sim.1653
Clopper, C. J., & Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26(4), 404-413. https://doi.org/10.2307/2331986
Porcher, R., & Desseaux, K. (2012). What inference for two-stage phase II trials?. BMC medical research methodology, 12, 1-13. https://doi.org/10.1186/1471-2288-12-117
Kunzmann, K. (2024). Optimal Adaptive Designs for Early Phase II Trials in Clinical Oncology (Doctoral dissertation). https://archiv.ub.uni-heidelberg.de/volltextserver/34225/
Jung, S. H., Owzar, K., George, S. L., & Lee, T. (2006). P-value calculation for multistage phase II cancer clinical trials. Journal of Biopharmaceutical Statistics, 16(6), 765-775. https://doi.org/10.1080/10543400600825645