```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4.5,
  fig.path = "figures/single-bin-"
)
library(BayesianQDM)
```
We use a hypothetical Phase IIa proof-of-concept (PoC) trial in moderate-to-severe rheumatoid arthritis (RA). The investigational drug is compared with placebo in a 1:1 randomised controlled trial with $n_t = n_c = 12$ patients per group.
Endpoint: ACR20 response rate (proportion of patients achieving $\ge 20\%$ improvement in ACR criteria).
Clinically meaningful thresholds (posterior probability): the target value (TV) and the minimum acceptable value (MAV), set to $\theta_{\mathrm{TV}} = 0.20$ and $\theta_{\mathrm{MAV}} = 0.05$ in the examples below.
Null hypothesis threshold (predictive probability): $\theta_{\mathrm{NULL}} = 0.10$.
Decision thresholds: $\gamma_{\mathrm{go}} = 0.80$ (Go), $\gamma_{\mathrm{nogo}} = 0.20$ (NoGo).
Observed data (used in Sections 2–4): $y_t = 8$ responders out of $n_t = 12$ (treatment); $y_c = 3$ responders out of $n_c = 12$ (control).
Let $\pi_j$ denote the true response rate in group $j$ ($j = t$: treatment, $j = c$: control). We model the number of responders as
$$y_j \mid \pi_j \;\sim\; \mathrm{Binomial}(n_j,\, \pi_j),$$
and place independent Beta priors on each rate:
$$\pi_j \;\sim\; \mathrm{Beta}(a_j,\, b_j),$$
where $a_j > 0$ and $b_j > 0$ are the shape parameters.
The Jeffreys prior corresponds to $a_j = b_j = 0.5$, giving a weakly informative, symmetric prior on $(0, 1)$. The uniform prior corresponds to $a_j = b_j = 1$.
By conjugacy of the Beta-Binomial model, the posterior after observing $y_j$ responders among $n_j$ patients is
$$\pi_j \mid y_j \;\sim\; \mathrm{Beta}(a_j^*,\, b_j^*),$$
where the posterior shape parameters are
$$a_j^* = a_j + y_j, \qquad b_j^* = b_j + n_j - y_j.$$
The posterior mean and variance are
$$E[\pi_j \mid y_j] = \frac{a_j^*}{a_j^* + b_j^*}, \qquad \mathrm{Var}(\pi_j \mid y_j) = \frac{a_j^*\, b_j^*}{(a_j^* + b_j^*)^2\,(a_j^* + b_j^* + 1)}.$$
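As a quick check, the conjugate update for the observed data above (Jeffreys prior, $a_j = b_j = 0.5$) can be done by hand in base R; this is plain arithmetic, not a call into the package:

```r
# Conjugate Beta-Binomial update under the Jeffreys prior a = b = 0.5
a <- 0.5; b <- 0.5
n_t <- 12; y_t <- 8   # treatment: 8/12 responders
n_c <- 12; y_c <- 3   # control:   3/12 responders

a_t_star <- a + y_t        # 8.5
b_t_star <- b + n_t - y_t  # 4.5
a_c_star <- a + y_c        # 3.5
b_c_star <- b + n_c - y_c  # 9.5

# Posterior mean and variance for the treatment arm
post_mean_t <- a_t_star / (a_t_star + b_t_star)
post_var_t  <- a_t_star * b_t_star /
  ((a_t_star + b_t_star)^2 * (a_t_star + b_t_star + 1))
round(c(mean = post_mean_t, var = post_var_t), 4)  # mean = 0.6538
```

The treatment posterior is $\mathrm{Beta}(8.5, 4.5)$ and the control posterior is $\mathrm{Beta}(3.5, 9.5)$; these values are reused in the examples below.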
The treatment effect is $\theta = \pi_t - \pi_c$. Since the two groups are independent, $\theta \mid y_t, y_c$ follows the distribution of the difference of two independent Beta random variables:
$$\pi_t \mid y_t \;\sim\; \mathrm{Beta}(a_t^*,\, b_t^*), \qquad \pi_c \mid y_c \;\sim\; \mathrm{Beta}(a_c^*,\, b_c^*).$$
There is no closed-form CDF for $\theta = \pi_t - \pi_c$, so the posterior probability
$$P(\theta > \theta_0 \mid y_t, y_c) = P(\pi_t - \pi_c > \theta_0 \mid y_t, y_c)$$
is evaluated by the convolution integral implemented in pbetadiff():
$$P(\pi_t - \pi_c > \theta_0) = \int_0^1 F_{\mathrm{Beta}(a_c^*, b_c^*)}\!\left(x - \theta_0\right)\, f_{\mathrm{Beta}(a_t^*, b_t^*)}\!(x)\; dx,$$
where $F_{\mathrm{Beta}(\alpha,\beta)}$ is the Beta CDF and
$f_{\mathrm{Beta}(\alpha,\beta)}$ is the Beta PDF.
This one-dimensional integral is evaluated by adaptive Gauss-Kronrod
quadrature via stats::integrate().
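The integral is easy to reproduce in a few lines of base R. The sketch below re-implements the formula for illustration (it is not the package's pbetadiff() code), using the Jeffreys-prior posteriors implied by the observed data ($\mathrm{Beta}(8.5, 4.5)$ for treatment, $\mathrm{Beta}(3.5, 9.5)$ for control):

```r
# P(pi_t - pi_c > theta0) via the one-dimensional convolution integral:
# integrate the control CDF shifted by theta0 against the treatment PDF
p_diff_gt <- function(theta0, at, bt, ac, bc) {
  integrand <- function(x) pbeta(x - theta0, ac, bc) * dbeta(x, at, bt)
  stats::integrate(integrand, lower = 0, upper = 1)$value
}

# P(pi_t - pi_c > 0.20 | y_t = 8, y_c = 3) under Jeffreys priors
p_tv <- p_diff_gt(0.20, at = 8.5, bt = 4.5, ac = 3.5, bc = 9.5)
round(p_tv, 4)
```

A Monte Carlo draw of the two Beta posteriors gives the same probability up to simulation error, which is a useful cross-check on the quadrature.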
Let $\tilde{y}_j \mid \pi_j \sim \mathrm{Binomial}(m_j, \pi_j)$ be future trial counts with $m_j$ future patients. Integrating out $\pi_j$ over its Beta posterior gives the Beta-Binomial predictive distribution:
$$P(\tilde{y}_j = k \mid y_j) = \binom{m_j}{k} \frac{B(a_j^* + k,\; b_j^* + m_j - k)}{B(a_j^*,\; b_j^*)}, \quad k = 0, 1, \ldots, m_j,$$
where $B(\cdot, \cdot)$ is the Beta function.
The posterior predictive probability that the future treatment difference exceeds $\theta_{\mathrm{NULL}}$ is
$$P\!\left(\frac{\tilde{y}_t}{m_t} - \frac{\tilde{y}_c}{m_c} > \theta_{\mathrm{NULL}} \;\Big|\; y_t, y_c\right) = \sum_{k_t=0}^{m_t} \sum_{k_c=0}^{m_c} \mathbf{1}\!\left[\frac{k_t}{m_t} - \frac{k_c}{m_c} > \theta_{\mathrm{NULL}}\right]\, P(\tilde{y}_t = k_t \mid y_t)\, P(\tilde{y}_c = k_c \mid y_c).$$
This double sum over all $(m_t + 1)(m_c + 1)$ outcome combinations is
evaluated exactly by pbetabinomdiff().
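The pmf and the double sum can be sketched directly in base R (a re-implementation of the formulas above for illustration, not the package's pbetabinomdiff() code), here with the Jeffreys-prior posteriors from the observed data and the future sample sizes $m_t = m_c = 40$ used later in this vignette:

```r
# Beta-Binomial predictive pmf: Binomial likelihood integrated over the
# Beta posterior, in closed form via the Beta function B(., .)
dbetabinom <- function(k, m, a, b) {
  choose(m, k) * beta(a + k, b + m - k) / beta(a, b)
}

m_t <- 40; m_c <- 40
# Posteriors under Jeffreys priors: Beta(8.5, 4.5) and Beta(3.5, 9.5)
pmf_t <- dbetabinom(0:m_t, m_t, 8.5, 4.5)
pmf_c <- dbetabinom(0:m_c, m_c, 3.5, 9.5)

# Exact double sum over all (m_t + 1) * (m_c + 1) outcome pairs
grid  <- expand.grid(k_t = 0:m_t, k_c = 0:m_c)
ind   <- with(grid, k_t / m_t - k_c / m_c > 0.10)  # theta_NULL = 0.10
p_pred <- sum(pmf_t[grid$k_t + 1] * pmf_c[grid$k_c + 1] * ind)
round(p_pred, 4)
```

Each pmf sums to one, so the double sum is an exact expectation of the indicator rather than an approximation.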
Standard parallel-group RCT with observed data from both groups. The posterior shape parameters for group $j$ are
$$a_j^* = a_j + y_j, \qquad b_j^* = b_j + n_j - y_j,$$
and the posterior probability is $P(\pi_t - \pi_c > \theta_0 \mid y_t, y_c)$
computed via pbetadiff().
Posterior probability at TV and MAV:
```r
# P(pi_t - pi_c > TV | data)
p_tv <- pbayespostpred1bin(
  prob = 'posterior', design = 'controlled', theta0 = 0.20,
  n_t = 12, n_c = 12, y_t = 8, y_c = 3,
  a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
  m_t = NULL, m_c = NULL, z = NULL,
  ne_t = NULL, ne_c = NULL, ye_t = NULL, ye_c = NULL,
  alpha0e_t = NULL, alpha0e_c = NULL,
  lower.tail = FALSE
)

# P(pi_t - pi_c > MAV | data)
p_mav <- pbayespostpred1bin(
  prob = 'posterior', design = 'controlled', theta0 = 0.05,
  n_t = 12, n_c = 12, y_t = 8, y_c = 3,
  a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
  m_t = NULL, m_c = NULL, z = NULL,
  ne_t = NULL, ne_c = NULL, ye_t = NULL, ye_c = NULL,
  alpha0e_t = NULL, alpha0e_c = NULL,
  lower.tail = FALSE
)

cat(sprintf("P(theta > TV | data)   = %.4f -> Go criterion (>= %.2f): %s\n",
            p_tv, 0.80, ifelse(p_tv >= 0.80, "YES", "NO")))
cat(sprintf("P(theta <= MAV | data) = %.4f -> NoGo criterion (>= %.2f): %s\n",
            1 - p_mav, 0.20, ifelse((1 - p_mav) >= 0.20, "YES", "NO")))
cat(sprintf("Decision: %s\n",
            ifelse(p_tv >= 0.80 & (1 - p_mav) < 0.20, "Go",
                   ifelse(p_tv < 0.80 & (1 - p_mav) >= 0.20, "NoGo",
                          ifelse(p_tv >= 0.80 & (1 - p_mav) >= 0.20,
                                 "Miss", "Gray")))))
```
Posterior predictive probability (future Phase III: $m_t = m_c = 40$):
```r
p_pred <- pbayespostpred1bin(
  prob = 'predictive', design = 'controlled', theta0 = 0.10,
  n_t = 12, n_c = 12, y_t = 8, y_c = 3,
  a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
  m_t = 40, m_c = 40, z = NULL,
  ne_t = NULL, ne_c = NULL, ye_t = NULL, ye_c = NULL,
  alpha0e_t = NULL, alpha0e_c = NULL,
  lower.tail = FALSE
)
cat(sprintf("Predictive probability (m_t = m_c = 40) = %.4f\n", p_pred))
```
Vectorised computation over all possible outcomes:
The function accepts vectors for y_t and y_c, enabling efficient
computation of the posterior probability for every possible outcome pair
$(y_t, y_c) \in \{0,\ldots,n_t\} \times \{0,\ldots,n_c\}$.
```r
grid <- expand.grid(y_t = 0:12, y_c = 0:12)
p_all <- pbayespostpred1bin(
  prob = 'posterior', design = 'controlled', theta0 = 0.20,
  n_t = 12, n_c = 12, y_t = grid$y_t, y_c = grid$y_c,
  a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
  m_t = NULL, m_c = NULL, z = NULL,
  ne_t = NULL, ne_c = NULL, ye_t = NULL, ye_c = NULL,
  alpha0e_t = NULL, alpha0e_c = NULL,
  lower.tail = FALSE
)

# Show results for a selection of outcome pairs
sel <- data.frame(y_t = grid$y_t, y_c = grid$y_c, P_gt_TV = round(p_all, 4))
head(sel[order(-sel$P_gt_TV), ], 10)
```
When no concurrent control group is enrolled, a hypothetical control responder count $z$ is specified from historical knowledge. The control posterior is then
$$\pi_c \mid z \;\sim\; \mathrm{Beta}(a_c + z,\; b_c + n_c - z),$$
where $n_c$ is the hypothetical control sample size and $z$ is the
assumed number of hypothetical responders ($0 \le z \le n_c$). Only the
treatment group data $y_t$ are observed; $y_c$ is set to NULL.
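For instance, with the Jeffreys prior and the hypothetical control used in the example below ($n_c = 12$, $z = 2$), the control posterior can be written down directly; this is plain arithmetic, not a package call:

```r
# Control posterior for the uncontrolled design: Beta(a_c + z, b_c + n_c - z)
a_c <- 0.5; b_c <- 0.5
n_c <- 12; z <- 2                         # hypothetical control data

a_c_star <- a_c + z                       # 2.5
b_c_star <- b_c + n_c - z                 # 10.5
round(a_c_star / (a_c_star + b_c_star), 4)  # implied control rate: 0.1923
```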
```r
# Hypothetical control: z = 2 responders out of n_c = 12
p_unctrl <- pbayespostpred1bin(
  prob = 'posterior', design = 'uncontrolled', theta0 = 0.20,
  n_t = 12, n_c = 12, y_t = 8, y_c = NULL,
  a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
  m_t = NULL, m_c = NULL, z = 2L,
  ne_t = NULL, ne_c = NULL, ye_t = NULL, ye_c = NULL,
  alpha0e_t = NULL, alpha0e_c = NULL,
  lower.tail = FALSE
)
cat(sprintf("P(theta > TV | data, uncontrolled, z=2) = %.4f\n", p_unctrl))
```
Effect of the hypothetical control count: varying $z$ shows the sensitivity to the assumed control rate.
```r
z_seq <- 0:12
p_z <- sapply(z_seq, function(z) {
  pbayespostpred1bin(
    prob = 'posterior', design = 'uncontrolled', theta0 = 0.20,
    n_t = 12, n_c = 12, y_t = 8, y_c = NULL,
    a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
    m_t = NULL, m_c = NULL, z = z,
    ne_t = NULL, ne_c = NULL, ye_t = NULL, ye_c = NULL,
    alpha0e_t = NULL, alpha0e_c = NULL,
    lower.tail = FALSE
  )
})
data.frame(z = z_seq, P_gt_TV = round(p_z, 4))
```
Historical or external data are incorporated via a power prior with borrowing weight $\alpha_{0ej} \in (0, 1]$. The augmented prior shape parameters are
$$a_j^* = a_j + \alpha_{0ej}\, y_{ej}, \qquad b_j^* = b_j + \alpha_{0ej}\,(n_{ej} - y_{ej}),$$
where $n_{ej}$ and $y_{ej}$ are the external sample size and responder
count for group $j$
(corresponding to ne_t/ne_c, ye_t/ye_c, and alpha0e_t/alpha0e_c).
These serve as the prior for the current PoC data,
yielding a closed-form posterior:
$$\pi_j \mid y_j \;\sim\; \mathrm{Beta}\!\left(a_j^* + y_j,\; b_j^* + n_j - y_j\right).$$
Setting $\alpha_{0ej} = 1$ corresponds to full borrowing (the external data are treated as if they came from the current trial); $\alpha_{0ej} \to 0$ recovers the result from the current data alone with the original prior.
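Using the external-data values from the example below ($n_{et} = 15$, $y_{et} = 5$, $\alpha_{0et} = 0.5$, Jeffreys base prior), the augmented prior and the final treatment-arm posterior can be verified by hand:

```r
a_t <- 0.5; b_t <- 0.5          # base (Jeffreys) prior
ne_t <- 15; ye_t <- 5           # external treatment data
alpha0e_t <- 0.5                # borrowing weight

# Power-prior-augmented prior shape parameters
a_star <- a_t + alpha0e_t * ye_t            # 0.5 + 2.5 = 3.0
b_star <- b_t + alpha0e_t * (ne_t - ye_t)   # 0.5 + 5.0 = 5.5

# Posterior after the current data: y_t = 8 of n_t = 12
a_post <- a_star + 8        # 11.0
b_post <- b_star + 12 - 8   # 9.5
c(a_post = a_post, b_post = b_post)
```

Note that the external data contribute an effective sample size of $\alpha_{0et}\, n_{et} = 7.5$ patients here, half the nominal external sample size.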
Here, external data for both groups (external design): $n_{et} = 15$, $y_{et} = 5$, $n_{ec} = 15$, $y_{ec} = 4$, borrowing weights $\alpha_{0et} = \alpha_{0ec} = 0.5$:
```r
p_ext <- pbayespostpred1bin(
  prob = 'posterior', design = 'external', theta0 = 0.20,
  n_t = 12, n_c = 12, y_t = 8, y_c = 3,
  a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
  m_t = NULL, m_c = NULL, z = NULL,
  ne_t = 15L, ne_c = 15L, ye_t = 5L, ye_c = 4L,
  alpha0e_t = 0.5, alpha0e_c = 0.5,
  lower.tail = FALSE
)
cat(sprintf("P(theta > TV | data, external, alpha0e=0.5) = %.4f\n", p_ext))
```
Effect of the borrowing weight: varying $\alpha_{0ec}$ (control group only) with fixed $\alpha_{0et} = 0.5$:
```r
# alpha0e must be in (0, 1]; avoid alpha0e = 0
ae_seq <- c(0.01, seq(0.1, 1.0, by = 0.1))
p_ae <- sapply(ae_seq, function(ae) {
  pbayespostpred1bin(
    prob = 'posterior', design = 'external', theta0 = 0.20,
    n_t = 12, n_c = 12, y_t = 8, y_c = 3,
    a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
    m_t = NULL, m_c = NULL, z = NULL,
    ne_t = 15L, ne_c = 15L, ye_t = 5L, ye_c = 4L,
    alpha0e_t = 0.5, alpha0e_c = ae,
    lower.tail = FALSE
  )
})
data.frame(alpha0e_c = ae_seq, P_gt_TV = round(p_ae, 4))
```
For given true response rates $(\pi_t, \pi_c)$, the operating characteristics are computed by exact enumeration over all $(n_t + 1)(n_c + 1)$ possible outcome pairs $(y_t, y_c) \in \{0,\ldots,n_t\} \times \{0,\ldots,n_c\}$:
$$\Pr(\mathrm{Go}) = \sum_{y_t=0}^{n_t} \sum_{y_c=0}^{n_c} \mathbf{1}\!\left[g_{\mathrm{Go}}(y_t, y_c) \ge \gamma_{\mathrm{go}} \;\text{ and }\; g_{\mathrm{NoGo}}(y_t, y_c) < \gamma_{\mathrm{nogo}}\right]\, \binom{n_t}{y_t} \pi_t^{y_t}(1-\pi_t)^{n_t-y_t}\, \binom{n_c}{y_c} \pi_c^{y_c}(1-\pi_c)^{n_c-y_c},$$
where
$$g_{\mathrm{Go}}(y_t, y_c) = P(\pi_t - \pi_c > \theta_{\mathrm{TV}} \mid y_t, y_c),$$ $$g_{\mathrm{NoGo}}(y_t, y_c) = P(\pi_t - \pi_c \le \theta_{\mathrm{MAV}} \mid y_t, y_c),$$
and the decision outcomes are classified as Go (only the Go criterion met), NoGo (only the NoGo criterion met), Gray (neither met), or Miss (both met):
```r
oc_ctrl <- pbayesdecisionprob1bin(
  prob = 'posterior', design = 'controlled',
  theta_TV = 0.30, theta_MAV = 0.15, theta_NULL = NULL,
  gamma_go = 0.80, gamma_nogo = 0.20,
  pi_t = seq(0.10, 0.80, by = 0.05),
  pi_c = rep(0.10, length(seq(0.10, 0.80, by = 0.05))),
  n_t = 12, n_c = 12,
  a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
  z = NULL, m_t = NULL, m_c = NULL,
  ne_t = NULL, ne_c = NULL, ye_t = NULL, ye_c = NULL,
  alpha0e_t = NULL, alpha0e_c = NULL,
  error_if_Miss = TRUE, Gray_inc_Miss = FALSE
)
print(oc_ctrl)
plot(oc_ctrl, base_size = 20)
```
The same function supports design = 'uncontrolled' (with argument z),
design = 'external' (with power prior arguments ne_t, ne_c, ye_t,
ye_c, alpha0e_t, alpha0e_c), and prob = 'predictive' (with future
sample size arguments m_t, m_c and theta_NULL). The function signature
and output format are identical across all combinations.
The getgamma1bin() function finds the optimal pair
$(\gamma_{\mathrm{go}}^*, \gamma_{\mathrm{nogo}}^*)$ by grid search over
$\Gamma = \{\gamma^{(1)}, \ldots, \gamma^{(K)}\} \subset (0, 1)$ at a
null scenario $(\pi_t^{\mathrm{null}}, \pi_c^{\mathrm{null}})$.
A two-stage precompute-then-sweep strategy is used:
Stage 1 (Precompute): all posterior/predictive probabilities $g_{\mathrm{Go}}(y_t, y_c)$ and $g_{\mathrm{NoGo}}(y_t, y_c)$ are computed once over all $(n_t + 1)(n_c + 1)$ outcome pairs — independently of $\gamma$.
Stage 2 (Sweep): for each candidate $\gamma \in \Gamma$, $\Pr(\mathrm{Go} \mid \gamma)$ and $\Pr(\mathrm{NoGo} \mid \gamma)$ are obtained as weighted sums of indicator functions:
$$\Pr(\mathrm{Go} \mid \gamma) = \sum_{y_t, y_c} \mathbf{1}\!\left[g_{\mathrm{Go}}(y_t, y_c) \ge \gamma\right]\, w(y_t, y_c),$$
$$\Pr(\mathrm{NoGo} \mid \gamma) = \sum_{y_t, y_c} \mathbf{1}\!\left[g_{\mathrm{NoGo}}(y_t, y_c) \ge \gamma\right]\, w(y_t, y_c),$$
where $w(y_t, y_c) = \mathrm{Binom}(y_t; n_t, \pi_t)\,\mathrm{Binom}(y_c; n_c, \pi_c)$
are the binomial weights under the null scenario. No further calls to
pbayespostpred1bin() are required at this stage.
The sweep is vectorised using outer(), making the full grid search
computationally negligible once Stage 1 is complete.
The optimal $\gamma_{\mathrm{go}}^*$ is the smallest value in $\Gamma$ such that $\Pr(\mathrm{Go} \mid \gamma_{\mathrm{go}}^*)$ satisfies the criterion against a target (e.g., $\Pr(\mathrm{Go}) < 0.05$). Analogously for $\gamma_{\mathrm{nogo}}^*$.
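The two-stage strategy for the Go threshold can be sketched in a self-contained way with base R (a re-implementation for illustration, not getgamma1bin() itself), assuming Jeffreys priors, $n_t = n_c = 12$, the calibration scenario $\pi_t = \pi_c = 0.10$, $\theta_{\mathrm{TV}} = 0.30$, and a 0.05 target:

```r
n_t <- 12; n_c <- 12
pi_null <- 0.10                     # null scenario: pi_t = pi_c = 0.10

# Posterior P(pi_t - pi_c > theta0 | y_t, y_c) via the convolution integral
g_go <- function(y_t, y_c, theta0 = 0.30, a = 0.5, b = 0.5) {
  at <- a + y_t; bt <- b + n_t - y_t
  ac <- a + y_c; bc <- b + n_c - y_c
  stats::integrate(function(x) pbeta(x - theta0, ac, bc) * dbeta(x, at, bt),
                   0, 1)$value
}

# Stage 1: precompute g over all 13 x 13 outcome pairs (gamma-independent)
G <- outer(0:n_t, 0:n_c, Vectorize(g_go))

# Binomial weights w(y_t, y_c) at the null scenario
w <- outer(dbinom(0:n_t, n_t, pi_null), dbinom(0:n_c, n_c, pi_null))

# Stage 2: sweep the candidate grid -- just weighted indicator sums
gamma_grid <- seq(0.01, 0.99, by = 0.01)
pr_go <- vapply(gamma_grid, function(g) sum(w[G >= g]), numeric(1))

# Smallest gamma with Pr(Go | gamma) below the 0.05 target
gamma_go_star <- gamma_grid[which(pr_go < 0.05)[1]]
```

By construction, `pr_go` is nonincreasing in $\gamma$: raising the threshold can only shrink the set of outcomes that trigger a Go.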
```r
res_gamma <- getgamma1bin(
  prob = 'posterior', design = 'controlled',
  theta_TV = 0.30, theta_MAV = 0.15, theta_NULL = NULL,
  pi_t_go = 0.10, pi_c_go = 0.10,
  pi_t_nogo = 0.30, pi_c_nogo = 0.10,
  target_go = 0.05, target_nogo = 0.20,
  n_t = 12L, n_c = 12L,
  a_t = 0.5, b_t = 0.5, a_c = 0.5, b_c = 0.5,
  z = NULL, m_t = NULL, m_c = NULL,
  ne_t = NULL, ne_c = NULL, ye_t = NULL, ye_c = NULL,
  alpha0e_t = NULL, alpha0e_c = NULL,
  gamma_grid = seq(0.01, 0.99, by = 0.01)
)
plot(res_gamma, base_size = 20)
```
The same function supports design = 'uncontrolled' (with pi_t_go,
pi_t_nogo, and z), design = 'external' (with power prior arguments),
and prob = 'predictive' (with theta_NULL, m_t, and m_c). The
calibration plot and the returned gamma_go/gamma_nogo values are
available for all combinations.
This vignette covered single binary endpoint analysis in BayesianQDM:
- pbetadiff() — exact posterior probabilities for the Beta difference $\pi_t - \pi_c$, for which no closed-form CDF exists.
- pbetabinomdiff() — exact posterior predictive probabilities via the Beta-Binomial double sum.
- getgamma1bin() — threshold calibration using outer() for a fully vectorised gamma sweep.

See vignette("single-continuous") for the analogous continuous endpoint analysis.