In Arhit-Chakrabarti/logisticLAGO: Learn-As-You-Go study with a single constraint and binary response

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Overview

In this vignette, I will show how this package can be used to study the design aspects or estimate 'optimal' intervention packages through simulations related to the LAGO design. I will briefly introduce the LAGO study, its goals and a brief section on the LAGO methodology, before moving on to the section on how to use this package.

Introduction

The U.S. Food and Drug Administration defines an adaptive design as "$\dots$ a clinical study design that allows for prospectively planned modifications based on accumulating study data without undermining the study’s integrity and validity" (FDA, 2016). Adaptive designs have been developed, studied and used in clinical trials for over a decade now. The "Learn-As-You-Go" (LAGO) (Nevo et. al. Analysis of "Learn-As-You-Go" (LAGO) Studies", The Annals of Statistics, 2021, 49(2):793-819.) design is a novel design, motivated by large scale public health intervention trials, where participants receive a complex multi-component intervention package which may be for e.g. a treatment, a device, a new way to organize care or a combination thereof. The study is conducted in multiple stages. One of the goals of the study is to prospectively modify the intervention package at each stage to develop the optimal intervention package to be administered at the next stage. The data collected at each step is analyzed to reassess and thereby modify the intervention package to be rolled out at the next stage and thus, the LAGO design is an adaptive design. The first goal of a LAGO study is to identify the optimal intervention package while minimizing the cost of the intervention package subject to the probability of a desired binary outcome being above a given threshold. The second goal of the study is to assess the effectiveness of the intervention package on the desired response.

Methodology

In this section, I present briefly the theoretical backdrop of the LAGO design that is used to develop the proposed R-package. The reader is referred to the original paper for a more comprehensive guide to the methodology of the LAGO design.

The multivariate intervention package consists of $p$ components. Let $X$ be the support of the intervention, that is, all possible intervention values. Let there be $\mathbf{K}$ stages in the study and at each stage $k$, a version of the intervention package is implemented in each of $J_k$ centers. Let $n_{jk}$ denote the sample size in the $j$-th center at stage $k$. For stage 1, an initial $\mathbf{x}^{(1)}$ is chosen by the investigator based on their best judgement. The actual intervention, denoted by $\mathbf{A}$ received by any participant at any stage differs from the recommended intervention due to local constraints or other preferences. It is assumed that the probability of success for a single unit i in a center j under intervention $\mathbf{A}=\mathbf{a}j$, $$p{a_j}(\beta) = Pr(Y_{ij} = 1| \mathbf{A} = \mathbf{a}j, \mathbf{X} = \mathbf{x}_j; \beta)$$ does not depend on the recommended intervention $\mathbf{x}_j$ , except through the actual intervention $\mathbf{a}_j$, and follows a logistic regression model $$logit(p{a_j}(\beta)) = \beta_0 + \beta_1 ^T \mathbf{a}_j$$ where $\beta^T = (\beta_0, \beta_1 ^T)$ is a vector of unknown parameters such that $\beta_1$ describes the effects of the p intervention package components. It is further assumed that in each stage, conditionally on all $\mathbf{a}_j$, outcomes are independent within and between centers.

A main goal of the LAGO design is to identify the optimal intervention package. Let $\tilde{p}$ be a pre-specified outcome probability goal and $C(\mathbf{x})$ be a known cost function. If $\beta$ were known, an optimal intervention for a center could be the solution to the center-specific optimization problem $$min_{\mathbf{x}j} C(\mathbf{x}_j) \ \ \ \text{subject to}\ \ p{x_j}(\beta) \geq \tilde{p} \ \ \& \ \ \mathbf{x}_j \in \mathbf{X}$$ However, as $\beta$ is almost always unknown, they are estimated from the data accumulated till any given stage and are used to solve the above stated optimization problem with the estimated $\hat{\beta}$ to obtain the estimated optimal intervention package. At any stage, since $\hat{\beta}$ depends on the data from the previous stage, and hence the estimated optimal intervention package depends on the data from previous stages and is hence random. The paper further develops the theoretical development of the optimization problem under the stated issue which I will not discuss here. For the problem of testing the effectiveness of the intervention package it has been shown that under some assumptions and regularity conditions, the estimated $\hat{\beta}$ is consistent and is asymptotically normally distributed. Thus the standard use of asymptotic Wald test for "no-intervention" effect is still valid under the LAGO design.

Implementation

library(logisticLAGO)

The first function in this package is the expit function which calculates the success probabilities for the logistic regression model for a given vector of covariates $x$ and a vector of parameter values $\beta$. The success probability is calculated as $\frac{1}{(1 + e^{-\beta'x})}$. If the lower and upper limits of the components of the intervention package are given by $x_l$ and $x_u$ then the success probabilities for a true $\beta_{*}$ are calculated as

beta_true <- c(log(0.05), log(1.2), log(1.1), log(1.3))
x.l <- c(1, 10, 2) # Lower limits for X
x.u <- c(4, 15, 15) # Upper limits for X
prob_lower <- expit(beta = beta_true, x = x.l)
prob_upper <- expit(beta = beta_true, x = x.u)

knitr::kable(matrix(c(prob_lower, prob_upper), nrow = 1), format = "html", digits = 3, caption = "Success probabilities",
             col.names = c("Outcome goal at lower limit", "Outcome goal at upper limit"))

If the desired outcome goal $\tilde{p}$ lies between the the probabilities i.e. the probabilities corresponding to $x_l$ and $x_u$, only then it is reasonable to expect that the LAGO optimization would yield an optimal intervention which attains the desired success goal, while minimizing the cost.

The logisticLAGO package can be used in two different settings, namely in a situation where a LAGO trial is ongoing and with the data being collected at any stage, the estimated optimal intervention package to be implemented in the next stage is desired and also in a situation before the starting of a LAGO study, where an idea about the optimal intervention package is desired. The two situations are considered separately over here.

In an ongoing LAGO study

At any stage of the study, once the data is available on the actual intervention received by the participants and their binary responses, they maybe used to estimate the parameters $\hat{\beta}$ using the accumulated data. The estimates $\hat{\beta}$, the intervention package rolled out at that stage $x$, along with information on their unit costs, upper and lower limits for the components of the intervention package may be used to estimate the optimal intervention package to be used in the next stage.

beta_est <- c( - 2.1350,  0.01899,  0.1989, 0.1273)
x.l <- c(1, 10, 2) # Lower limits for X
x.u <- c(4, 15, 15) # Upper limits for X
x.start <- c(2.5, 12.5, 7)

cost_lin = c(1, 8, 2.5)
p_bar = 0.85 # Defining the desired outcome goal
## Running the LAGO optimization algorithm
opt_lago = opt_int(cost = cost_lin, beta = beta_est, lower = x.l,
                   upper = x.u, pstar = p_bar, starting.value = x.start)

The results from the optimization are summarized in the table below, which gives the estimated optimal intervention package and the success probabilities under the estimated optimal intervention package.

opt.x <- matrix(c(opt_lago$Optimum_Intervention, opt_lago$Obtained_p), nrow = 1)
colnames(opt.x) <- c("$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$")
knitr::kable(opt.x, format = "html", digits = 3, caption = "Table 1")

Given the data, if, however, the desired outcome goal cannot be met, the LAGO optimization returns the estimated optimal intervention package which minimizes the implementation cost, while trying to maximize the probability of observing success at the next stage with the estimated optimal intervention package.

beta_est <- c( - 3.7350,  0.01899,  0.1989, 0.1273)
x.l <- c(1, 10, 2) # Lower limits for X
x.u <- c(4, 15, 15) # Upper limits for X

Note that with the given estimated $\hat{\beta}$, the desired outcome goal $\tilde{p} = 0.85$ cannot be met, as can be seen from the table below:

prob_lower <- expit(beta = beta_est, x = x.l)
prob_upper <- expit(beta = beta_est, x = x.u)
# To be used in the table below
x.start <- c(2.5, 12.5, 7)
cost_lin = c(1, 8, 2.5)
p_bar = 0.85 # Defining the desired outcome goal

knitr::kable(matrix(c(prob_lower, prob_upper), nrow = 1), format = "html", digits = 3, caption = "Success probabilities",
             col.names = c("Outcome goal at lower limit", "Outcome goal at upper limit"))

## Running the LAGO optimization algorithm
opt_lago1 = opt_int(cost = cost_lin, beta = beta_est, lower = x.l,
                   upper = x.u, pstar = p_bar, starting.value = x.start)

opt.x <- matrix(c(opt_lago1$Optimum_Intervention, opt_lago1$Obtained_p), nrow = 1)
colnames(opt.x) <- c("$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$")
knitr::kable(opt.x, format = "html", digits = 3, caption = "Table 2")

Before starting a LAGO study

The main purpose of this package is to consider the situation before starting of a LAGO study, wherein an investigator may be interested to know how the study would progress given the parameters of the trial. These parameters maybe obtained from historical or concurrent trials or maybe based on the best judgment of a subject matter expert. This package considers several LAGO designs including a single center design and multi-center design.

Single center LAGO design

The simplest case is when the trial is designed to be conducted in a single center or location and is conducted over multiple stages. The number of stages (nstages) in the LAGO design, sample size per stage (sample.size), the unit costs for the intervention package components (cost.vec), the vector of minimum (lower) and maximum (upper) values of the components of the intervention package, desired outcome goal (prob), the best guess intervention effect (beta.true) and the initial value of intervention package (x0) along with the expected variation in rolling out the intervention package (icc) are used to simulate the estimated intervention package over the different stages, the estimated outcome goal and the estimated power of the test of "no-intervention" effect at the end of the study.

x.init = c(2.5, 12.5, 7) # Initial value interventions
x.l = c(1, 10, 2) # Lower limits for X
x.u = c(4, 15, 15) # Upper limits for X
n = 50 # Sample size
K = 3 # Number of stages
cost_lin = c(1, 8, 2.5) # Costs
p_bar = 0.9 # Desired outcome goal
# True/best guess beta values
beta = c(log(0.05), log(1.1), log(1.35), log(1.2))

sim_sc <- sc_lago(x0 = x.init, lower = x.l, upper = x.u,
               nstages = K, beta.true = beta,
               sample.size = n, icc = 0.1,
               cost.vec = cost_lin, prob = p_bar,
               B = 100, intercept = TRUE)

The stage wise results from the optimization are as below

results1 <- data.frame(sim_sc$xopt, sim_sc$p.opt.hat, c(sim_sc$power, "", ""))
results1 <- cbind(c("Stage 1", "Stage 2", "Stage 3"), results1)
rownames(results1) <- NULL
colnames(results1) <- c("Stage", "$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$", "Power")
knitr::kable(results1, format = "html", digits = 3, caption = "Table 3")

Multi-center LAGO design

The multi-center LAGO design extends the idea of the single center LAGO study to the case where the study is conducted in multiple centers or locations and pools the data together at each stage before proceeding with the optimization.

Multi-center LAGO with equal number of centers per stage and equal sample size in each center

This is is a special case of multi-center LAGO with equal number of centers and equal sample size at each center at each stage of the design. The mc_lago function in this package can be used to simulate the data for this design and perform the optimization. Apart from the information as required by the single center design, the number of centers per stage (centers) and sample size per center per stage (sample.size) is used to simulate the optimal intervention package to be rolled out in each of the centers in the next stage such that the goals of the study are met. As before, estimated power for the test of "no-intervention" effect at the end of the study is also provided through simulations.

x.init = c(2.5, 12.5, 7) # Initial value interventions
x.l = c(1, 10, 2) # Lower limits for X
x.u = c(4, 15, 15) # Upper limits for X
njk = 20 # Sample size per center per stage
K = 3 # Number of stages
J = 3 # Number of centers per stage
cost_lin = c(1, 8, 2.5) # Costs
p_bar = 0.9 # Desired outcome goal
# True/best guess beta values
beta = c(log(0.05), log(1.1), log(1.35), log(1.2))
# Running the Multi-center LAGO simulation study
sim_mc <- mc_lago(x0 = x.init, lower = x.l, upper = x.u,
                   beta.true = beta, nstages = K,
                   centers = J, sample.size = njk,
                   icc = 0.1, prob = p_bar,
                   cost.vec = cost_lin)

The stage wise results from the multi-center LAGO optimization are as below

results2 <- data.frame(sim_mc$xopt, sim_mc$p.opt.hat, c(sim_mc$power, "", ""))
results2 <- cbind(c("Stage 1", "Stage 2", "Stage 3"), results2)
rownames(results2) <- NULL
colnames(results2) <- c("Stage", "$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$", "Power")
knitr::kable(results2, format = "html", digits = 3, caption = "Table 4")

Multi-center LAGO with unequal number of centers per stage and equal sample size in each center

In any trial, it is intuitive to conduct the study in small number of centers at the beginning i.e. at the pilot stage and progressively increase the number of centers as the trial progresses from the pilot stage. The mc_lago_uc function considers different number of centers in each stage. However every center at any stage have equal samples. The function arguments are the same as the mc_lago function expect, that the argument centers is a vector of same dimension as nstages denoting the number of centers in each stage of the design. This function also differentiates between center and within center variations while considering simulations.

J_uc = c(3, 5, 10) # Number of centers per stage
# Running the Multi-center LAGO simulation study
sim_mc_uc <- mc_lago_uc(x0 = x.init, lower = x.l,
              upper = x.u, nstages = K, centers = J_uc,
              sample.size = njk, cost.vec = cost_lin,
              prob = p_bar, beta.true = beta,
              icc = 0.1, bcc = 0.15)

The results of which are shown below

results3 <- data.frame(sim_mc_uc$xopt, sim_mc_uc$p.opt.hat, c(round(sim_mc_uc$power, 2), "", ""))
results3 <- cbind(c("Stage 1", "Stage 2", "Stage 3"), results3)
rownames(results3) <- NULL
colnames(results3) <- c("Stage","$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$", "Power")
knitr::kable(results3, format = "html", digits = 3, caption = "Table 5")

Multi-center LAGO with unequal number of centers per stage and unequal sample size in each center

This is the general setup of any multi-center LAGO study, wherein apart from unequal number of centers at each stage, an unequal number of samples are considered at each center of a stage. At the beginning of a trial, apart from less number of centers, less number of participants are allocated to the trial and they are increased as the trail proceeds through the different stages and more data on the optimal intervention and response trend is available. The mc_lago_uc.us function considers different number of centers and sample size at each stage.

This function requires a list of sample sizes at each center of every stage.

J_uc = c(3, 5, 10) # Number of centers per stage
sample_size = list() # Initialize sample size per center at each stage
sample_size[[1]] = 20 # Sample size per center at stage 1 is 20
sample_size[[2]] = 25 # Sample size per center at stage 2 is 25
sample_size[[3]] = 30 # Sample size per center at stage 3 is 30
# Running the Multi-center LAGO simulation study
sim_mc_uc.us <- mc_lago_uc.us(x0 = x.init, lower = x.l,
                 upper = x.u, nstages = K, centers = J_uc,
                 sample.size = sample_size, cost.vec = cost_lin,
                 prob = p_bar, beta.true = beta,
                 icc = 0.1, bcc = 0.15)

The results are summarized below.

results4 <- data.frame(sim_mc_uc.us$xopt, sim_mc_uc.us$p.opt.hat, c(round(sim_mc_uc.us$power, 2), "", ""))
results4 <- cbind(c("Stage 1", "Stage 2", "Stage 3"), results4)
rownames(results4) <- NULL
colnames(results4) <- c("Stage","$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$", "Power")
knitr::kable(results4, format = "html", digits = 3, caption = "Table 6")

As a comparison of how the three methods compare, the results from the three different multi-center LAGO designs are summarized and for comparison they are based on the same total sample size. All other parameters remain the same as before except number of centers per stage and the sample size per center.

For the multi-center design with equal number of centers and equal sample size per center, we take sample size of 50 in each of the 3 centers per stage.
For the multi-center design with unequal number of centers in each stage with equal sample size per center, we take sample size of 25. The number of centers are
3 in stage 1
5 in stage 2
10 in stage 3
For the multi-center design with both unequal number of centers in each stage and unequal sample size per center, we take the number of centers to be the same as above, but the sample size in each of the centers in the three stages as
15 in stage 1
21 in stage 2
30 in stage 3

njk = 50
sim_mc <- mc_lago(x0 = x.init, lower = x.l, upper = x.u,
                   beta.true = beta, nstages = K,
                   centers = J, sample.size = njk,
                   icc = 0.1, prob = p_bar,
                   cost.vec = cost_lin)

njk = 25
J_uc = c(3, 5, 10) # Number of centers per stage
# Running the Multi-center LAGO simulation study
sim_mc_uc <- mc_lago_uc(x0 = x.init, lower = x.l,
              upper = x.u, nstages = K, centers = J_uc,
              sample.size = njk, cost.vec = cost_lin,
              prob = p_bar, beta.true = beta,
              icc = 0.1, bcc = 0.15)

J_uc = c(3, 5, 10)
sample_size = list() # Initialize sample size per center at each stage
sample_size[[1]] = 15 # Sample size per center at stage 1 is 20
sample_size[[2]] = 21 # Sample size per center at stage 2 is 25
sample_size[[3]] = 30 # Sample size per center at stage 3 is 30
# Running the Multi-center LAGO simulation study
sim_mc_uc.us <- mc_lago_uc.us(x0 = x.init, lower = x.l,
                 upper = x.u, nstages = K, centers = J_uc,
                 sample.size = sample_size, cost.vec = cost_lin,
                 prob = p_bar, beta.true = beta,
                 icc = 0.1, bcc = 0.15)

The results are summarized as below:

results5 <- data.frame(sim_mc$xopt, sim_mc$p.opt.hat, c(sim_mc$power, "", ""))
colnames(results5) <- c("$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$", "Power")
results6 <- data.frame(sim_mc_uc$xopt, sim_mc_uc$p.opt.hat, c(round(sim_mc_uc$power, 2), "", ""))
colnames(results6) <- c("$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$", "Power")
results7 <- data.frame(sim_mc_uc.us$xopt, sim_mc_uc.us$p.opt.hat, c(round(sim_mc_uc.us$power, 2), "", ""))
colnames(results7) <- c("$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$", "Power")
result = rbind(results5, results6, results7)
rownames(result) = NULL
result_full = cbind(c("Equal number of centers and samples", "", "", "Only uequal number of centers", "", "", "Unequal number of centers and samples", "", ""),
                    rep(c("Stage 1", "Stage 2", "Stage 3"), 3), 
                    result)
colnames(result_full) = c("Case"," Stage", "$\\hat{x}_1$", "$\\hat{x}_2$", "$\\hat{x}_3$", "$\\hat{p}$", "Power")

knitr::kable(result_full, format = "html", digits = 3, caption = "Table of comparisons for the different multi-center LAGO designs")

Details

For more information on logisticLAGO Package, please access the package documentations. Please feel free to contact the author.

Arhit-Chakrabarti/logisticLAGO documentation built on Dec. 17, 2021, 9:43 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Arhit-Chakrabarti/logisticLAGO
Learn-As-You-Go study with a single constraint and binary response

In Arhit-Chakrabarti/logisticLAGO: Learn-As-You-Go study with a single constraint and binary response

Overview

Introduction

Methodology

Implementation

In an ongoing LAGO study

Before starting a LAGO study

Single center LAGO design

Multi-center LAGO design

Multi-center LAGO with equal number of centers per stage and equal sample size in each center

Multi-center LAGO with unequal number of centers per stage and equal sample size in each center

Multi-center LAGO with unequal number of centers per stage and unequal sample size in each center

Details

R Package Documentation

Browse R Packages

We want your feedback!

Arhit-Chakrabarti/logisticLAGO Learn-As-You-Go study with a single constraint and binary response

In Arhit-Chakrabarti/logisticLAGO: Learn-As-You-Go study with a single constraint and binary response

Overview

Introduction

Methodology

Implementation

In an ongoing LAGO study

Before starting a LAGO study

Single center LAGO design

Multi-center LAGO design

Multi-center LAGO with equal number of centers per stage and equal sample size in each center

Multi-center LAGO with unequal number of centers per stage and equal sample size in each center

Multi-center LAGO with unequal number of centers per stage and unequal sample size in each center

Details

R Package Documentation

Browse R Packages

We want your feedback!

Arhit-Chakrabarti/logisticLAGO
Learn-As-You-Go study with a single constraint and binary response