knitr::opts_chunk$set(echo = TRUE)
sievetrend
The sievetrend
repository is available for download as an R
package and may be
downloaded directly from GitHub as follows.
# install sievetrend from GitHub devtools::install_github("benkeser/sievetrend") library(sievetrend) # also will require package survtmle if(!("survtmle" %in% row.names(installed.packages()))){ install.packages("survtmle") } library(survtmle)
Below, we include the code that was used to perform the simulation study and to analyze the RTS,S data for the manuscript.
Here, we demonstrate how to obtain results for a simulated data set.
# sample size n <- 1000 # set random seed set.seed(1234) # simulate data set dat <- makeData(n = n) # get the formula for the empirical estimate of the # censoring distribution needed for input to survtmle glm.ctime <- get.ctimeForm(trt = dat$trt, site = dat$adjustVars$site, ftime = dat$ftime, ftype = dat$ftype) # call survtmle to estimate cumulative incidence object <- survtmle(ftime = dat$ftime, ftype = dat$ftype, adjustVars = dat$adjustVars, trt = dat$trt, glm.trt = "1", glm.ftime = "trt*factor(site)", glm.ctime = glm.ctime, method = "mean", t0=6) # call trend_test to estimate projection trend <- trend_test(object) trend
A numerical approximation of the true value of $\beta_{0,n}$ may be obtained as follows.
# get covariance matrix estimate nabla_g <- sievetrend:::grad_g(object$est) Upsilon_n <- nabla_g %*% cov(Reduce(cbind, object$ic)) %*% t(nabla_g) # call getTruth function with this estimate several times # and average (to increase accuracy) beta_0n <- rowMeans(replicate(20, getTruth(Upsilon = Upsilon_n, n = 1e6)))[2] # compare estimate to truth c(est = trend$beta, truth = beta_0n)
The full code used to execute the simulation study is included in the simulation subdirectory. This includes the script cent.R
, which is batched to a slurm via
the script sce.sh
. The function makeIllustrationPlot
was used to produce Figure 1.
Due to existing privacy agreements, it is difficult to obtain access to the real RTS,S data. Nevertheless, a mock data set is distributed with the survtmle
package that will serve as illustration for the outputation methodology. See ?rtss
for further description of the data set.
The rtss
data set is a list of ten data sets each representing an outputed data set. They are formatted for analysis of a binary genetic mark, so we first replace the failure type column with a simulated genetic distance.
# replace the ftype column in each data set rtss_mod <- lapply(rtss, function(data){ # which were observed failures fail_idx <- which(data$ftype > 0) # number of observed failures fail_n <- length(fail_idx) # replace with a simulated version data$ftype[fail_idx] <- rbinom(fail_n, 4, plogis(data$vaccine)) + 1 # collapse site variable into a single column data$site <- as.numeric(1*(data$site1==1) + 2*(data$site2==1) + 3*(data$site3==1) + 4*(data$site4==1) + 5*(data$site5==1)) return(data) }) # check out new ftype distribution for first # outputed data set table(rtss_mod[[1]]$ftype)
Now we use survtmle
to estimate the cumulative incidence for each data set.
rslt <- lapply(rtss_mod, function(data){ glm.ctime <- get.ctimeForm(trt = data$trt, site = data$site, ftime = data$ftime, ftype = data$ftype) object <- survtmle(ftime = data$ftime, ftype = data$ftype, adjustVars = data[,"site",drop=FALSE], trt = data$vaccine, glm.trt = "1", glm.ftime = "trt*factor(site)", glm.ctime = glm.ctime, method = "mean", t0=6) return(object) })
We can use the getMO
function to obtain the averaged cumulative incidence
results and apply the trend_test on this scale.
# obtain averaged results mo_rslt <- getMO(rslt) # apply trend test mo_trend <- trend_test(mo_rslt) mo_trend
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.