#' Expected lower and expected upper limit of the confidence intervals of variant and vaccine-specific efficacy based on cumulative-risk ratio.
#' @description
#' The function `get_cohort_expectedCI_VE_crr` simulates confindence intervals for variant and vaccine-specific efficacy (VE) for a given sample size.
#' The efficacy is defined as `VE = 1 - cumulative-risk ratio`, and the function returns expected lower and expected upper confidence interval limit
#' for both absolute and relative VE.
#'
#' @param anticipated_VE_for_each_brand_and_variant a matrix of vaccine efficacy of each vaccine (row) against each variant (column). Each value must be a real number between 0 and 1.
#' @param brand_proportions_in_vaccinated a vector denoting the proportion in which vaccines are given in the vaccinated subjects of the study cohort. Each value of this vector must be a real number between 0 and 1 and the sum of the values of this vector must be equal to 1.
#' @param overall_vaccine_coverage the proportion of the study cohort that will be vaccinated. It should be a real number between 0 and 1.
#' @param proportion_variants_in_unvaccinated_cases a vector of the proportions in which each variant is expected to be present in the unvaccinated and infected subjects in the study cohort. Each value of this vector must be a real number between 0 and 1 and the sum of the values of this vector must be equal to 1.
#' @param overall_attack_rate_in_unvaccinated the proportion of the study cohort that is expected to infected over the study period. It should be a real number between 0 and 1.
#' @param calculate_relative_VE a logical indicating if calculations should also be done for relative vaccine efficacy (default `TRUE`).
#' @param alpha controls the width `[100*alpha/2, 100*(1 - alpha/2)]%` of the confidence interval. It is a numeric that must take a value between 0 and 1.
#' @param confounder_adjustment_Rsquared we use this parameter to adjust the calculations for potential confounders using the methodology proposed by Hsieh and Lavori (2000). It represents the amount of variance (R^2) explained in a regression model where vaccination status is the outcome and confounders of interest are predictors. It is a numeric that must take a value between 0 (no adjustment for confounders) and 1.
#' @param prob_missing_data to adjust the calculations for non-informative and random subject loss to follow-up/dropout. it should take a numeric value between 0 and 1.
#' @param total_subjects a vector of study cohort size for which calculations should be done.
#' @param nsims total number of Monte Carlo simulations conducted.
#'
#' @details
#' In this function efficacy is defined as `VE = 1 - cumulative-risk ratio`, where 'cumulative-risk ratio' is
#' the ratio of cumulative-risk of being a case of a particular variant/variant among the groups being compared.
#' When the groups being compared are a particular vaccine versus placebo then we call the VE
#' as the absolute VE of the vaccine. For `M` vaccines there are `M` absolute VE, one each for the `M` vaccines.
#' When the groups being compared are a particular vaccine versus another vaccine then we call the VE
#' as the relative VE of the vaccines, for a particular variant. For `M` vaccines and `I` variants there are `I x 2 x utils::combn(M, 2)`
#' permutations of relative VE of two vaccines against the same variant.
#'
#' We first transform the user inputs for `I` variants and `M` vaccines into a `(I + 1) x (M + 1)` cross table of
#' cumulative-risks of being a case or a control over the study period. The overall sum of all cumulative-risks,
#' i.e., all cells, of this table is 1. The first row of our cumulative-risk table contain cumulative-risk of being a control.
#' The first column corresponds to subjects who are unvaccinated.
#' Thus, the cell `{1,1}` contains the probability (cumulative-risk) that over the study period a subject will be a control and unvaccinated.
#' The remaining `Ì` rows correspond to subjects who are cases of a particular variant/variant of the pathogen,
#' and the remaining `M` columns correspond to subjects who are vaccinated with a particular vaccine.
#'
#' The next step is to simulate the data. To speed up our computations we sample an `(I + 1) x (M + 1)`
#' cross table of data from a multinomial distribution with probabilities taken from our cumulative-risk table.
#' The total subjects sampled in the cross table are are `total_subjects * (1 - prob_missing_data)`.
#' We then estimate the absolute and relative VE of each vaccine using the cumulative-risks based on the sampled data.
#' The confidence intervals with widths `[100*alpha/2, 100*(1 - alpha/2)]%` are obtained using normal approximation
#' to the distribution of log of cumulative-risk ratio (Morris and Gardner, 1988).
#' To adjust for confounders, the standard-error used in the confidence interval is rescaled to `SE/(1 - confounder_adjustment_Rsquared)` (Hsieh and Lavori, 2000)
#' We repeat this procedure `nsims` times, and in each such simulation we obtain `nsims` confidence intervals.
#'
#' To conduct simulations faster all the calculations are done without using for loops.
#' Instead we use a three-dimensional R arrays, with one-dimension for `nsims`,
#' another for the sample size vector `total_subjects`,
#' and another for a vector containing the flattened cross table of simulated data on cases and controls.
#'
#' @return A data frame consisting of the input parameters, absolute and relative VE combinations,
#' and the expected lower and expected upper width of the confidence intervals for each absolute and relative VE combination.
#'
#' @examples As an example we recommend running the function without passing any parameter to it.
#' The default scenario is for three vaccines and three pathogen variants.
#'
#' @references
#' 1. Hsieh, F. Y., & Lavori, P. W. (2000). Sample-size calculations for the Cox proportional hazards regression model with nonbinary covariates. Controlled clinical trials, 21(6), 552-560.
#' 2. Morris, J. A., & Gardner, M. J. (1988). Statistics in medicine: Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates. British medical journal (Clinical research ed.), 296(6632), 1313.
#' @importFrom utils combn
#' @export
get_cohort_expectedCI_VE_crr = function(anticipated_VE_for_each_brand_and_variant=
matrix(data=c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1), nrow = 3, ncol = 3, byrow = T,
dimnames = list(paste0('brand', 1:3), paste0('variant', 1:3))),
brand_proportions_in_vaccinated =
c('brand1'=0.3, 'brand2'=0.5, 'brand3'=0.2),
overall_vaccine_coverage=0.3,
proportion_variants_in_unvaccinated_cases = c('variant1'=0.6, 'variant2'=0.3, 'variant3'=0.1),
overall_attack_rate_in_unvaccinated = 0.1,
calculate_relative_VE = T,
alpha=0.05,
confounder_adjustment_Rsquared = 0,
prob_missing_data = 0.1,
total_subjects=seq(1000, 10000, 25),
nsims = 500){
check_input(anticipated_VE_for_each_brand_and_variant, brand_proportions_in_vaccinated, proportion_variants_in_unvaccinated_cases)
total_vaccine_brands = length(brand_proportions_in_vaccinated)
total_case_variants = length(proportion_variants_in_unvaccinated_cases)
relative_VE_combn = get_vaccine_comparison_combinations(total_vaccine_brands, total_case_variants, calculate_relative_VE)
prob_tables = get_cohort_full_table_crr(anticipated_VE_for_each_brand_and_variant, overall_vaccine_coverage, overall_attack_rate_in_unvaccinated,
proportion_variants_in_unvaccinated_cases, brand_proportions_in_vaccinated)
flat_full_table = c(prob_tables$full_table)
flat_conditional_table = c(prob_tables$conditional_table)
total_total_subject_settings = length(total_subjects)
missing_data_adjusted_total_subjects = round(total_subjects * (1-prob_missing_data))
cell_counts_sims = array(data = do.call('cbind', lapply(missing_data_adjusted_total_subjects, rmultinom, n=nsims, prob=flat_full_table)),
dim = c((1+total_vaccine_brands)*(1+total_case_variants), nsims, total_total_subject_settings))
#summarizing the counts per vaccine among cases. so want to add over all variants per vaccine
cell_counts_sims_total_by_brand = array(
c(
apply(cell_counts_sims, c(3), function(x){
cumsum_overall = rbind(0,colCumsums(x))
a = seq(1, nrow(x), by = 1 + total_case_variants)
b = seq(1 + total_case_variants, nrow(x), by = 1 + total_case_variants)
return(cumsum_overall[b + 1,] - cumsum_overall[a,])
})
),
dim = c(1+total_vaccine_brands, nsims, total_total_subject_settings)
)
#cell_counts_sims = rmultinom(n=nsims, size=missing_data_adjusted_total_subjects, prob = c(full_table))
#Haldane's correction
cell_counts_sims[cell_counts_sims==0] = 0.5
cell_counts_sims_total_by_brand[cell_counts_sims_total_by_brand==0] = 0.5
CONTROL_ROW = 1
brand1_denominator_indices = relative_VE_combn[BRAND1,] + CONTROL_ROW
brand1_numerator_indices = relative_VE_combn[BRAND1,]*(1 + total_case_variants) + CONTROL_ROW + relative_VE_combn[VARIANT,]
brand2_denominator_indices = relative_VE_combn[BRAND2,] + CONTROL_ROW
brand2_numerator_indices = relative_VE_combn[BRAND2,]*(1 + total_case_variants) + CONTROL_ROW + relative_VE_combn[VARIANT,]
#dimensions of the following are ncol(relative_VE_combn) x nsims
numerator_vaccine1 = cell_counts_sims[brand1_numerator_indices,,,drop=F]
denominator_vaccine1 = cell_counts_sims_total_by_brand[brand1_denominator_indices,,,drop=F]
numerator_vaccine2 = cell_counts_sims[brand2_numerator_indices,,,drop=F]
denominator_vaccine2 = cell_counts_sims_total_by_brand[brand2_denominator_indices,,,drop=F]
estimate = (numerator_vaccine1/denominator_vaccine1)/(numerator_vaccine2/denominator_vaccine2)
standard_error = sqrt(1/numerator_vaccine1 + 1/numerator_vaccine2 - 1/denominator_vaccine1 - 1/denominator_vaccine2)
confounder_adjusted_standard_error = standard_error / (1-confounder_adjustment_Rsquared)
log_estimate = log(estimate)
quantile_times_adjusted_standard_error = qnorm(1-alpha/2) * confounder_adjusted_standard_error
low = exp(log_estimate - quantile_times_adjusted_standard_error)
upp = exp(log_estimate + quantile_times_adjusted_standard_error)
#expected_VE = rowMeans(1-estimate, na.rm=T)
expected_VE = apply(X = 1-estimate, MARGIN = 3, FUN = rowMeans, na.rm=T)
avg_lower_limit = apply(X = 1-upp, MARGIN = 3, FUN = rowMeans, na.rm=T)
avg_upper_limit = apply(X = 1-low, MARGIN = 3, FUN = rowMeans, na.rm=T)
anticipated_VE = 1 - flat_conditional_table[brand1_numerator_indices]/flat_conditional_table[brand2_numerator_indices]
ret = data.frame(vaccine_1 = rep(paste("Brand", relative_VE_combn[BRAND1,]), total_total_subject_settings),
vaccine_2 = rep(ifelse(relative_VE_combn[BRAND2,]==0,
no = paste("Brand", relative_VE_combn[BRAND2,]),
yes = "Unvaccinated"), total_total_subject_settings),
variant = rep(paste("Variant", relative_VE_combn[VARIANT,]), total_total_subject_settings),
total_subjects = rep(total_subjects, each=ncol(relative_VE_combn)),
anticipated_VE = anticipated_VE,
expected_VE = c(expected_VE),
bias_VE = c(expected_VE - anticipated_VE),
avg_lower_limit = c(avg_lower_limit),
avg_upper_limit = c(avg_upper_limit),
alpha = alpha,
overall_vaccine_coverage = overall_vaccine_coverage,
overall_attack_rate_in_unvaccinated = overall_attack_rate_in_unvaccinated,
confounder_adjustment_Rsquared = confounder_adjustment_Rsquared,
prob_missing_data = prob_missing_data
)
ret = cbind(ret,
t(data.frame(brand_proportions_in_vaccinated,
row.names = paste0('brand_prop_', 1:total_vaccine_brands))),
t(data.frame(proportion_variants_in_unvaccinated_cases,
row.names = paste0('variant_prop_', 1:total_case_variants))))
return(ret)
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.