#' @title Incremental Mixture Approximate Bayesian Computation (IMABC)
#'
#' @description Calibrates a model using the IMABC algorithm.
#'
#' @param target_fun A function that generate target values given parameters (i.e., `the model'). The use of
#' define_target_function is stronlgy advised to ensure that the function takes in the correct values and
#' correctly returns results.
#' @param priors A priors object created using define_priors. This contains information regarding the parameters that
#' are being calibrated. Is ignored if starting from previous results.
#' @param targets A targets object created using define_targets. This contains information regarding the target values
#' which will be used to evaluate simulated parameters. Is ignored if starting from previous results.
#' @param N_start numeric(1). The number of draws to simulate for the first iteration.
#' @param N_centers numeric(1). The number of centers to use for exploring the parameter space.
#' @param Center_n numeric(1). The number of points to add around each center
#' @param N_cov_points numeric(1). The minimum number of points used to estimate the covariance matrix of valid parameters
#' nearest each center point. The covariance matrix is used when simulating new parameter draws around the center. If 0
#' (default), uses 25*number of parameters.
#' @param N_post numeric(1). The weighted sample size that must be achieved using valid parameter values in order to stop
#' algorithm.
#' @param sample_inflate numeric(1). When generating new results for a given center, how many additional samples should be
#' simulated to ensure enough valid (within range) parameters draws are simulated for the center.
#' @param max_iter numeric(1). The maximum number of iterations to attempt.
#' @param seed numeric(1). The seed to set for reproducibility.
#' @param latinHypercube logical(1). Should algorithm use a Latin Hypercube to generate first set of parameters.
#' @param backend_fun function. For advanced users only. Lets to user evaluate the target function(s) using their own
#' backend, i.e., simulate targets with an alternative parallel method. Only necessary if the backend method is
#' not compatible with foreach. See details for requirements.
#' @param output_directory character(1). Path to save results to. If NULL (default), no results are saved. If a path is
#' provided results are saved/updated every iteration. See details for more information.
#' @param output_tag character(1). Tag to add to result files names. "timestamp" (default) is a special code that adds
#' the time and date the code was executed.
#' @param previous_results_dir Optional character(1). Path to results stored during a previous run. If the user wishes to
#' restart a run that didn't complete the calibration, they can continue by using the outputs stored during the previous
#' run.
#' @param previous_results_tag Optional character(1). The tag that was added to the previous run output files.
#' @param verbose logical(1). Prints out progress messages and additional information as the model works.
#' @param validate_run logical(1). If this is TRUE and an output_directory is specified, the function will save all
#' parameters generated by the model - even ones that were deemed invalid based on their simulated targets.
#'
#' @details
#' The user specifies the calibrated parameters, their prior distributions, calibration targets with initial and
#' final acceptance intervals, and the function (i.e., the model) used to generate targets given calibrated parameters
#' The algorithm begins by drawing a sample of vectors from the parameter space based on prior distributions.
#' This initial sample can be drawn using a Latin hypercube. The algorithm identifies and retains parameter
#' vectors that result in generated targets that are within the current acceptance intervals. The algorithm iteratively
#' updates this sample and narrows the acceptance intervals until either 1) the algorithm reaches the final
#' acceptance intervals around each target and identifies the requested sample of parameter vectors that generate
#' targets within these acceptance intervals, or the algorithm completes the maximum number of iterations.
#' The algorithm can be restarted to continue iterating.
#'
#' A technical description of the imabc algorithm is provided in
#' Rutter CM, Ozik J, DeYoreo M, Collier N. Microsimulation model calibration using incremental mixture approximate
#' Bayesian computation. Ann. Appl. Stat. 13 (2019), no. 4, 2189-2212. doi:10.1214/19-AOAS1279.
#' https://projecteuclid.org/euclid.aoas/1574910041.
#'
#' The imabc package implements a small modification to the approach described in the 2019 AOAS paper. In the imabc package,
#' the user specifies initial and final acceptance intervals directly. This approach is more flexible than the
#' approach described in the paper and more easily incorporates asymmetric acceptance intervals.
#'
#' ## N_cov_points relation to the number of parameters:
#' When the algorithm has enough quality draws, it estimates the covariance between parameters and uses these relations
#' in order to improve future simulations of parameters. However, this can only work if the covariance matrix is not
#' singular. When a covariance matrix is singular, imabc will replace it with an independent covariance matrix (a diagonal
#' matrix of the variances of the parameters) to avoid any calculation errors. Setting N_cov_points to be less than the
#' number of parameters will lead to singularness in a covariance matrix. The algorithm can still run but will be not as
#' efficient or may not be able to calibrate completely.
#'
#' ## Custom Backend Function:
#' The primary run handler takes each row from the simulated draws and provides the appropriate information to the
#' target_fun function as inputs. This includes pulling the parameter values as a named vector, pulling a unique seed
#' generated for each set of parameters, as well as passing the current priors and targets objects. This is done using
#' the foreach function from the foreach package. This allows the user to register their own preferred parallel backend
#' before running the imabc function so long as it is compatible with foreach. If the user does not provide a parallel
#' backend, foreach will run the analysis in sequence by default and provide a warning indicating such the first time
#' the imabc function is run within a session.
#'
#' However, since not all parallel backends are compatible with this method, we have provided a way for the user to add
#' their own run handling method. To utilize this feature, the user must create a function that meets a couple requirements
#' in order to work properly.
#'
#' The first requirement is that the backend function have inputs in the following order:
#' the data.table of all parameters to be evaluated, the names of all the parameters being calibrated, the target function
#' to be used for evaluating parameters, a list that includes the priors object and the targets object. The user can name
#' these inputs whatever they prefer but the correct order and number of inputs will be expected (i.e. the user must create
#' a function with four inputs, the first will be the parameter data.table, and so on.). The user can utilize any piece
#' of info passed to these parameters as well. This includes unique seed values passed as a column of the parameter data.table
#' (called "seed"), and the current targets and priors objects passed in the fourth input. The priors and targets objects
#' are named priors and targets respectively in the fourth input list.
#'
#' The last requirement is that the returned object be a data.table of simulated target values. Each row represents a set
#' of results from the target_fun for a given set of parameters and each column represents a target value based on the
#' targets object. If the final output of the custom backend returns a data.table with column names identical to the
#' target names, the order of the columns will be verified by imabc. If the final output of the backend does not include
#' column names that match the target names, the user must ensure that they are in the same order as the targets object.
#' If they are not in the appropriate order, information may be attached to the wrong target and lead to errors.
#'
#' Do not use the custom backend unless you are confident you understand what is expected of the run handler. To get a
#' better understanding of what is being done run View(imabc:::run_handler) in the console to see how the backend_fun is
#' being used.
#'
#' ## Output Files:
#' If an output directory is specified files are saved for each of the objects returned by the function. They are named
#' as follows:
#' * Good_SimulatedParameters_tag.csv = good_parm_draws
#' * Good_SimulatedTargets_tag.csv = good_sim_target
#' * Good_SimulatedDistances_tag.csv = good_target_dist
#' * MeanCovariance_tag.csv = mean_cov
#' * CurrentPriors_tag.csv = priors
#' * CurrentTargets_tag.csv = targets
#' * RunMetadata_tag.csv = metaddata
#'
#' if validate_run = TRUE, includes:
#' * SimulatedParameters_tag.csv = all_iter_parm_draws
#' * SimulatedTargets_tag.csv = all_iter_sim_target
#' * SimulatedDistances_tags.csv = all_iter_target_dist
#' @md
#'
#' @return A list with:
#'
#' * good_parm_draws - a data.table of valid parameters for the current target bounds
#' * good_sim_target - a data.table of simulated target results from good_parm_draws parameters
#' * good_target_dist - a data.table of distances based on simulated good target results
#' * mean_cov - a data.frame of the means and covariances of parameters for iterations that had more good parameters than
#' N_cov_points
#' * priors - The prior object with empirical standard deviation from first N_start generated values
#' * targets - The target object with updated bounds based on calibration
#' * metaddata - Important info regarding the function inputs and current set of results including current_iteration (the
#' last iteration that completed) and last_draw (the total number of draws simulated during execution)
#'
#' if validate_run = TRUE, includes:
#' * all_iter_parm_draws - all parameters generated by the algorithm, even ones that results in target values outside of
#' the current target bounds
#' * all_iter_sim_target - all simulated target values from the parameters in all_iter_parm_draws
#' * all_iter_target_dist - all distances based on simulated target results
#'
#' @import MASS data.table foreach parallel truncnorm lhs
#' @importFrom methods formalArgs
#' @importFrom stats cov dnorm runif sd
#' @importFrom utils flush.console read.csv read.table setTxtProgressBar txtProgressBar write.table
#' @export
imabc <- function(
target_fun,
priors = NULL,
targets = NULL,
N_start = 1,
N_centers = 1,
Center_n = 50,
N_cov_points = 0,
N_post = 100,
sample_inflate = 1.5,
max_iter = 1000,
seed = NULL,
latinHypercube = TRUE,
backend_fun = NULL,
output_directory = NULL,
output_tag = "timestamp",
previous_results_dir = NULL,
previous_results_tag = NULL,
verbose = TRUE,
validate_run = TRUE
) {
# Environment setup ---------------------------------------------------------------------------------------------------
# For CRAN checks
sample_wt <- tot_dist <- step <- iter <- n_good <- scaled_dist <- draw <- NULL
# Check for priors and targets or previous_results directory
if ((is.null(priors) || is.null(targets)) & is.null(previous_results_dir)) {
stop("Must provide both a priors object and a targets object or provide a path to previous results.")
}
# If path to previous results are provided, load the results into environment
if (!is.null(previous_results_dir)) {
# Warn about using priors/targets from previous results
if (!is.null(priors)) {
warning("priors object provided is ignored in favor of priors object derived from previous results.")
}
if (!is.null(targets)) {
warning("targets object provided is ignored in favor of targets object derived from previous results.")
}
# Read in previous results
last_run <- read_previous_results(path = previous_results_dir, tag = previous_results_tag)
priors <- last_run$new_priors
targets <- last_run$new_targets
previous_results <- last_run$previous_results
continue_runs <- TRUE
rm(last_run)
# Check format of data.tables loaded against objects loaded
check_format(
imabc_obj = priors,
tracking_dt = previous_results$good_parm_draws
)
check_format(
imabc_obj = targets,
tracking_dt = previous_results$good_sim_target,
dist_dt = previous_results$good_target_dist
)
# Previous Run info
previous_iter <- as.numeric(previous_results$prev_run_meta$value[previous_results$prev_run_meta$info == "current_iteration"])
previous_draw <- as.numeric(previous_results$prev_run_meta$value[previous_results$prev_run_meta$info == "last_draw"])
# Ensure mean_cov has proper number of iterations
if (any(previous_results$mean_cov$iter > previous_iter)) {
previous_results$mean_cov <- previous_results$mean_cov[previous_results$mean_cov$iter <= previous_iter, ]
}
# Make sure mean and covariance data is in proper format
mean_cov <- data.table(previous_results$mean_cov)
} else {
previous_results <- NULL
continue_runs <- FALSE
}
# Technical run specifics
run_timestamp <- Sys.time()
print_time_fmt <- "%a %b %d %X %Y"
write_time_fmt <- "%Y%m%d_%H%M%Z"
append_meancov_outfile <- FALSE
append_iter_outfile <- FALSE
start_iter <- ifelse(continue_runs, previous_iter, 1)
end_iter <- (start_iter - !continue_runs) + max_iter
# Data size specifics
total_draws <- ifelse(continue_runs, previous_draw, 0)
n_draw <- ifelse(continue_runs, N_centers*Center_n, N_start)
n_rows_init <- max(n_draw, N_centers*Center_n) + N_centers
n_store <- max((N_post + N_centers*(Center_n + 1)), 2*N_cov_points + 1)
ESS <- 0 # Effective Sample Size
# Create output directory if it doesn't already exist
if (!is.null(output_directory)) {
dir.create(output_directory, showWarnings = FALSE, recursive = TRUE)
}
# Randomization method (must be L'Ecuyer-CMRG for parallelization)
og_kind <- RNGkind()
RNGkind(kind = "L'Ecuyer-CMRG")
on.exit(RNGkind(kind = og_kind[1], normal.kind = og_kind[2], sample.kind = og_kind[3]), add = TRUE)
set.seed(seed)
seed_stream_start <- .Random.seed
# File names for saving
output_tag <- ifelse(output_tag == "timestamp", format(run_timestamp, write_time_fmt), output_tag)
targlist_df_outfile <- ifelse(is.null(output_tag), "CurrentTargets.csv", sprintf("CurrentTargets_%s.csv", output_tag))
priolist_df_outfile <- ifelse(is.null(output_tag), "CurrentPriors.csv", sprintf("CurrentPriors_%s.csv", output_tag))
parm_df_outfile <- ifelse(is.null(output_tag), "Good_SimulatedParameters.csv", sprintf("Good_SimulatedParameters_%s.csv", output_tag))
simtarg_df_outfile <- ifelse(is.null(output_tag), "Good_SimulatedTargets.csv", sprintf("Good_SimulatedTargets_%s.csv", output_tag))
targdist_df_outfile <- ifelse(is.null(output_tag), "Good_SimulatedDistances.csv", sprintf("Good_SimulatedDistances_%s.csv", output_tag))
meancov_outfile <- ifelse(is.null(output_tag), "MeanCovariance.csv", sprintf("MeanCovariance_%s.csv", output_tag))
runmeta_df_outfile <- ifelse(is.null(output_tag), "RunMetadata.csv", sprintf("RunMetadata_%s.csv", output_tag))
# Print ---------------------------------------------------------------------------------------------------------------
if (verbose) {
timeinfo <- sprintf("----- %s -----", format(run_timestamp, print_time_fmt))
headerinfo <- ifelse(continue_runs, "Continuing IMABC with specifications:", "New IMABC with specifications:")
iterinfo <- sprintf("Max iterations = %s", max_iter)
nstartinfo <- sprintf("Starting simulation size = %s", N_start)
ncentinfo <- sprintf("Number of centers of mass = %s", N_centers)
centninfo <- sprintf("Number of samples around each center = %s", Center_n)
npostinfo <- sprintf("Minimum valid samples to end calibration = %s", N_post)
cat(sep = "\n", timeinfo, headerinfo, iterinfo, nstartinfo, ncentinfo, centninfo, npostinfo)
}
# Distances Handling --------------------------------------------------------------------------------------------------
# All the distances that are being calculated (names of non-grouped targets + names of groups of targets)
target_distance_names <- unique(attr(targets, "target_groups"))
n_target_distances <- length(target_distance_names)
# Number of draws in range of targets from previous run. 0 if no previous run was supplied
current_good_n <- ifelse(continue_runs, sum(previous_results$good_target_dist$n_good == n_target_distances, na.rm = TRUE), 0)
# Initialize current iteration distance data.table
iter_target_dist <- init_iter_dt(n_row = n_rows_init, cols = target_distance_names, type = "targ_dists")
iter_target_dist[draw > n_draw, draw := NA]
# Initialize good distance data.table or add rows to previous good results
good_target_dist <- init_good_dt(
final_n = n_store, current_n = current_good_n, previous_dt = previous_results$good_target_dist,
cols = target_distance_names, type = "targ_dists"
)
# Targets Handling ----------------------------------------------------------------------------------------------------
# All the targets that are being simulated
sim_target_names <- attr(targets, "target_names")
# Initialize current iteration simulated target data.table
iter_sim_target <- init_iter_dt(n_row = n_rows_init, cols = sim_target_names, type = "sim_targs")
iter_sim_target[draw > n_draw, draw := NA]
# Initialize good distance data.table or add rows to previous good results
good_sim_target <- init_good_dt(
final_n = n_store, current_n = current_good_n, previous_dt = previous_results$good_sim_target,
cols = sim_target_names, type = "sim_targs"
)
# Parameter Handling --------------------------------------------------------------------------------------------------
# All parameters used for calculating targets
all_parm_names <- names(priors)
calibr_parm_names <- all_parm_names[attr(priors, "distributions") != "fixed"]
n_parms <- length(all_parm_names)
if (N_cov_points > 0 & N_cov_points < n_parms) {
warning("Setting N_cov_points < the number of parameters is not advised.\nSee `?imabc` Details for more info.")
}
if (N_cov_points == 0) { N_cov_points <- 25*n_parms }
# Initialize current iteration simulated parameters data.table
iter_parm_draws <- init_iter_dt(n_row = n_rows_init, cols = all_parm_names, type = "parm_draws")
iter_parm_draws[draw > n_draw, draw := NA]
# Initialize good distance data.table or add rows to previous good results
good_parm_draws <- init_good_dt(
final_n = n_store, current_n = current_good_n, previous_dt = previous_results$good_parm_draws,
cols = all_parm_names, type = "parm_draws"
)
# Generate first set of simulated parameters
iter_parm_draws$seed[1:n_draw] <- seed_stream(seed_stream_start, n_draw)
seed_stream_start <- as.integer(unlist(strsplit(iter_parm_draws[["seed"]][n_draw], "_")))
if (!continue_runs) {
# Generate random inputs for prior distribution calculation
if (latinHypercube) {
u_draws <- lhs::randomLHS(N_start, n_parms)
} else {
u_draws <- matrix(runif(N_start*n_parms), nrow = N_start, ncol = n_parms)
}
colnames(u_draws) <- all_parm_names
# Generate parameter space from prior distribution functions
iter_parm_draws <- parms_from_priors(
parm_df = iter_parm_draws,
name_parms = all_parm_names,
prior_list = priors,
sampling = u_draws
)
# Get empirical standard deviation of parameters
priors <- update_parm_sds(priors, dt = iter_parm_draws, parms = all_parm_names)
} # !continue_runs
# Save results
if (!is.null(output_directory)) {
save_results(list(as.data.frame(priors), priolist_df_outfile), out_dir = output_directory, append = FALSE)
}
# Print ---------------------------------------------------------------------------------------------------------------
if (verbose) {
bound_info <- lapply(attr(targets, "target_names"), FUN = function(x, targ) {
tmp <- targ[x]
paste(sprintf("%s: %s - %s", x, tmp$current_lower_bounds, tmp$current_upper_bounds), collapse = "\n")
}, targ = targets)
bound_info <- lapply(unique(attr(targets, "target_groups")), FUN = function(x, groups, bound_info) {
paste(unlist(bound_info[x == groups]), collapse = "\n")
}, groups = attr(targets, "target_groups"), bound_info = bound_info)
bound_info <- paste(sprintf("---- %s ----", unique(attr(targets, "target_groups"))), bound_info, sep = "\n", collapse = "\n")
cat("Current bound info:", bound_info, sep = "\n")
}
# Main Loop -----------------------------------------------------------------------------------------------------------
for (main_loop_iter in start_iter:end_iter) {
# valid draws generated from points simulated in the iteration
iter_valid_n <- 0
# What targets are left to update based on stopping bounds
update_targets <- unique(attr(targets, "update"))
# Continuation run first iteration flag
is_first_continue_iter <- continue_runs == TRUE & main_loop_iter == start_iter
# Print -------------------------------------------------------------------------------------------------------------
if (verbose & !is_first_continue_iter) {
header <- sprintf("\n---------- Start Iter %s ----------", main_loop_iter)
iter_info <- sprintf("Starting at: %s", Sys.time())
n_info <- sprintf("Current in range draws = %s", current_good_n)
if (length(update_targets) > 0) {
n_info <- sprintf("New in range draws = %s", current_good_n)
bounds_to_print <- names(attr(targets, "update"))
bound_info <- lapply(attr(targets,"target_names"), FUN = function(x, targ, to_print) {
if (x %in% to_print) {
tmp <- targ[x]
y <- paste(sprintf("%s: %s - %s", x, tmp$current_lower_bounds, tmp$current_upper_bounds), collapse = "\n")
} else {
y <- paste(sprintf("%s: Reached stopping bounds", x), collapse = "\n")
}
return(y)
}, targ = targets, to_print = bounds_to_print)
bound_info <- lapply(unique(attr(targets, "target_groups")), FUN = function(x, groups, bound_info) {
paste(unlist(bound_info[x == groups]), collapse = "\n")
}, groups = attr(targets, "target_groups"), bound_info = bound_info)
bound_info <- paste(sprintf("---- %s ----", unique(attr(targets, "target_groups"))), bound_info, sep = "\n", collapse = "\n")
bound_info <- ifelse(bound_info == "", "All tolerance intervals at target bounds", bound_info)
cat(header, iter_info, n_info, "Current bound info:", bound_info, sep = "\n")
} else {
cat(header, iter_info, "All tolerance intervals at target bounds", sep = "\n")
}
}
# Evaluate Draws ----------------------------------------------------------------------------------------------------
# If not a continuing runs first iteration
if (!is_first_continue_iter) {
# Evaluate simulated draws using target function(s)
parms_to_run <- iter_parm_draws[1:n_draw]
iter_sim_results <- run_handler(
parms_to_run = parms_to_run, target_fun = target_fun, all_parm_names = all_parm_names,
targets = targets, priors = priors, verbose = verbose, custom_function = backend_fun
)
# Ensure order of columns matches order of sim_target_names if names exist
if (all(names(iter_sim_results) %in% colnames(iter_sim_target))) {
iter_sim_results <- iter_sim_results[, match(sim_target_names, colnames(iter_sim_results))]
}
# Store results
total_draws <- total_draws + n_draw - sum(iter_parm_draws$step %in% (N_centers + 1))
iter_sim_target[1:n_draw, (sim_target_names) := iter_sim_results]
# Evaluate Targets, first based on whether they are in the appropriate range, second by euclidian distance
iter_target_dist[, (target_distance_names) := eval_targets(sim_targets = iter_sim_target, target_list = targets, criteria = "start")]
iter_target_dist$tot_dist <- total_distance(dt = iter_target_dist, target_names = target_distance_names, scale = FALSE)
# Count the good points (points associated with positive distances)
iter_target_dist$n_good[1:n_draw] <- rowSums(iter_target_dist[1:n_draw, (target_distance_names), with = FALSE] >= 0, na.rm = TRUE)
iter_valid_n <- sum(iter_target_dist$n_good[iter_target_dist$step <= N_centers] == n_target_distances, na.rm = TRUE)
# If user wants to explore results of all simulated parameters
if (validate_run) {
if (!append_iter_outfile) {
all_iter_parm_draws <- iter_parm_draws[!is.na(iter_parm_draws$draw), ]
all_iter_sim_target <- iter_sim_target[!is.na(iter_sim_target$draw), ]
all_iter_target_dist <- iter_target_dist[!is.na(iter_target_dist$draw), ]
} else {
all_iter_parm_draws <- rbind(all_iter_parm_draws, iter_parm_draws[!is.na(iter_parm_draws$draw), ])
all_iter_sim_target <- rbind(all_iter_sim_target, iter_sim_target[!is.na(iter_sim_target$draw), ])
all_iter_target_dist <- rbind(all_iter_target_dist, iter_target_dist[!is.na(iter_target_dist$draw), ])
}
if (!is.null(output_directory)) {
save_results(
list(iter_parm_draws[!is.na(iter_parm_draws$draw), ][, scaled_dist:= NULL], gsub("Good_", "", parm_df_outfile)),
list(iter_sim_target[!is.na(iter_sim_target$draw), ], gsub("Good_", "", simtarg_df_outfile)),
list(iter_target_dist[!is.na(iter_target_dist$draw), ], gsub("Good_", "", targdist_df_outfile)),
out_dir = output_directory, append = append_iter_outfile
)
}
# append_iter_outfile <- TRUE
}
# Update centers in good_* which have been re-simulated since last iteration
if (main_loop_iter > 1) {
# What were the centers from last time
center_draw <- iter_parm_draws$draw[iter_parm_draws$step == (N_centers + 1) & !is.na(iter_parm_draws$step)]
# Sort to ensure alignment across all data tables
center_draw <- sort(center_draw, na.last = TRUE)
setorder(iter_parm_draws, draw, na.last = TRUE)
setorder(good_parm_draws, draw, na.last = TRUE)
setorder(iter_sim_target, draw, na.last = TRUE)
setorder(good_sim_target, draw, na.last = TRUE)
setorder(iter_target_dist, draw, na.last = TRUE)
setorder(good_target_dist, draw, na.last = TRUE)
# Replace old center values with current iteration centers
good_parm_draws[draw %in% center_draw, c("iter", "draw", "step", "seed")] <-
iter_parm_draws[draw %in% center_draw, c("iter", "draw", "step", "seed"), with = FALSE]
good_sim_target[draw %in% center_draw, c("iter", "draw", "step", sim_target_names)] <-
iter_sim_target[draw %in% center_draw, c("iter", "draw", "step", sim_target_names), with = FALSE]
good_target_dist[draw %in% center_draw, c("iter", "draw", "step", target_distance_names, "n_good")] <-
iter_target_dist[draw %in% center_draw, c("iter", "draw", "step", target_distance_names, "n_good"), with = FALSE]
good_target_dist$n_good[is.na(good_target_dist$n_good)] <- 0L
# Determine which center draws are still valid (to keep) and which aren't (to remove)
remove_draws <- good_target_dist[((draw %in% center_draw) & (n_good < n_target_distances)), ]$draw
keep_draws <- good_target_dist[((draw %in% center_draw) & (n_good == n_target_distances)), ]$draw
# Update good draw distance and stopping criteria
if (length(keep_draws) > 0) {
# if there are any centers that are kept, recalculate
if (length(update_targets) == 0) {
good_target_dist[draw %in% keep_draws, ]$tot_dist <- total_distance(
dt = good_target_dist[draw %in% keep_draws, ],
target_names = target_distance_names,
scale = FALSE
)
} else { # length(update_targets) == 0
good_target_dist[draw %in% keep_draws, ]$tot_dist <- total_distance(
dt = good_target_dist[draw %in% keep_draws, ],
target_names = update_targets,
scale = FALSE
)
} # ! length(update_targets) == 0
} # length(keep_draws) > 0
# Remove bad draws
if (length(remove_draws) > 0) {
# Print information
if (verbose) {
cat("Removing centers as good draws:", paste(remove_draws, collapse = ", "), "\n")
}
# Remove bad draws
good_parm_draws[draw %in% remove_draws, names(good_parm_draws) := NA]
good_sim_target[draw %in% remove_draws, names(good_sim_target) := NA]
good_target_dist[draw %in% remove_draws, names(good_target_dist) := NA]
# place removed draws at the bottom
setorder(good_parm_draws, draw, iter, step, na.last = TRUE)
setorder(good_sim_target, draw, iter, step, na.last = TRUE)
setorder(good_target_dist, draw, iter, step, na.last = TRUE)
} # length(remove_draws) > 0
# After replacing recalculated centers in good_* data tables, remove them from the iteration-specific data tables
iter_parm_draws[step == (N_centers + 1), (all_parm_names) := NA_real_]
iter_sim_target[step == (N_centers + 1), (sim_target_names) := NA_real_]
iter_target_dist[step == (N_centers + 1), c(target_distance_names, "tot_dist") := NA_real_]
iter_target_dist[step == (N_centers + 1), n_good := 0L]
# Total draws to be kept
current_good_n <- sum(good_target_dist$n_good == n_target_distances, na.rm = TRUE)
} # if (main_loop_iter > 1)
# Stop if there are no close points (and so cannot continue)
stopifnot("No valid parameters to work from." = current_good_n + iter_valid_n > 0)
# Save good draws, targets, and distances if any were found
if (iter_valid_n > 0) {
# Determine rows that will be kept in good_*
n_keep <- ifelse(main_loop_iter != 1 & (current_good_n + iter_valid_n) > n_store, n_store - iter_valid_n, current_good_n)
if (main_loop_iter != 1 & (current_good_n + iter_valid_n) > n_store) {
keep_draws <- good_target_dist$draw[order(-good_target_dist$n_good, good_target_dist$tot_dist)][1:n_keep]
good_parm_draws[1:n_keep, ] <- good_parm_draws[draw %in% keep_draws, ]
good_sim_target [1:n_keep, ] <- good_sim_target[draw %in% keep_draws, ]
good_target_dist[1:n_keep, ] <- good_target_dist[draw %in% keep_draws, ]
}
# Determine rows that will be updated in good_*
start_add_row <- n_keep + 1
end_add_row <- ifelse((current_good_n + iter_valid_n) > n_store, n_store, current_good_n + iter_valid_n)
update_row_range <- start_add_row:end_add_row
# Determine rows that will be used to update from iter_*
valid_rows <- iter_target_dist$n_good == n_target_distances & iter_target_dist$step < (N_centers + 1)
order_valid_rows <- order(-iter_target_dist$n_good[valid_rows], iter_target_dist$tot_dist[valid_rows])
add_draws <- sort(iter_target_dist$draw[valid_rows][order_valid_rows][1:length(update_row_range)])
# Transfer good rows to results data.tables
good_parm_draws[update_row_range, ] <- iter_parm_draws[draw %in% add_draws, ]
good_sim_target[update_row_range, ] <- iter_sim_target[draw %in% add_draws, ]
good_target_dist[update_row_range, ] <- iter_target_dist[draw %in% add_draws, ]
# Make sure total distance is correct in good_target_dist
if (length(update_targets) == 0) {
good_target_dist[update_row_range, tot_dist := total_distance(
dt = good_target_dist[update_row_range, ],
target_names = target_distance_names,
scale = FALSE
)]
} else {
good_target_dist[update_row_range, tot_dist := total_distance(
dt = good_target_dist[update_row_range, ],
target_names = update_targets,
scale = FALSE
)]
}
} # iter_valid_n > 0
# Update count of final valid draws
current_good_n <- sum(good_target_dist$n_good == n_target_distances, na.rm = TRUE)
} # !(continue_runs == TRUE & iter == start_iter)
# Which valid rows should be used for simulating new draws
good_row_range <- which(good_target_dist$n_good == n_target_distances)
# Improve Target Bounds ---------------------------------------------------------------------------------------------
if ((current_good_n >= 2*N_cov_points) & length(update_targets) > 0) {
# Sort targets based on overall distance of target distances still being calibrated
good_target_dist$tot_dist <- total_distance(dt = good_target_dist, target_names = update_targets, scale = FALSE)
# Find least amount of points that move us towards the stopping bounds while letting us have enough for the more
# complex resampling method
keep_points <- c(trunc(seq(1.0, 2.0, 0.1)*N_cov_points), current_good_n)
for (test_n_iter in keep_points) {
# If we can use less than the entire sample, calculate bounds with the subset. Otherwise, bounds do not change
if (test_n_iter < current_good_n) {
# Get the test_n_iter best draws to determine an empirical set of new bounds
get_draws <- good_target_dist$draw[order(good_target_dist$tot_dist)][1:test_n_iter]
# Get new bounds using test_n_iter best draws
targets <- get_new_bounds(
to_update = update_targets,
targets_list = targets,
sims = good_sim_target[draw %in% get_draws, ]
)
} else { # test_n_iter < current_good_n
# set the new bounds back to the current bounds
targets <- update_target_bounds(targets, from = "new", to = "current")
} # ! test_n_iter < current_good_n
# Find if any targets have moved closer to the stopping bounds
improve <- any(
targets$new_lower_bounds != targets$current_lower_bounds |
targets$new_upper_bounds != targets$current_upper_bounds
)
# If we can improve any targets, test our current valid draws against new bounds OR
# If we have no more sample to use, reset our evaluation of good draws to the current bounds set
if (improve | test_n_iter == current_good_n) {
# Update distances (values < 0 indicate out of tolerance bounds)
good_target_dist[good_row_range, (update_targets) := eval_targets(
sim_targets = good_sim_target[good_row_range, ],
target_list = targets[update_targets, groups_given = TRUE],
criteria = "update"
)]
# Update number of targets that were hit for each draw
good_target_dist$n_good <- 0L
good_target_dist[good_row_range]$n_good <- rowSums(
good_target_dist[good_row_range, (target_distance_names), with = FALSE] >= 0,
na.rm = TRUE
)
# Calculate new number of valid draws
new_iter_valid_n <- sum(good_target_dist$n_good == n_target_distances, na.rm = TRUE)
# If we have enough new draws, stop looking for more by stopping the test_n_iter loop
if (new_iter_valid_n >= N_cov_points) { break }
} # any(improve) | test_n_iter == current_good_n
} # test_n_iter in keep_points
# If we consticted our target ranges with fewer draws
if (new_iter_valid_n >= N_cov_points & new_iter_valid_n < current_good_n) {
# Update current bounds to the newly calculated bounds
targets <- update_target_bounds(targets, from = "current", to = "new")
# Update current_good_n
current_good_n <- new_iter_valid_n
# Determine which distances have been constricted as far as we want
update_targets <- attr(targets, "update") <- get_update_targets(targets)
# Move the best draws to top of good_target_dist and remove the rest
keep_draws <- good_target_dist$draw[good_target_dist$n_good == n_target_distances]
remove_rows <- !good_target_dist$draw %in% keep_draws
good_parm_draws[remove_rows, c("draw", "step", "iter") := NA_integer_]
good_parm_draws[remove_rows, c(all_parm_names, "scaled_dist", "sample_wt") := NA_real_]
good_parm_draws[remove_rows, "seed" := NA_character_]
good_sim_target[remove_rows, c("draw", "step", "iter") := NA_integer_]
good_sim_target[remove_rows, (sim_target_names) := NA_real_]
good_target_dist[remove_rows, c("draw", "step", "iter", "n_good") := NA_integer_]
good_target_dist[remove_rows, c(target_distance_names, "tot_dist") := NA_real_]
setorder(good_parm_draws, draw, na.last = TRUE)
setorder(good_sim_target, draw, na.last = TRUE)
setorder(good_target_dist, draw, na.last = TRUE)
# Update the good_row_range for later use
good_row_range <- 1:current_good_n
updated_bounds <- TRUE # For next print
} else { # new_iter_valid_n >= N_cov_points & new_iter_valid_n < current_good_n
updated_bounds <- FALSE # For next print
} # ! new_iter_valid_n >= N_cov_points & new_iter_valid_n < current_good_n
# Print -----------------------------------------------------------------------------------------------------------
if (verbose & !is_first_continue_iter) {
n_info <- sprintf("New in range draws = %s", current_good_n)
bounds_to_print <- names(attr(targets, "update"))
bound_info <- lapply(attr(targets,"target_names"), FUN = function(x, targ, to_print) {
if (x %in% to_print) {
tmp <- targ[x]
y <- paste(sprintf("%s: %s - %s", x, tmp$current_lower_bounds, tmp$current_upper_bounds), collapse = "\n")
} else {
y <- paste(sprintf("%s: Reached stopping bounds", x), collapse = "\n")
}
return(y)
}, targ = targets, to_print = bounds_to_print)
bound_info <- lapply(unique(attr(targets, "target_groups")), FUN = function(x, groups, bound_info) {
paste(unlist(bound_info[x == groups]), collapse = "\n")
}, groups = attr(targets, "target_groups"), bound_info = bound_info)
bound_info <- paste(sprintf("---- %s ----", unique(attr(targets, "target_groups"))), bound_info, sep = "\n", collapse = "\n")
bound_info <- ifelse(bound_info == "", "All tolerance intervals at target bounds", bound_info)
if (updated_bounds) {
cat("Updated bounds:", n_info, bound_info, sep = "\n")
} else {
cat("Unable to update bounds:", n_info, bound_info, sep = "\n")
}
}
} # (current_good_n >= 2*N_cov_points) & length(update_targets) > 0
# Calculate Effective Sample Size -----------------------------------------------------------------------------------
# Only really need weights in these scenarios but will calculate everytime for potential debugging
# if ((current_good_n >= N_post | length(update_targets) == 0 | (main_loop_iter >= end_iter & current_good_n > 0)) &
if (main_loop_iter > 1) {
# Calculate sample weight
good_parm_draws$sample_wt <- 0
in_draws <- good_target_dist$draw[!is.na(good_target_dist$draw)]
good_parm_draws[good_parm_draws$draw %in% in_draws, sample_wt :=
get_weight(
parms = good_parm_draws[draw %in% in_draws, ],
parm_names = calibr_parm_names,
priors = priors[calibr_parm_names],
mixture_file = mean_cov,
n = N_start
)]
# Calculate effective sample size using Kish formula. Here sum(sample_wt) = 1
ESS <- 1/sum(good_parm_draws$sample_wt[good_parm_draws$draw %in% in_draws]^2)
# Print information
if (verbose & !is_first_continue_iter) {
cat(sprintf("Effective Sample Size is %s", round(ESS, 2)), "\n")
}
} # current_good_n >= N_post | length(update_targets) == 0 | (main_loop_iter >= end_iter & current_good_n > 0)
# Check Completion --------------------------------------------------------------------------------------------------
if (ESS >= N_post & length(update_targets) == 0) {
# Print information
if (verbose) {
n_info <- sprintf("Final in range draws = %s", current_good_n)
ess_info <- sprintf("Effective Sample Size was %s", round(ESS, 2))
target_info <- sprintf("(Target was %s)", N_post)
cat(n_info, ess_info, target_info, sep = "\n")
}
# Have met the appropriate sample size. Stop looking and return final results
break
}
# Simulate New Draws ------------------------------------------------------------------------------------------------
if (main_loop_iter < end_iter || main_loop_iter == 1) {
# Find the N_center in range draws with model predictions closest to targets
# Check for high-weight points, defined as $\theta_i$ with $w_i> 10/\sum_{i=1}^{N_{(t+1)}}$, meaning that the
# maximum weight is 10 times greater than expected for a simple random sample from in-range points
n_hiwt <- 0
center_draw_hiwt <- NULL
if (length(update_targets) == 0 & main_loop_iter > 1) {
max_wt <- max(good_parm_draws$sample_wt[good_parm_draws$draw %in% in_draws])
if (max_wt >= 10/current_good_n) {
draw_order <- setorder(good_parm_draws[good_row_range, ], -sample_wt, na.last = TRUE)$draw
n_hiwt <- min(N_centers, length(draw_order))
center_draw_hiwt <- draw_order[1:n_hiwt]
# Print information
if (verbose) {
cat("Adding samples around high weight points", paste(center_draw_hiwt, collapse = ", "), "\n")
}
}
} # length(update_targets) == 0
# If we didn't get enough centers from high-weight points, use total distance instead
n_best_draw <- 0
center_draw_best <- NULL
if (n_hiwt < N_centers) {
# Pull the draw numbers while ordering them based on their total distance
draw_order <- good_target_dist$draw[good_row_range][order(good_target_dist$tot_dist[good_row_range])]
n_best_draw <- min(current_good_n, (N_centers - n_hiwt))
center_draw_best <- draw_order[1:n_best_draw]
} # n_hiwt < N_centers
# Create center info objects for calculations
num_centers <- n_best_draw + n_hiwt
center_draw <- sort(c(center_draw_best, center_draw_hiwt))
center_next <- as.matrix(good_parm_draws[draw %in% center_draw, all_parm_names, with = FALSE])
# Calculate number of new draws and new steps (excluding centers)
n_draw <- num_centers*Center_n # + num_centers
new_steps <- rep(1:num_centers, each = Center_n) # rep.int((N_centers + 1), num_centers)
# Reset iteration level info
iter_parm_draws$iter <- iter_sim_target$iter <- iter_target_dist$iter <- main_loop_iter + 1
iter_parm_draws$draw <- iter_sim_target$draw <- iter_target_dist$draw <- NA_integer_
iter_parm_draws$step <- iter_sim_target$step <- iter_target_dist$step <- NA_integer_
iter_parm_draws$step[1:n_draw] <- iter_sim_target$step[1:n_draw] <- iter_target_dist$step[1:n_draw] <- as.integer(new_steps)
iter_parm_draws[, c(all_parm_names, "scaled_dist", "sample_wt") := NA_real_]
iter_parm_draws$seed <- NA_character_
iter_sim_target[, (sim_target_names) := NA_real_]
iter_target_dist[, c(target_distance_names, "tot_dist") := NA_real_]
iter_target_dist$n_good <- 0L
# New draw sampling
if (current_good_n < N_cov_points) {
# If there are too few good points use independent draws from a truncate norm
# Pull standard deviations of parameters provided with the prior information
prior_sds <- attr(priors, "sds")
sd_next <- matrix(0.5*prior_sds, ncol = n_parms, nrow = num_centers, byrow = TRUE)
colnames(sd_next) <- names(priors)
# Perform random draws
iter_parm_draws[1:n_draw, (all_parm_names) := draw_parms(
n_add = Center_n,
mu = center_next[, all_parm_names],
sigma = sd_next,
priors_list = priors
)]
# Set draw numbers
new_draws <- (total_draws + 1):(total_draws + n_draw)
iter_parm_draws$draw[1:n_draw] <- iter_target_dist$draw[1:n_draw] <- iter_sim_target$draw[1:n_draw] <- new_draws
# Set draw seed
iter_parm_draws$seed[1:n_draw] <- seed_stream(seed_stream_start, n_draw)
seed_stream_start <- as.integer(unlist(strsplit(iter_parm_draws[["seed"]][n_draw], "_")))
# Recalculate center
iter_parm_draws[iter_parm_draws$step == (N_centers + 1), (all_parm_names)] <- as.data.table(center_next)
iter_parm_draws$draw[iter_parm_draws$step == (N_centers + 1) & !is.na(iter_parm_draws$step)] <- center_draw
iter_sim_target$draw[iter_parm_draws$step == (N_centers + 1) & !is.na(iter_parm_draws$step)] <- center_draw
iter_target_dist$draw[iter_parm_draws$step == (N_centers + 1) & !is.na(iter_parm_draws$step)] <- center_draw
# Store results (for mean_cov)
x <- iter_parm_draws[1:n_draw, c("iter", "step"), with = FALSE]
x$in_range <- get_in_range(
compare_list = priors[calibr_parm_names],
check_dt = iter_parm_draws[1:n_draw, calibr_parm_names, with = FALSE],
out = "logical"
)
setkey(x, iter, step)
B_in <- x[step <= N_centers, list(B.in = sum(in_range, na.rm = TRUE)), by = list(iter, step)]
new_mean_cov <- get_mean_cov(
iter = main_loop_iter + 1,
mu = center_next[, calibr_parm_names],
sd = sd_next[, calibr_parm_names],
center = center_draw,
B_in = B_in,
parm_names = calibr_parm_names
)
# If mean_cov has already been created, add new values in. Otherwise create mean_cov with values
if (exists("mean_cov")) {
new_rows <- (nrow(mean_cov) + 1):(nrow(mean_cov) + nrow(new_mean_cov))
mean_cov <- rbind(mean_cov, new_mean_cov)
} else {
mean_cov <- new_mean_cov
new_rows <- 1:nrow(mean_cov)
}
} else { # current_good_n < N_cov_points
# sample MVN points around centers if there are enough points to estimate the cov matrix
sample_mean <- as.data.frame(center_next)
# For tracking which rows need to be written out
new_rows <- c()
for (center_sim_iter in 1:num_centers) {
# center_sim_iter means for calibrated parameters
sample_mean_i1 <- unlist(sample_mean[center_sim_iter, calibr_parm_names])
if (is.null(names(sample_mean_i1))) {
names(sample_mean_i1) <- names(priors[calibr_parm_names])
}
# Rows to be filled for a given center
draw_rows <- ((center_sim_iter - 1)*Center_n + 1):(center_sim_iter*Center_n)
# Calculate parameter distances from given center
prior_sds <- attr(priors[calibr_parm_names], "sds")
good_parm_draws$scaled_dist <- Inf
good_parm_draws$scaled_dist[1:current_good_n] <- total_distance(
dt = good_parm_draws[1:current_good_n, ],
target_names = calibr_parm_names,
mu = sample_mean_i1,
sd = prior_sds,
scale = TRUE
)
# Find nearest points based on scaled distance
calc_cov_points <- 1:max(N_cov_points, trunc(current_good_n/N_centers))
var_data <- good_parm_draws[order(good_parm_draws$scaled_dist)[calc_cov_points], calibr_parm_names, with = FALSE]
# Find center specific var-cov matrices using N_cov_points closest points
sample_cov <- parm_covariance(var_data)
# Deal with parameters that don't have any variance
if (any(diag(sample_cov) <= 0.05*prior_sds)) {
# This occurs when adding a new parameter: it is set to default for all prior draws
is_zero <- (diag(sample_cov) == 0)
sd_next <- 0.05*prior_sds
sd_next[!is_zero] <- 0
diag(sample_cov) <- diag(sample_cov) + (sd_next^2)
}
# Simulate Center_n random draws of calibrated parameters
if (abs(sum(sample_cov) - sum(diag(sample_cov))) < 1e-10) {
# Perform independent draws - uses truncated normal
x <- draw_parms(
n_add = Center_n,
mu = as.matrix(t(sample_mean_i1)),
sigma = as.matrix(t(diag(sample_cov))),
priors_list = priors[calibr_parm_names]
)
} else { # abs(sum(sample_cov) - sum(diag(sample_cov))) < 1e-10
# Perform multivariate draws - uses multivariate normal
x <- get_B_draws(
B = Center_n,
inflate = sample_inflate,
center = sample_mean_i1,
cov_matrix = sample_cov,
priors = priors[calibr_parm_names]
)
} # ! abs(sum(sample_cov) - sum(diag(sample_cov))) < 1e-10
# Store Mean Covariance results
sample_mean_i1 <- setnames(as.data.frame(t(sample_mean_i1)), calibr_parm_names)
sample_cov <- setnames(as.data.frame(sample_cov), calibr_parm_names)
mean_cov_center_i1 <- data.table(rbind(sample_mean_i1, sample_cov))
mean_cov_center_i1$iter <- main_loop_iter + 1
mean_cov_center_i1$step <- center_sim_iter
mean_cov_center_i1$center <- center_draw[center_sim_iter]
mean_cov_center_i1$parm <- 0:(length(calibr_parm_names))
B_in <- sum(get_in_range(
compare_list = priors[calibr_parm_names],
check_dt = x,
out = "logical"
))
mean_cov_center_i1$B.in <- B_in
setcolorder(mean_cov_center_i1, c("iter", "step", "center", "B.in" , "parm", calibr_parm_names))
if (exists("mean_cov")) { # After first iteration or on continuing run
new_rows <- c(new_rows, (nrow(mean_cov) + 1):(nrow(mean_cov) + nrow(mean_cov_center_i1)))
mean_cov <- rbind(mean_cov, mean_cov_center_i1)
} else {
new_rows <- 1:nrow(mean_cov_center_i1)
mean_cov <- mean_cov_center_i1
}
# If no rows were generated, move to next center without saving NULL to the iter_parm_draws
if (B_in == 0) {
warning(sprintf("No valid parameters were simulated for center = %s during iteration %s", center_sim_iter, main_loop_iter))
# Move to the next center
next
}
# Add simulated values of calibrated data to iter_parm_draws
iter_parm_draws[draw_rows, (calibr_parm_names) := x]
# Add fixed values of non calibrated data to iter_parm_draws
non_calibr <- setdiff(all_parm_names, calibr_parm_names)
if (length(non_calibr) > 0) {
iter_parm_draws[draw_rows, ] <- parms_from_priors(
parm_df = iter_parm_draws[draw_rows, ],
name_parms = non_calibr,
prior_list = priors, sampling = NULL
)
}
} # center_sim_iter in 1:num_centers
# See how many new valid parameters have been generated
keep_rows <- get_in_range(
compare_list = priors,
check_dt = iter_parm_draws,
out = "logical"
)
if (sum(keep_rows) == 0) {
warning(sprintf("Unable to generate new parameters for any center at iteration %s. Ending run.", main_loop_iter))
break
}
# Update iteration parm draws with which newly generated rows should be kept for next iteration
# Reset step column to NA for non-valid rows
iter_parm_draws$step[!keep_rows] <- iter_target_dist$step[!keep_rows] <- iter_sim_target$step[!keep_rows] <- NA
# Set draw numbers
n_draw <- sum(keep_rows)
new_draws <- (total_draws + 1):(total_draws + n_draw)
iter_parm_draws$draw[keep_rows] <- iter_target_dist$draw[keep_rows] <- iter_sim_target$draw[keep_rows] <- new_draws
# reorder data.tables so NAs are on bottom
setorder(iter_parm_draws, draw, iter, step, na.last = TRUE)
setorder(iter_sim_target, draw, iter, step, na.last = TRUE)
setorder(iter_target_dist, draw, iter, step, na.last = TRUE)
# Set draw seed
iter_parm_draws$seed[1:n_draw] <- seed_stream(seed_stream_start, n_draw)
seed_stream_start <- as.integer(unlist(strsplit(iter_parm_draws[["seed"]][n_draw], "_")))
# Recalculate centers
center_rows <- (n_draw + 1):(n_draw + N_centers)
iter_parm_draws[center_rows, (all_parm_names)] <- as.data.table(center_next)
iter_parm_draws[center_rows, ]$draw <- center_draw
iter_sim_target[center_rows, ]$draw <- center_draw
iter_target_dist[center_rows, ]$draw <- center_draw
iter_parm_draws[center_rows, ]$step <- N_centers + 1
iter_sim_target[center_rows, ]$step <- N_centers + 1
iter_target_dist[center_rows, ]$step <- N_centers + 1
iter_parm_draws[center_rows, ]$seed <- seed_stream(seed_stream_start, N_centers)
seed_stream_start <- as.integer(unlist(strsplit(iter_parm_draws[["seed"]][center_rows[N_centers]], "_")))
n_draw <- n_draw + N_centers
# put good_* data tables in draw order
setorder(good_parm_draws, draw, na.last = TRUE)
setorder(good_sim_target, draw, na.last = TRUE)
setorder(good_target_dist, draw, na.last = TRUE)
} # ! current_good_n < N_cov_points
} # if (main_loop_iter < end_iter)
# # Save iteration results
# if (!is.null(output_directory)) {
# if (main_loop_iter < end_iter || main_loop_iter == 1) {
# if (append_meancov_outfile) {
# save_rows <- new_rows
# } else {
# save_rows <- 1:nrow(mean_cov)
# }
# save_results(
# mean_cov[save_rows, c("iter", "step", "center", "B.in", "parm", calibr_parm_names), with = FALSE], meancov_outfile,
# out_dir = output_directory, append = append_meancov_outfile
# )
# append_meancov_outfile <- TRUE
# }
# # Pull complete options except for
# metaddata <- data.frame(
# info = c("current_iteration", "last_draw"),
# value = c(main_loop_iter, total_draws)
# )
# # Pull function inputs (except for target_fun, priors, and targets)
# fn_ops <- mget(names(formals()), sys.frame(sys.nframe()))
# PickOut <- unlist(lapply(fn_ops, function(x) !(is.function(x) | is.list(x))))
# fn_ops <- unlist(fn_ops[PickOut])
# fn_ops <- data.frame(info = names(fn_ops), value = fn_ops, row.names = NULL)
# metaddata <- rbind(metaddata, fn_ops)
# if (validate_run) {
# save_good_parm_draws <- copy(good_parm_draws)[, scaled_dist := NULL, ]
# save_good_parm_draws <- save_good_parm_draws[!is.na(draw), ]
# save_good_parm_draws$actual_iter <- main_loop_iter
# save_good_sim_target <- copy(good_sim_target)
# save_good_sim_target <- save_good_sim_target[!is.na(draw), ]
# save_good_sim_target$actual_iter <- main_loop_iter
# save_good_target_dist <- copy(good_target_dist)
# save_good_target_dist <- save_good_target_dist[!is.na(draw), ]
# save_good_target_dist$actual_iter <- main_loop_iter
# } else {
# save_good_parm_draws <- copy(good_parm_draws)[, scaled_dist:= NULL, ]
# save_good_sim_target <- good_sim_target
# save_good_target_dist <- good_target_dist
# }
# # Write out files
# save_results(
# list(metaddata, runmeta_df_outfile),
# list(as.data.frame(targets), targlist_df_outfile),
# out_dir = output_directory, append = FALSE
# )
# save_results(
# list(save_good_parm_draws, parm_df_outfile),
# list(save_good_sim_target, simtarg_df_outfile),
# list(save_good_target_dist, targdist_df_outfile),
# out_dir = output_directory, append = append_iter_outfile # FALSE
# )
# if (validate_run) {
# append_iter_outfile <- TRUE
# }
# } # Save iteration results
if (verbose & !is_first_continue_iter) { cat(sprintf("----------- End Iter %s -----------\n", main_loop_iter)) }
} # main_loop_iter in start_iter:end_iter
# Prep final results
# Valid Parameters
good_parm_draws <- good_parm_draws[!is.na(draw), ]
setorder(good_parm_draws, draw, na.last = TRUE)
# Valid Targets
good_sim_target <- good_sim_target[!is.na(draw), ]
setorder(good_sim_target, draw, na.last = TRUE)
# Valid Distances
good_target_dist <- good_target_dist[!is.na(draw), ]
good_target_dist[, (target_distance_names) := lapply(.SD , "abs"), .SDcols = target_distance_names]
setorder(good_target_dist, draw, na.last = TRUE)
# Save current iteration and last draw info for restart
metaddata <- data.frame(
info = c("current_iteration", "last_draw"),
value = c(main_loop_iter, total_draws)
)
# Pull function inputs (except for target_fun, priors, and targets)
fn_ops <- mget(names(formals()), sys.frame(sys.nframe()))
PickOut <- unlist(lapply(fn_ops, function(x) !(is.function(x) | is.list(x))))
fn_ops <- unlist(fn_ops[PickOut])
fn_ops <- data.frame(info = names(fn_ops), value = fn_ops, row.names = NULL)
metaddata <- rbind(metaddata, fn_ops)
# Save
if (!is.null(output_directory)) {
save_results(
list(metaddata, runmeta_df_outfile),
list(as.data.frame(targets), targlist_df_outfile),
out_dir = output_directory, append = FALSE
)
if (validate_run) {
save_good_parm_draws <- copy(good_parm_draws)[, scaled_dist := NULL, ]
save_good_parm_draws <- save_good_parm_draws[!is.na(draw), ]
save_good_parm_draws$actual_iter <- main_loop_iter
save_good_sim_target <- copy(good_sim_target)
save_good_sim_target <- save_good_sim_target[!is.na(draw), ]
save_good_sim_target$actual_iter <- main_loop_iter
save_good_target_dist <- copy(good_target_dist)
save_good_target_dist <- save_good_target_dist[!is.na(draw), ]
save_good_target_dist$actual_iter <- main_loop_iter
} else {
save_good_parm_draws <- copy(good_parm_draws)[, scaled_dist:= NULL, ]
save_good_sim_target <- good_sim_target
save_good_target_dist <- good_target_dist
}
save_results(
list(save_good_parm_draws, parm_df_outfile),
list(save_good_sim_target, simtarg_df_outfile),
list(save_good_target_dist, targdist_df_outfile),
out_dir = output_directory, append = append_iter_outfile
)
}
# Print information
done_timestamp <- Sys.time()
calc_time <- done_timestamp - run_timestamp
if (verbose) {
header <- "\n---------- Run Completed ----------"
time_info <- sprintf("Finishing ABC at %s", format(done_timestamp, print_time_fmt))
duration_info <- sprintf("Calculation took %s %s", round(calc_time, 2), attr(calc_time, "units"))
seed_info <- sprintf("seed value is: %s", paste0(.Random.seed, collapse = ", "))
n_info <- sprintf("Number of in range draws was %s", current_good_n)
bound_info <- lapply(attr(targets, "target_names"), FUN = function(x, targ) {
tmp <- targ[x]
paste(sprintf("%s: %s - %s", x, tmp$current_lower_bounds, tmp$current_upper_bounds), collapse = "\n")
}, targ = targets)
bound_info <- lapply(unique(attr(targets, "target_groups")), FUN = function(x, groups, bound_info) {
paste(unlist(bound_info[x == groups]), collapse = "\n")
}, groups = attr(targets, "target_groups"), bound_info = bound_info)
bound_info <- paste(sprintf("---- %s ----", unique(attr(targets, "target_groups"))), bound_info, sep = "\n", collapse = "\n")
cat(header, time_info, duration_info, seed_info, n_info, "Final bounds info:", bound_info, sep = "\n")
}
# If user wants all intermediate results saved
if (validate_run) {
out_list <- list(
all_iter_parm_draws = all_iter_parm_draws[, scaled_dist:= NULL],
all_iter_sim_target = all_iter_sim_target,
all_iter_target_dist = all_iter_target_dist,
good_parm_draws = good_parm_draws[, scaled_dist:= NULL],
good_sim_target = good_sim_target,
good_target_dist = good_target_dist,
mean_cov = mean_cov,
priors = priors,
targets = targets,
metaddata = metaddata
)
} else {
out_list <- list(
good_parm_draws = good_parm_draws[, scaled_dist:= NULL],
good_sim_target = good_sim_target,
good_target_dist = good_target_dist,
mean_cov = mean_cov,
priors = priors,
targets = targets,
metaddata = metaddata
)
}
return(out_list)
}
#########################################################################################################################
#########################################################################################################################
#########################################################################################################################
#########################################################################################################################
#########################################################################################################################
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.