R/imabc.R

Defines functions imabc

Documented in imabc

#' @title Incremental Mixture Approximate Bayesian Computation (IMABC)
#'
#' @description Calibrates a model using the IMABC algorithm.
#'
#' @param target_fun A function that generate target values given parameters  (i.e., `the model'). The use of
#' define_target_function is stronlgy advised to ensure that the function takes in the correct values and
#' correctly returns results.
#' @param priors A priors object created using define_priors. This contains information regarding the parameters that
#' are being calibrated. Is ignored if starting from previous results.
#' @param targets A targets object created using define_targets. This contains information regarding the target values
#' which will be used to evaluate simulated parameters. Is ignored if starting from previous results.
#' @param N_start numeric(1). The number of draws to simulate for the first iteration.
#' @param N_centers numeric(1). The number of centers to use for exploring the parameter space.
#' @param Center_n numeric(1). The number of points to add around each center
#' @param N_cov_points numeric(1). The minimum number of points used to estimate the covariance matrix of valid parameters
#' nearest each center point. The covariance matrix is used when simulating new parameter draws around the center. If 0
#' (default), uses 25*number of parameters.
#' @param N_post numeric(1). The weighted sample size that must be achieved using valid parameter values in order to stop
#' algorithm.
#' @param sample_inflate numeric(1). When generating new results for a given center, how many additional samples should be
#' simulated to ensure enough valid (within range) parameters draws are simulated for the center.
#' @param max_iter numeric(1). The maximum number of iterations to attempt.
#' @param seed numeric(1). The seed to set for reproducibility.
#' @param latinHypercube logical(1). Should algorithm use a Latin Hypercube to generate first set of parameters.
#' @param backend_fun function. For advanced users only. Lets to user evaluate the target function(s) using their own
#' backend, i.e., simulate targets with an alternative parallel method. Only necessary if the backend method is
#' not compatible with foreach. See details for requirements.
#' @param output_directory character(1). Path to save results to. If NULL (default), no results are saved. If a path is
#' provided results are saved/updated every iteration. See details for more information.
#' @param output_tag character(1). Tag to add to result files names. "timestamp" (default) is a special code that adds
#' the time and date the code was executed.
#' @param previous_results_dir Optional character(1). Path to results stored during a previous run. If the user wishes to
#' restart a run that didn't complete the calibration, they can continue by using the outputs stored during the previous
#' run.
#' @param previous_results_tag Optional character(1). The tag that was added to the previous run output files.
#' @param verbose logical(1). Prints out progress messages and additional information as the model works.
#' @param validate_run logical(1). If this is TRUE and an output_directory is specified, the function will save all
#' parameters generated by the model - even ones that were deemed invalid based on their simulated targets.
#'
#' @details
#' The user specifies the calibrated parameters, their prior distributions, calibration targets with initial and
#' final acceptance intervals, and the function (i.e., the  model) used to generate targets given calibrated parameters
#' The algorithm begins by drawing a sample of vectors from the parameter space based on prior distributions.
#' This initial sample can be drawn using a Latin hypercube. The algorithm identifies and retains parameter
#' vectors that result in generated targets that are within the current acceptance intervals. The algorithm iteratively
#' updates this sample and narrows the acceptance intervals until either 1) the algorithm reaches the final
#' acceptance intervals around each target and identifies the requested sample of parameter vectors that generate
#' targets within these acceptance intervals, or the algorithm completes the maximum number of iterations.
#' The algorithm can be restarted to continue iterating.
#'
#' A technical description of the imabc algorithm is provided in
#' Rutter CM, Ozik J, DeYoreo M, Collier N. Microsimulation model calibration using incremental mixture approximate
#' Bayesian computation. Ann. Appl. Stat. 13 (2019), no. 4, 2189-2212. doi:10.1214/19-AOAS1279.
#' https://projecteuclid.org/euclid.aoas/1574910041.
#'
#' The imabc package implements a small modification to the approach described in the 2019 AOAS paper. In the imabc package,
#' the user specifies initial and final acceptance intervals directly. This approach is more flexible than the
#' approach described in the paper and more easily incorporates asymmetric acceptance intervals.
#'
#' ## N_cov_points relation to the number of parameters:
#' When the algorithm has enough quality draws, it estimates the covariance between parameters and uses these relations
#' in order to improve future simulations of parameters. However, this can only work if the covariance matrix is not
#' singular. When a covariance matrix is singular, imabc will replace it with an independent covariance matrix (a diagonal
#' matrix of the variances of the parameters) to avoid any calculation errors. Setting N_cov_points to be less than the
#' number of parameters will lead to singularness in a covariance matrix. The algorithm can still run but will be not as
#' efficient or may not be able to calibrate completely.
#'
#' ## Custom Backend Function:
#' The primary run handler takes each row from the simulated draws and provides the appropriate information to the
#' target_fun function as inputs. This includes pulling the parameter values as a named vector, pulling a unique seed
#' generated for each set of parameters, as well as passing the current priors and targets objects. This is done using
#' the foreach function from the foreach package. This allows the user to register their own preferred parallel backend
#' before running the imabc function so long as it is compatible with foreach. If the user does not provide a parallel
#' backend, foreach will run the analysis in sequence by default and provide a warning indicating such the first time
#' the imabc function is run within a session.
#'
#' However, since not all parallel backends are compatible with this method, we have provided a way for the user to add
#' their own run handling method. To utilize this feature, the user must create a function that meets a couple requirements
#' in order to work properly.
#'
#' The first requirement is that the backend function have inputs in the following order:
#' the data.table of all parameters to be evaluated, the names of all the parameters being calibrated, the target function
#' to be used for evaluating parameters, a list that includes the priors object and the targets object. The user can name
#' these inputs whatever they prefer but the correct order and number of inputs will be expected (i.e. the user must create
#' a function with four inputs, the first will be the parameter data.table, and so on.). The user can utilize any piece
#' of info passed to these parameters as well. This includes unique seed values passed as a column of the parameter data.table
#' (called "seed"), and the current targets and priors objects passed in the fourth input. The priors and targets objects
#' are named priors and targets respectively in the fourth input list.
#'
#' The last requirement is that the returned object be a data.table of simulated target values. Each row represents a set
#' of results from the target_fun for a given set of parameters and each column represents a target value based on the
#' targets object. If the final output of the custom backend returns a data.table with column names identical to the
#' target names, the order of the columns will be verified by imabc. If the final output of the backend does not include
#' column names that match the target names, the user must ensure that they are in the same order as the targets object.
#' If they are not in the appropriate order, information may be attached to the wrong target and lead to errors.
#'
#' Do not use the custom backend unless you are confident you understand what is expected of the run handler. To get a
#' better understanding of what is being done run View(imabc:::run_handler) in the console to see how the backend_fun is
#' being used.
#'
#' ## Output Files:
#' If an output directory is specified files are saved for each of the objects returned by the function. They are named
#'   as follows:
#' * Good_SimulatedParameters_tag.csv = good_parm_draws
#' * Good_SimulatedTargets_tag.csv    = good_sim_target
#' * Good_SimulatedDistances_tag.csv  = good_target_dist
#' * MeanCovariance_tag.csv           = mean_cov
#' * CurrentPriors_tag.csv            = priors
#' * CurrentTargets_tag.csv           = targets
#' * RunMetadata_tag.csv              = metaddata
#'
#' if validate_run = TRUE, includes:
#' * SimulatedParameters_tag.csv      = all_iter_parm_draws
#' * SimulatedTargets_tag.csv         = all_iter_sim_target
#' * SimulatedDistances_tags.csv      = all_iter_target_dist
#' @md
#'
#' @return A list with:
#'
#' * good_parm_draws - a data.table of valid parameters for the current target bounds
#' * good_sim_target - a data.table of simulated target results from good_parm_draws parameters
#' * good_target_dist - a data.table of distances based on simulated good target results
#' * mean_cov - a data.frame of the means and covariances of parameters for iterations that had more good parameters than
#'   N_cov_points
#' * priors - The prior object with empirical standard deviation from first N_start generated values
#' * targets - The target object with updated bounds based on calibration
#' * metaddata - Important info regarding the function inputs and current set of results including current_iteration (the
#'   last iteration that completed) and last_draw (the total number of draws simulated during execution)
#'
#' if validate_run = TRUE, includes:
#' * all_iter_parm_draws - all parameters generated by the algorithm, even ones that results in target values outside of
#'   the current target bounds
#' * all_iter_sim_target - all simulated target values from the parameters in all_iter_parm_draws
#' * all_iter_target_dist - all distances based on simulated target results
#'
#' @import MASS data.table foreach parallel truncnorm lhs
#' @importFrom methods formalArgs
#' @importFrom stats cov dnorm runif sd
#' @importFrom utils flush.console read.csv read.table setTxtProgressBar txtProgressBar write.table
#' @export
imabc <- function(
  target_fun,
  priors = NULL,
  targets = NULL,
  N_start = 1,
  N_centers = 1,
  Center_n = 50,
  N_cov_points = 0,
  N_post = 100,
  sample_inflate = 1.5,
  max_iter = 1000,
  seed = NULL,
  latinHypercube = TRUE,
  backend_fun = NULL,
  output_directory = NULL,
  output_tag = "timestamp",
  previous_results_dir = NULL,
  previous_results_tag = NULL,
  verbose = TRUE,
  validate_run = TRUE
) {
  # Environment setup ---------------------------------------------------------------------------------------------------
  # For CRAN checks
  sample_wt <- tot_dist <- step <- iter <- n_good <- scaled_dist <- draw <- NULL

  # Check for priors and targets or previous_results directory
  if ((is.null(priors) || is.null(targets)) & is.null(previous_results_dir)) {
    stop("Must provide both a priors object and a targets object or provide a path to previous results.")
  }
  # If path to previous results are provided, load the results into environment
  if (!is.null(previous_results_dir)) {
    # Warn about using priors/targets from previous results
    if (!is.null(priors)) {
      warning("priors object provided is ignored in favor of priors object derived from previous results.")
    }
    if (!is.null(targets)) {
      warning("targets object provided is ignored in favor of targets object derived from previous results.")
    }
    # Read in previous results
    last_run <- read_previous_results(path = previous_results_dir, tag = previous_results_tag)
    priors <- last_run$new_priors
    targets <- last_run$new_targets
    previous_results <- last_run$previous_results
    continue_runs <- TRUE
    rm(last_run)

    # Check format of data.tables loaded against objects loaded
    check_format(
      imabc_obj = priors,
      tracking_dt = previous_results$good_parm_draws
    )
    check_format(
      imabc_obj = targets,
      tracking_dt = previous_results$good_sim_target,
      dist_dt = previous_results$good_target_dist
    )

    # Previous Run info
    previous_iter <- as.numeric(previous_results$prev_run_meta$value[previous_results$prev_run_meta$info == "current_iteration"])
    previous_draw <- as.numeric(previous_results$prev_run_meta$value[previous_results$prev_run_meta$info == "last_draw"])

    # Ensure mean_cov has proper number of iterations
    if (any(previous_results$mean_cov$iter > previous_iter)) {
      previous_results$mean_cov <- previous_results$mean_cov[previous_results$mean_cov$iter <= previous_iter, ]
    }
    # Make sure mean and covariance data is in proper format
    mean_cov <- data.table(previous_results$mean_cov)

  } else {
    previous_results <- NULL
    continue_runs <- FALSE
  }

  # Technical run specifics
  run_timestamp <- Sys.time()
  print_time_fmt <- "%a %b %d %X %Y"
  write_time_fmt <- "%Y%m%d_%H%M%Z"
  append_meancov_outfile <- FALSE
  append_iter_outfile <- FALSE
  start_iter <- ifelse(continue_runs, previous_iter, 1)
  end_iter <- (start_iter - !continue_runs) + max_iter

  # Data size specifics
  total_draws <- ifelse(continue_runs, previous_draw, 0)
  n_draw <- ifelse(continue_runs, N_centers*Center_n, N_start)
  n_rows_init <- max(n_draw, N_centers*Center_n) + N_centers
  n_store <- max((N_post + N_centers*(Center_n + 1)), 2*N_cov_points + 1)
  ESS <- 0 # Effective Sample Size

  # Create output directory if it doesn't already exist
  if (!is.null(output_directory)) {
    dir.create(output_directory, showWarnings = FALSE, recursive = TRUE)
  }

  # Randomization method (must be L'Ecuyer-CMRG for parallelization)
  og_kind <- RNGkind()
  RNGkind(kind = "L'Ecuyer-CMRG")
  on.exit(RNGkind(kind = og_kind[1], normal.kind = og_kind[2], sample.kind = og_kind[3]), add = TRUE)
  set.seed(seed)
  seed_stream_start <- .Random.seed

  # File names for saving
  output_tag <- ifelse(output_tag == "timestamp", format(run_timestamp, write_time_fmt), output_tag)
  targlist_df_outfile <- ifelse(is.null(output_tag), "CurrentTargets.csv", sprintf("CurrentTargets_%s.csv", output_tag))
  priolist_df_outfile <- ifelse(is.null(output_tag), "CurrentPriors.csv", sprintf("CurrentPriors_%s.csv", output_tag))
  parm_df_outfile <- ifelse(is.null(output_tag), "Good_SimulatedParameters.csv", sprintf("Good_SimulatedParameters_%s.csv", output_tag))
  simtarg_df_outfile <- ifelse(is.null(output_tag), "Good_SimulatedTargets.csv", sprintf("Good_SimulatedTargets_%s.csv", output_tag))
  targdist_df_outfile <- ifelse(is.null(output_tag), "Good_SimulatedDistances.csv", sprintf("Good_SimulatedDistances_%s.csv", output_tag))
  meancov_outfile <- ifelse(is.null(output_tag), "MeanCovariance.csv", sprintf("MeanCovariance_%s.csv", output_tag))
  runmeta_df_outfile <- ifelse(is.null(output_tag), "RunMetadata.csv", sprintf("RunMetadata_%s.csv", output_tag))

  # Print ---------------------------------------------------------------------------------------------------------------
  if (verbose) {
    timeinfo <- sprintf("----- %s -----", format(run_timestamp, print_time_fmt))
    headerinfo <- ifelse(continue_runs, "Continuing IMABC with specifications:", "New IMABC with specifications:")
    iterinfo <- sprintf("Max iterations                           = %s", max_iter)
    nstartinfo <- sprintf("Starting simulation size                 = %s", N_start)
    ncentinfo <- sprintf("Number of centers of mass                = %s", N_centers)
    centninfo <- sprintf("Number of samples around each center     = %s", Center_n)
    npostinfo <- sprintf("Minimum valid samples to end calibration = %s", N_post)
    cat(sep = "\n", timeinfo, headerinfo, iterinfo, nstartinfo, ncentinfo, centninfo, npostinfo)
  }

  # Distances Handling --------------------------------------------------------------------------------------------------
  # All the distances that are being calculated (names of non-grouped targets + names of groups of targets)
  target_distance_names <- unique(attr(targets, "target_groups"))
  n_target_distances <- length(target_distance_names)

  # Number of draws in range of targets from previous run. 0 if no previous run was supplied
  current_good_n <- ifelse(continue_runs, sum(previous_results$good_target_dist$n_good == n_target_distances, na.rm = TRUE), 0)

  # Initialize current iteration distance data.table
  iter_target_dist <- init_iter_dt(n_row = n_rows_init, cols = target_distance_names, type = "targ_dists")
  iter_target_dist[draw > n_draw, draw := NA]

  # Initialize good distance data.table or add rows to previous good results
  good_target_dist <- init_good_dt(
    final_n = n_store, current_n = current_good_n, previous_dt = previous_results$good_target_dist,
    cols = target_distance_names, type = "targ_dists"
  )

  # Targets Handling ----------------------------------------------------------------------------------------------------
  # All the targets that are being simulated
  sim_target_names <- attr(targets, "target_names")

  # Initialize current iteration simulated target data.table
  iter_sim_target <- init_iter_dt(n_row = n_rows_init, cols = sim_target_names, type = "sim_targs")
  iter_sim_target[draw > n_draw, draw := NA]

  # Initialize good distance data.table or add rows to previous good results
  good_sim_target <- init_good_dt(
    final_n = n_store, current_n = current_good_n, previous_dt = previous_results$good_sim_target,
    cols = sim_target_names, type = "sim_targs"
  )

  # Parameter Handling --------------------------------------------------------------------------------------------------
  # All parameters used for calculating targets
  all_parm_names <- names(priors)
  calibr_parm_names <- all_parm_names[attr(priors, "distributions") != "fixed"]
  n_parms <- length(all_parm_names)
  if (N_cov_points > 0 & N_cov_points < n_parms) {
    warning("Setting N_cov_points < the number of parameters is not advised.\nSee `?imabc` Details for more info.")
  }
  if (N_cov_points == 0) { N_cov_points <- 25*n_parms }

  # Initialize current iteration simulated parameters data.table
  iter_parm_draws <- init_iter_dt(n_row = n_rows_init, cols = all_parm_names, type = "parm_draws")
  iter_parm_draws[draw > n_draw, draw := NA]

  # Initialize good distance data.table or add rows to previous good results
  good_parm_draws <- init_good_dt(
    final_n = n_store, current_n = current_good_n, previous_dt = previous_results$good_parm_draws,
    cols = all_parm_names, type = "parm_draws"
  )

  # Generate first set of simulated parameters
  iter_parm_draws$seed[1:n_draw] <- seed_stream(seed_stream_start, n_draw)
  seed_stream_start <- as.integer(unlist(strsplit(iter_parm_draws[["seed"]][n_draw], "_")))
  if (!continue_runs) {
    # Generate random inputs for prior distribution calculation
    if (latinHypercube) {
      u_draws <- lhs::randomLHS(N_start, n_parms)
    } else {
      u_draws <- matrix(runif(N_start*n_parms), nrow = N_start, ncol = n_parms)
    }
    colnames(u_draws) <- all_parm_names

    # Generate parameter space from prior distribution functions
    iter_parm_draws <- parms_from_priors(
      parm_df = iter_parm_draws,
      name_parms = all_parm_names,
      prior_list = priors,
      sampling = u_draws
    )
    # Get empirical standard deviation of parameters
    priors <- update_parm_sds(priors, dt = iter_parm_draws, parms = all_parm_names)
  } # !continue_runs
  # Save results
  if (!is.null(output_directory)) {
    save_results(list(as.data.frame(priors), priolist_df_outfile), out_dir = output_directory, append = FALSE)
  }

  # Print ---------------------------------------------------------------------------------------------------------------
  if (verbose) {
    bound_info <- lapply(attr(targets, "target_names"), FUN = function(x, targ) {
      tmp <- targ[x]
      paste(sprintf("%s: %s - %s", x, tmp$current_lower_bounds, tmp$current_upper_bounds), collapse = "\n")
    }, targ = targets)
    bound_info <- lapply(unique(attr(targets, "target_groups")), FUN = function(x, groups, bound_info) {
      paste(unlist(bound_info[x == groups]), collapse = "\n")
    }, groups = attr(targets, "target_groups"), bound_info = bound_info)
    bound_info <- paste(sprintf("---- %s ----", unique(attr(targets, "target_groups"))), bound_info, sep = "\n", collapse = "\n")
    cat("Current bound info:", bound_info, sep = "\n")
  }

  # Main Loop -----------------------------------------------------------------------------------------------------------
  for (main_loop_iter in start_iter:end_iter) {
    # valid draws generated from points simulated in the iteration
    iter_valid_n <- 0
    # What targets are left to update based on stopping bounds
    update_targets <- unique(attr(targets, "update"))
    # Continuation run first iteration flag
    is_first_continue_iter <- continue_runs == TRUE & main_loop_iter == start_iter

    # Print -------------------------------------------------------------------------------------------------------------
    if (verbose & !is_first_continue_iter) {
      header <- sprintf("\n---------- Start Iter %s ----------", main_loop_iter)
      iter_info <- sprintf("Starting at: %s", Sys.time())
      n_info <- sprintf("Current in range draws = %s", current_good_n)
      if (length(update_targets) > 0) {
        n_info <- sprintf("New in range draws = %s", current_good_n)
        bounds_to_print <- names(attr(targets, "update"))
        bound_info <- lapply(attr(targets,"target_names"), FUN = function(x, targ, to_print) {
          if (x %in% to_print) {
            tmp <- targ[x]
            y <- paste(sprintf("%s: %s - %s", x, tmp$current_lower_bounds, tmp$current_upper_bounds), collapse = "\n")
          } else {
            y <- paste(sprintf("%s: Reached stopping bounds", x), collapse = "\n")
          }
          return(y)
        }, targ = targets, to_print = bounds_to_print)
        bound_info <- lapply(unique(attr(targets, "target_groups")), FUN = function(x, groups, bound_info) {
          paste(unlist(bound_info[x == groups]), collapse = "\n")
        }, groups = attr(targets, "target_groups"), bound_info = bound_info)
        bound_info <- paste(sprintf("---- %s ----", unique(attr(targets, "target_groups"))), bound_info, sep = "\n", collapse = "\n")
        bound_info <- ifelse(bound_info == "", "All tolerance intervals at target bounds", bound_info)
        cat(header, iter_info, n_info, "Current bound info:", bound_info, sep = "\n")
      } else {
        cat(header, iter_info, "All tolerance intervals at target bounds", sep = "\n")
      }
    }

    # Evaluate Draws ----------------------------------------------------------------------------------------------------
    # If not a continuing runs first iteration
    if (!is_first_continue_iter) {
      # Evaluate simulated draws using target function(s)
      parms_to_run <- iter_parm_draws[1:n_draw]
      iter_sim_results <- run_handler(
        parms_to_run = parms_to_run, target_fun = target_fun, all_parm_names = all_parm_names,
        targets = targets, priors = priors, verbose = verbose, custom_function = backend_fun
      )

      # Ensure order of columns matches order of sim_target_names if names exist
      if (all(names(iter_sim_results) %in% colnames(iter_sim_target))) {
        iter_sim_results <- iter_sim_results[, match(sim_target_names, colnames(iter_sim_results))]
      }

      # Store results
      total_draws <- total_draws + n_draw - sum(iter_parm_draws$step %in% (N_centers + 1))
      iter_sim_target[1:n_draw, (sim_target_names) := iter_sim_results]
      # Evaluate Targets, first based on whether they are in the appropriate range, second by euclidian distance
      iter_target_dist[, (target_distance_names) := eval_targets(sim_targets = iter_sim_target, target_list = targets, criteria = "start")]
      iter_target_dist$tot_dist <- total_distance(dt = iter_target_dist, target_names = target_distance_names, scale = FALSE)
      # Count the good points (points associated with positive distances)
      iter_target_dist$n_good[1:n_draw] <- rowSums(iter_target_dist[1:n_draw, (target_distance_names), with = FALSE] >= 0, na.rm = TRUE)
      iter_valid_n <- sum(iter_target_dist$n_good[iter_target_dist$step <= N_centers] == n_target_distances, na.rm = TRUE)

      # If user wants to explore results of all simulated parameters
      if (validate_run) {
        if (!append_iter_outfile) {
          all_iter_parm_draws <- iter_parm_draws[!is.na(iter_parm_draws$draw), ]
          all_iter_sim_target <- iter_sim_target[!is.na(iter_sim_target$draw), ]
          all_iter_target_dist <- iter_target_dist[!is.na(iter_target_dist$draw), ]
        } else {
          all_iter_parm_draws <- rbind(all_iter_parm_draws, iter_parm_draws[!is.na(iter_parm_draws$draw), ])
          all_iter_sim_target <- rbind(all_iter_sim_target, iter_sim_target[!is.na(iter_sim_target$draw), ])
          all_iter_target_dist <- rbind(all_iter_target_dist, iter_target_dist[!is.na(iter_target_dist$draw), ])
        }
        if (!is.null(output_directory)) {
          save_results(
            list(iter_parm_draws[!is.na(iter_parm_draws$draw), ][, scaled_dist:= NULL], gsub("Good_", "", parm_df_outfile)),
            list(iter_sim_target[!is.na(iter_sim_target$draw), ], gsub("Good_", "", simtarg_df_outfile)),
            list(iter_target_dist[!is.na(iter_target_dist$draw), ], gsub("Good_", "", targdist_df_outfile)),
            out_dir = output_directory, append = append_iter_outfile
          )
        }
        # append_iter_outfile <- TRUE
      }

      # Update centers in good_* which have been re-simulated since last iteration
      if (main_loop_iter > 1) {
        # What were the centers from last time
        center_draw <- iter_parm_draws$draw[iter_parm_draws$step == (N_centers + 1) & !is.na(iter_parm_draws$step)]

        # Sort to ensure alignment across all data tables
        center_draw <- sort(center_draw, na.last = TRUE)
        setorder(iter_parm_draws, draw, na.last = TRUE)
        setorder(good_parm_draws, draw, na.last = TRUE)
        setorder(iter_sim_target, draw, na.last = TRUE)
        setorder(good_sim_target, draw, na.last = TRUE)
        setorder(iter_target_dist, draw, na.last = TRUE)
        setorder(good_target_dist, draw, na.last = TRUE)

        # Replace old center values with current iteration centers
        good_parm_draws[draw %in% center_draw, c("iter", "draw", "step", "seed")] <-
          iter_parm_draws[draw %in% center_draw, c("iter", "draw", "step", "seed"), with = FALSE]
        good_sim_target[draw %in% center_draw, c("iter", "draw", "step", sim_target_names)] <-
          iter_sim_target[draw %in% center_draw, c("iter", "draw", "step", sim_target_names), with = FALSE]
        good_target_dist[draw %in% center_draw, c("iter", "draw", "step", target_distance_names, "n_good")] <-
          iter_target_dist[draw %in% center_draw, c("iter", "draw", "step", target_distance_names, "n_good"), with = FALSE]
        good_target_dist$n_good[is.na(good_target_dist$n_good)] <- 0L

        # Determine which center draws are still valid (to keep) and which aren't (to remove)
        remove_draws <- good_target_dist[((draw %in% center_draw) & (n_good < n_target_distances)), ]$draw
        keep_draws <- good_target_dist[((draw %in% center_draw) & (n_good == n_target_distances)), ]$draw

        # Update good draw distance and stopping criteria
        if (length(keep_draws) > 0) {
          # if there are any centers that are kept, recalculate
          if (length(update_targets) == 0) {
            good_target_dist[draw %in% keep_draws, ]$tot_dist <- total_distance(
              dt = good_target_dist[draw %in% keep_draws, ],
              target_names = target_distance_names,
              scale = FALSE
            )

          } else { # length(update_targets) == 0
            good_target_dist[draw %in% keep_draws, ]$tot_dist <- total_distance(
              dt = good_target_dist[draw %in% keep_draws, ],
              target_names = update_targets,
              scale = FALSE
            )

          } # ! length(update_targets) == 0
        } # length(keep_draws) > 0

        # Remove bad draws
        if (length(remove_draws) > 0) {
          # Print information
          if (verbose) {
            cat("Removing centers as good draws:", paste(remove_draws, collapse = ", "), "\n")
          }

          # Remove bad draws
          good_parm_draws[draw %in% remove_draws, names(good_parm_draws) := NA]
          good_sim_target[draw %in% remove_draws, names(good_sim_target) := NA]
          good_target_dist[draw %in% remove_draws, names(good_target_dist) := NA]

          # place removed draws at the bottom
          setorder(good_parm_draws, draw, iter, step, na.last = TRUE)
          setorder(good_sim_target, draw, iter, step, na.last = TRUE)
          setorder(good_target_dist, draw, iter, step, na.last = TRUE)
        } # length(remove_draws) > 0

        # After replacing recalculated centers in good_* data tables, remove them from the iteration-specific data tables
        iter_parm_draws[step == (N_centers + 1), (all_parm_names) := NA_real_]
        iter_sim_target[step == (N_centers + 1), (sim_target_names) := NA_real_]
        iter_target_dist[step == (N_centers + 1), c(target_distance_names, "tot_dist") := NA_real_]
        iter_target_dist[step == (N_centers + 1), n_good := 0L]

        # Total draws to be kept
        current_good_n <- sum(good_target_dist$n_good == n_target_distances, na.rm = TRUE)
      } # if (main_loop_iter > 1)

      # Stop if there are no close points (and so cannot continue)
      stopifnot("No valid parameters to work from." = current_good_n + iter_valid_n > 0)

      # Save good draws, targets, and distances if any were found
      if (iter_valid_n > 0) {
        # Determine rows that will be kept in good_*
        n_keep <- ifelse(main_loop_iter != 1 & (current_good_n + iter_valid_n) > n_store, n_store - iter_valid_n, current_good_n)
        if (main_loop_iter != 1 & (current_good_n + iter_valid_n) > n_store) {
          keep_draws <- good_target_dist$draw[order(-good_target_dist$n_good, good_target_dist$tot_dist)][1:n_keep]
          good_parm_draws[1:n_keep, ] <- good_parm_draws[draw %in% keep_draws, ]
          good_sim_target [1:n_keep, ] <- good_sim_target[draw %in% keep_draws, ]
          good_target_dist[1:n_keep, ] <- good_target_dist[draw %in% keep_draws, ]
        }

        # Determine rows that will be updated in good_*
        start_add_row <- n_keep + 1
        end_add_row <- ifelse((current_good_n + iter_valid_n) > n_store, n_store, current_good_n + iter_valid_n)
        update_row_range <- start_add_row:end_add_row

        # Determine rows that will be used to update from iter_*
        valid_rows <- iter_target_dist$n_good == n_target_distances & iter_target_dist$step < (N_centers + 1)
        order_valid_rows <- order(-iter_target_dist$n_good[valid_rows], iter_target_dist$tot_dist[valid_rows])
        add_draws <- sort(iter_target_dist$draw[valid_rows][order_valid_rows][1:length(update_row_range)])

        # Transfer good rows to results data.tables
        good_parm_draws[update_row_range, ] <- iter_parm_draws[draw %in% add_draws, ]
        good_sim_target[update_row_range, ] <- iter_sim_target[draw %in% add_draws, ]
        good_target_dist[update_row_range, ] <- iter_target_dist[draw %in% add_draws, ]

        # Make sure total distance is correct in good_target_dist
        if (length(update_targets) == 0) {
          good_target_dist[update_row_range, tot_dist := total_distance(
            dt = good_target_dist[update_row_range, ],
            target_names = target_distance_names,
            scale = FALSE
          )]
        } else {
          good_target_dist[update_row_range, tot_dist := total_distance(
            dt = good_target_dist[update_row_range, ],
            target_names = update_targets,
            scale = FALSE
          )]
        }
      } # iter_valid_n > 0

      # Update count of final valid draws
      current_good_n <- sum(good_target_dist$n_good == n_target_distances, na.rm = TRUE)
    } # !(continue_runs == TRUE & iter == start_iter)

    # Which valid rows should be used for simulating new draws
    good_row_range <- which(good_target_dist$n_good == n_target_distances)

    # Improve Target Bounds ---------------------------------------------------------------------------------------------
    if ((current_good_n >= 2*N_cov_points) & length(update_targets) > 0) {
      # Sort targets based on overall distance of target distances still being calibrated
      good_target_dist$tot_dist <- total_distance(dt = good_target_dist, target_names = update_targets, scale = FALSE)

      # Find least amount of points that move us towards the stopping bounds while letting us have enough for the more
      #   complex resampling method
      keep_points <- c(trunc(seq(1.0, 2.0, 0.1)*N_cov_points), current_good_n)
      for (test_n_iter in keep_points) {
        # If we can use less than the entire sample, calculate bounds with the subset. Otherwise, bounds do not change
        if (test_n_iter < current_good_n) {
          # Get the test_n_iter best draws to determine an empirical set of new bounds
          get_draws <- good_target_dist$draw[order(good_target_dist$tot_dist)][1:test_n_iter]

          # Get new bounds using test_n_iter best draws
          targets <- get_new_bounds(
            to_update = update_targets,
            targets_list = targets,
            sims = good_sim_target[draw %in% get_draws, ]
          )
        } else { # test_n_iter < current_good_n
          # set the new bounds back to the current bounds
          targets <- update_target_bounds(targets, from = "new", to = "current")
        } # ! test_n_iter < current_good_n

        # Find if any targets have moved closer to the stopping bounds
        improve <- any(
          targets$new_lower_bounds != targets$current_lower_bounds |
            targets$new_upper_bounds != targets$current_upper_bounds
          )

        # If we can improve any targets, test our current valid draws against new bounds OR
        # If we have no more sample to use, reset our evaluation of good draws to the current bounds set
        if (improve | test_n_iter == current_good_n) {
          # Update distances (values < 0 indicate out of tolerance bounds)
          good_target_dist[good_row_range, (update_targets) := eval_targets(
            sim_targets = good_sim_target[good_row_range, ],
            target_list = targets[update_targets, groups_given = TRUE],
            criteria = "update"
          )]
          # Update number of targets that were hit for each draw
          good_target_dist$n_good <- 0L
          good_target_dist[good_row_range]$n_good <- rowSums(
            good_target_dist[good_row_range, (target_distance_names), with = FALSE] >= 0,
            na.rm = TRUE
          )
          # Calculate new number of valid draws
          new_iter_valid_n <- sum(good_target_dist$n_good == n_target_distances, na.rm = TRUE)

          # If we have enough new draws, stop looking for more by stopping the test_n_iter loop
          if (new_iter_valid_n >= N_cov_points) { break }
        } # any(improve) | test_n_iter == current_good_n
      } # test_n_iter in keep_points

      # If we consticted our target ranges with fewer draws
      if (new_iter_valid_n >= N_cov_points & new_iter_valid_n < current_good_n) {
        # Update current bounds to the newly calculated bounds
        targets <- update_target_bounds(targets, from = "current", to = "new")

        # Update current_good_n
        current_good_n <- new_iter_valid_n

        # Determine which distances have been constricted as far as we want
        update_targets <- attr(targets, "update") <- get_update_targets(targets)

        # Move the best draws to top of good_target_dist and remove the rest
        keep_draws <- good_target_dist$draw[good_target_dist$n_good == n_target_distances]
        remove_rows <- !good_target_dist$draw %in% keep_draws
        good_parm_draws[remove_rows, c("draw", "step", "iter") := NA_integer_]
        good_parm_draws[remove_rows, c(all_parm_names, "scaled_dist", "sample_wt") := NA_real_]
        good_parm_draws[remove_rows, "seed" := NA_character_]
        good_sim_target[remove_rows, c("draw", "step", "iter") := NA_integer_]
        good_sim_target[remove_rows, (sim_target_names) := NA_real_]
        good_target_dist[remove_rows, c("draw", "step", "iter", "n_good") := NA_integer_]
        good_target_dist[remove_rows, c(target_distance_names, "tot_dist") := NA_real_]
        setorder(good_parm_draws, draw, na.last = TRUE)
        setorder(good_sim_target, draw, na.last = TRUE)
        setorder(good_target_dist, draw, na.last = TRUE)

        # Update the good_row_range for later use
        good_row_range <- 1:current_good_n
        updated_bounds <- TRUE # For next print
      } else { # new_iter_valid_n >= N_cov_points & new_iter_valid_n < current_good_n
        updated_bounds <- FALSE # For next print
      } # ! new_iter_valid_n >= N_cov_points & new_iter_valid_n < current_good_n

      # Print -----------------------------------------------------------------------------------------------------------
      if (verbose & !is_first_continue_iter) {
        n_info <- sprintf("New in range draws = %s", current_good_n)
        bounds_to_print <- names(attr(targets, "update"))
        bound_info <- lapply(attr(targets,"target_names"), FUN = function(x, targ, to_print) {
          if (x %in% to_print) {
            tmp <- targ[x]
            y <- paste(sprintf("%s: %s - %s", x, tmp$current_lower_bounds, tmp$current_upper_bounds), collapse = "\n")
          } else {
            y <- paste(sprintf("%s: Reached stopping bounds", x), collapse = "\n")
          }
          return(y)
        }, targ = targets, to_print = bounds_to_print)
        bound_info <- lapply(unique(attr(targets, "target_groups")), FUN = function(x, groups, bound_info) {
          paste(unlist(bound_info[x == groups]), collapse = "\n")
        }, groups = attr(targets, "target_groups"), bound_info = bound_info)
        bound_info <- paste(sprintf("---- %s ----", unique(attr(targets, "target_groups"))), bound_info, sep = "\n", collapse = "\n")
        bound_info <- ifelse(bound_info == "", "All tolerance intervals at target bounds", bound_info)
        if (updated_bounds) {
          cat("Updated bounds:", n_info, bound_info, sep = "\n")
        } else {
          cat("Unable to update bounds:", n_info, bound_info, sep = "\n")
        }
      }
    } # (current_good_n >= 2*N_cov_points) & length(update_targets) > 0

    # Calculate Effective Sample Size -----------------------------------------------------------------------------------
    # Only really need weights in these scenarios but will calculate everytime for potential debugging
    # if ((current_good_n >= N_post | length(update_targets) == 0 | (main_loop_iter >= end_iter & current_good_n > 0)) &
    if (main_loop_iter > 1) {
      # Calculate sample weight
      good_parm_draws$sample_wt <- 0
      in_draws <- good_target_dist$draw[!is.na(good_target_dist$draw)]
      good_parm_draws[good_parm_draws$draw %in% in_draws, sample_wt :=
                        get_weight(
                          parms = good_parm_draws[draw %in% in_draws, ],
                          parm_names = calibr_parm_names,
                          priors = priors[calibr_parm_names],
                          mixture_file = mean_cov,
                          n = N_start
                        )]

      # Calculate effective sample size using Kish formula. Here sum(sample_wt) = 1
      ESS <- 1/sum(good_parm_draws$sample_wt[good_parm_draws$draw %in% in_draws]^2)

      # Print information
      if (verbose & !is_first_continue_iter) {
        cat(sprintf("Effective Sample Size is %s", round(ESS, 2)), "\n")
      }
    } # current_good_n >= N_post | length(update_targets) == 0 | (main_loop_iter >= end_iter & current_good_n > 0)

    # Check Completion --------------------------------------------------------------------------------------------------
    if (ESS >= N_post & length(update_targets) == 0) {
      # Print information
      if (verbose) {
        n_info <- sprintf("Final in range draws = %s", current_good_n)
        ess_info <- sprintf("Effective Sample Size was %s", round(ESS, 2))
        target_info <- sprintf("(Target was %s)", N_post)
        cat(n_info, ess_info, target_info, sep = "\n")
      }

      # Have met the appropriate sample size. Stop looking and return final results
      break
    }

    # Simulate New Draws ------------------------------------------------------------------------------------------------
    if (main_loop_iter < end_iter || main_loop_iter == 1) {
      # Find the N_center in range draws with model predictions closest to targets
      # Check for high-weight points, defined as $\theta_i$ with $w_i> 10/\sum_{i=1}^{N_{(t+1)}}$, meaning that the
      #   maximum weight is 10 times greater than expected for a simple random sample from in-range points
      n_hiwt <- 0
      center_draw_hiwt <- NULL
      if (length(update_targets) == 0 & main_loop_iter > 1) {
        max_wt <- max(good_parm_draws$sample_wt[good_parm_draws$draw %in% in_draws])
        if (max_wt >= 10/current_good_n) {
          draw_order <- setorder(good_parm_draws[good_row_range, ], -sample_wt, na.last = TRUE)$draw
          n_hiwt <- min(N_centers, length(draw_order))
          center_draw_hiwt <- draw_order[1:n_hiwt]

          # Print information
          if (verbose) {
            cat("Adding samples around high weight points", paste(center_draw_hiwt, collapse = ", "), "\n")
          }
        }
      } # length(update_targets) == 0

      # If we didn't get enough centers from high-weight points, use total distance instead
      n_best_draw <- 0
      center_draw_best <- NULL
      if (n_hiwt < N_centers) {
        # Pull the draw numbers while ordering them based on their total distance
        draw_order <- good_target_dist$draw[good_row_range][order(good_target_dist$tot_dist[good_row_range])]
        n_best_draw <- min(current_good_n, (N_centers - n_hiwt))
        center_draw_best <- draw_order[1:n_best_draw]
      } # n_hiwt < N_centers

      # Create center info objects for calculations
      num_centers <- n_best_draw + n_hiwt
      center_draw <- sort(c(center_draw_best, center_draw_hiwt))
      center_next <- as.matrix(good_parm_draws[draw %in% center_draw, all_parm_names, with = FALSE])

      # Calculate number of new draws and new steps (excluding centers)
      n_draw <- num_centers*Center_n # + num_centers
      new_steps <- rep(1:num_centers, each = Center_n) # rep.int((N_centers + 1), num_centers)

      # Reset iteration level info
      iter_parm_draws$iter <- iter_sim_target$iter <- iter_target_dist$iter <- main_loop_iter + 1
      iter_parm_draws$draw <- iter_sim_target$draw <- iter_target_dist$draw <- NA_integer_
      iter_parm_draws$step <- iter_sim_target$step <- iter_target_dist$step <- NA_integer_
      iter_parm_draws$step[1:n_draw] <- iter_sim_target$step[1:n_draw] <- iter_target_dist$step[1:n_draw] <- as.integer(new_steps)
      iter_parm_draws[, c(all_parm_names, "scaled_dist", "sample_wt") := NA_real_]
      iter_parm_draws$seed <- NA_character_
      iter_sim_target[, (sim_target_names) := NA_real_]
      iter_target_dist[, c(target_distance_names, "tot_dist") := NA_real_]
      iter_target_dist$n_good <- 0L

      # New draw sampling
      if (current_good_n < N_cov_points) {
        # If there are too few good points use independent draws from a truncate norm
        # Pull standard deviations of parameters provided with the prior information
        prior_sds <- attr(priors, "sds")
        sd_next <- matrix(0.5*prior_sds, ncol = n_parms, nrow = num_centers, byrow = TRUE)
        colnames(sd_next) <- names(priors)
        # Perform random draws
        iter_parm_draws[1:n_draw, (all_parm_names) := draw_parms(
          n_add = Center_n,
          mu = center_next[, all_parm_names],
          sigma = sd_next,
          priors_list = priors
        )]

        # Set draw numbers
        new_draws <- (total_draws + 1):(total_draws + n_draw)
        iter_parm_draws$draw[1:n_draw] <- iter_target_dist$draw[1:n_draw] <- iter_sim_target$draw[1:n_draw] <- new_draws
        # Set draw seed
        iter_parm_draws$seed[1:n_draw] <- seed_stream(seed_stream_start, n_draw)
        seed_stream_start <- as.integer(unlist(strsplit(iter_parm_draws[["seed"]][n_draw], "_")))

        # Recalculate center
        iter_parm_draws[iter_parm_draws$step == (N_centers + 1), (all_parm_names)] <- as.data.table(center_next)
        iter_parm_draws$draw[iter_parm_draws$step == (N_centers + 1) & !is.na(iter_parm_draws$step)] <- center_draw
        iter_sim_target$draw[iter_parm_draws$step == (N_centers + 1) & !is.na(iter_parm_draws$step)] <- center_draw
        iter_target_dist$draw[iter_parm_draws$step == (N_centers + 1) & !is.na(iter_parm_draws$step)] <- center_draw

        # Store results (for mean_cov)
        x <- iter_parm_draws[1:n_draw, c("iter", "step"), with = FALSE]
        x$in_range <- get_in_range(
          compare_list = priors[calibr_parm_names],
          check_dt = iter_parm_draws[1:n_draw, calibr_parm_names, with = FALSE],
          out = "logical"
        )
        setkey(x, iter, step)
        B_in <- x[step <= N_centers, list(B.in = sum(in_range, na.rm = TRUE)), by = list(iter, step)]
        new_mean_cov <- get_mean_cov(
          iter = main_loop_iter + 1,
          mu = center_next[, calibr_parm_names],
          sd = sd_next[, calibr_parm_names],
          center = center_draw,
          B_in = B_in,
          parm_names = calibr_parm_names
        )
        # If mean_cov has already been created, add new values in. Otherwise create mean_cov with values
        if (exists("mean_cov")) {
          new_rows <- (nrow(mean_cov) + 1):(nrow(mean_cov) + nrow(new_mean_cov))
          mean_cov <- rbind(mean_cov, new_mean_cov)
        } else {
          mean_cov <- new_mean_cov
          new_rows <- 1:nrow(mean_cov)
        }

      } else { # current_good_n < N_cov_points
        # sample MVN points around centers if there are enough points to estimate the cov matrix
        sample_mean <- as.data.frame(center_next)

        # For tracking which rows need to be written out
        new_rows <- c()
        for (center_sim_iter in 1:num_centers) {
          # center_sim_iter means for calibrated parameters
          sample_mean_i1 <- unlist(sample_mean[center_sim_iter, calibr_parm_names])
          if (is.null(names(sample_mean_i1))) {
            names(sample_mean_i1) <- names(priors[calibr_parm_names])
          }
          # Rows to be filled for a given center
          draw_rows <- ((center_sim_iter - 1)*Center_n + 1):(center_sim_iter*Center_n)

          # Calculate parameter distances from given center
          prior_sds <- attr(priors[calibr_parm_names], "sds")
          good_parm_draws$scaled_dist <- Inf
          good_parm_draws$scaled_dist[1:current_good_n] <- total_distance(
            dt = good_parm_draws[1:current_good_n, ],
            target_names = calibr_parm_names,
            mu = sample_mean_i1,
            sd = prior_sds,
            scale = TRUE
          )

          # Find nearest points based on scaled distance
          calc_cov_points <- 1:max(N_cov_points, trunc(current_good_n/N_centers))
          var_data <- good_parm_draws[order(good_parm_draws$scaled_dist)[calc_cov_points], calibr_parm_names, with = FALSE]

          # Find center specific var-cov matrices using N_cov_points closest points
          sample_cov <- parm_covariance(var_data)

          # Deal with parameters that don't have any variance
          if (any(diag(sample_cov) <= 0.05*prior_sds)) {
            # This occurs when adding a new parameter: it is set to default for all prior draws
            is_zero <- (diag(sample_cov) == 0)
            sd_next <- 0.05*prior_sds
            sd_next[!is_zero] <- 0
            diag(sample_cov) <- diag(sample_cov) + (sd_next^2)
          }

          # Simulate Center_n random draws of calibrated parameters
          if (abs(sum(sample_cov) - sum(diag(sample_cov))) < 1e-10) {
            # Perform independent draws - uses truncated normal
            x <- draw_parms(
              n_add = Center_n,
              mu = as.matrix(t(sample_mean_i1)),
              sigma = as.matrix(t(diag(sample_cov))),
              priors_list = priors[calibr_parm_names]
            )

          } else { # abs(sum(sample_cov) - sum(diag(sample_cov))) < 1e-10
            # Perform multivariate draws - uses multivariate normal
            x <- get_B_draws(
              B = Center_n,
              inflate = sample_inflate,
              center = sample_mean_i1,
              cov_matrix = sample_cov,
              priors = priors[calibr_parm_names]
            )

          } # ! abs(sum(sample_cov) - sum(diag(sample_cov))) < 1e-10

          # Store Mean Covariance results
          sample_mean_i1 <- setnames(as.data.frame(t(sample_mean_i1)), calibr_parm_names)
          sample_cov <- setnames(as.data.frame(sample_cov), calibr_parm_names)
          mean_cov_center_i1 <- data.table(rbind(sample_mean_i1, sample_cov))
          mean_cov_center_i1$iter <- main_loop_iter + 1
          mean_cov_center_i1$step <- center_sim_iter
          mean_cov_center_i1$center <- center_draw[center_sim_iter]
          mean_cov_center_i1$parm <- 0:(length(calibr_parm_names))
          B_in <- sum(get_in_range(
            compare_list = priors[calibr_parm_names],
            check_dt = x,
            out = "logical"
          ))
          mean_cov_center_i1$B.in <- B_in
          setcolorder(mean_cov_center_i1, c("iter", "step", "center", "B.in" , "parm", calibr_parm_names))
          if (exists("mean_cov")) { # After first iteration or on continuing run
            new_rows <- c(new_rows, (nrow(mean_cov) + 1):(nrow(mean_cov) + nrow(mean_cov_center_i1)))
            mean_cov <- rbind(mean_cov, mean_cov_center_i1)
          } else {
            new_rows <- 1:nrow(mean_cov_center_i1)
            mean_cov <- mean_cov_center_i1
          }

          # If no rows were generated, move to next center without saving NULL to the iter_parm_draws
          if (B_in == 0) {
            warning(sprintf("No valid parameters were simulated for center = %s during iteration %s", center_sim_iter, main_loop_iter))
            # Move to the next center
            next
          }

          # Add simulated values of calibrated data to iter_parm_draws
          iter_parm_draws[draw_rows, (calibr_parm_names) := x]
          # Add fixed values of non calibrated data to iter_parm_draws
          non_calibr <- setdiff(all_parm_names, calibr_parm_names)
          if (length(non_calibr) > 0) {
            iter_parm_draws[draw_rows, ] <- parms_from_priors(
              parm_df = iter_parm_draws[draw_rows, ],
              name_parms = non_calibr,
              prior_list = priors, sampling = NULL
            )
          }
        } # center_sim_iter in 1:num_centers

        # See how many new valid parameters have been generated
        keep_rows <- get_in_range(
          compare_list = priors,
          check_dt = iter_parm_draws,
          out = "logical"
        )
        if (sum(keep_rows) == 0) {
          warning(sprintf("Unable to generate new parameters for any center at iteration %s. Ending run.", main_loop_iter))
          break
        }

        # Update iteration parm draws with which newly generated rows should be kept for next iteration
        # Reset step column to NA for non-valid rows
        iter_parm_draws$step[!keep_rows] <- iter_target_dist$step[!keep_rows] <- iter_sim_target$step[!keep_rows] <- NA
        # Set draw numbers
        n_draw <- sum(keep_rows)
        new_draws <- (total_draws + 1):(total_draws + n_draw)
        iter_parm_draws$draw[keep_rows] <- iter_target_dist$draw[keep_rows] <- iter_sim_target$draw[keep_rows] <- new_draws
        # reorder data.tables so NAs are on bottom
        setorder(iter_parm_draws, draw, iter, step, na.last = TRUE)
        setorder(iter_sim_target, draw, iter, step, na.last = TRUE)
        setorder(iter_target_dist, draw, iter, step, na.last = TRUE)
        # Set draw seed
        iter_parm_draws$seed[1:n_draw] <- seed_stream(seed_stream_start, n_draw)
        seed_stream_start <- as.integer(unlist(strsplit(iter_parm_draws[["seed"]][n_draw], "_")))

        # Recalculate centers
        center_rows <- (n_draw + 1):(n_draw + N_centers)
        iter_parm_draws[center_rows, (all_parm_names)] <- as.data.table(center_next)
        iter_parm_draws[center_rows, ]$draw <- center_draw
        iter_sim_target[center_rows, ]$draw <- center_draw
        iter_target_dist[center_rows, ]$draw <- center_draw
        iter_parm_draws[center_rows, ]$step <- N_centers + 1
        iter_sim_target[center_rows, ]$step <- N_centers + 1
        iter_target_dist[center_rows, ]$step <- N_centers + 1
        iter_parm_draws[center_rows, ]$seed <- seed_stream(seed_stream_start, N_centers)
        seed_stream_start <- as.integer(unlist(strsplit(iter_parm_draws[["seed"]][center_rows[N_centers]], "_")))
        n_draw <- n_draw + N_centers

        # put good_* data tables in draw order
        setorder(good_parm_draws, draw, na.last = TRUE)
        setorder(good_sim_target, draw, na.last = TRUE)
        setorder(good_target_dist, draw, na.last = TRUE)
      } # ! current_good_n < N_cov_points
    } # if (main_loop_iter < end_iter)

    # # Save iteration results
    # if (!is.null(output_directory)) {
    #   if (main_loop_iter < end_iter || main_loop_iter == 1) {
    #     if (append_meancov_outfile) {
    #       save_rows <- new_rows
    #     } else {
    #       save_rows <- 1:nrow(mean_cov)
    #     }
    #     save_results(
    #       mean_cov[save_rows, c("iter", "step", "center", "B.in", "parm", calibr_parm_names), with = FALSE], meancov_outfile,
    #       out_dir = output_directory, append = append_meancov_outfile
    #     )
    #     append_meancov_outfile <- TRUE
    #   }
    #   # Pull complete options except for
    #   metaddata <- data.frame(
    #     info = c("current_iteration", "last_draw"),
    #     value = c(main_loop_iter, total_draws)
    #   )
    #   # Pull function inputs (except for target_fun, priors, and targets)
    #   fn_ops <- mget(names(formals()), sys.frame(sys.nframe()))
    #   PickOut <- unlist(lapply(fn_ops, function(x) !(is.function(x) | is.list(x))))
    #   fn_ops <- unlist(fn_ops[PickOut])
    #   fn_ops <- data.frame(info = names(fn_ops), value = fn_ops, row.names = NULL)
    #   metaddata <- rbind(metaddata, fn_ops)
    #   if (validate_run) {
    #     save_good_parm_draws <- copy(good_parm_draws)[, scaled_dist := NULL, ]
    #     save_good_parm_draws <- save_good_parm_draws[!is.na(draw), ]
    #     save_good_parm_draws$actual_iter <- main_loop_iter
    #     save_good_sim_target <- copy(good_sim_target)
    #     save_good_sim_target <- save_good_sim_target[!is.na(draw), ]
    #     save_good_sim_target$actual_iter <- main_loop_iter
    #     save_good_target_dist <- copy(good_target_dist)
    #     save_good_target_dist <- save_good_target_dist[!is.na(draw), ]
    #     save_good_target_dist$actual_iter <- main_loop_iter
    #   } else {
    #     save_good_parm_draws <- copy(good_parm_draws)[, scaled_dist:= NULL, ]
    #     save_good_sim_target <- good_sim_target
    #     save_good_target_dist <- good_target_dist
    #   }
    #   # Write out files
    #   save_results(
    #     list(metaddata, runmeta_df_outfile),
    #     list(as.data.frame(targets), targlist_df_outfile),
    #     out_dir = output_directory, append = FALSE
    #   )
    #   save_results(
    #     list(save_good_parm_draws, parm_df_outfile),
    #     list(save_good_sim_target, simtarg_df_outfile),
    #     list(save_good_target_dist, targdist_df_outfile),
    #     out_dir = output_directory, append = append_iter_outfile # FALSE
    #   )
    #   if (validate_run) {
    #     append_iter_outfile <- TRUE
    #   }
    # } # Save iteration results
    if (verbose & !is_first_continue_iter) { cat(sprintf("----------- End Iter %s -----------\n", main_loop_iter)) }
  } # main_loop_iter in start_iter:end_iter

  # Prep final results
  # Valid Parameters
  good_parm_draws <- good_parm_draws[!is.na(draw), ]
  setorder(good_parm_draws, draw, na.last = TRUE)
  # Valid Targets
  good_sim_target <- good_sim_target[!is.na(draw), ]
  setorder(good_sim_target, draw, na.last = TRUE)
  # Valid Distances
  good_target_dist <- good_target_dist[!is.na(draw), ]
  good_target_dist[, (target_distance_names) := lapply(.SD , "abs"), .SDcols = target_distance_names]
  setorder(good_target_dist, draw, na.last = TRUE)

  # Save current iteration and last draw info for restart
  metaddata <- data.frame(
    info = c("current_iteration", "last_draw"),
    value = c(main_loop_iter, total_draws)
  )
  # Pull function inputs (except for target_fun, priors, and targets)
  fn_ops <- mget(names(formals()), sys.frame(sys.nframe()))
  PickOut <- unlist(lapply(fn_ops, function(x) !(is.function(x) | is.list(x))))
  fn_ops <- unlist(fn_ops[PickOut])
  fn_ops <- data.frame(info = names(fn_ops), value = fn_ops, row.names = NULL)
  metaddata <- rbind(metaddata, fn_ops)
  # Save
  if (!is.null(output_directory)) {
    save_results(
      list(metaddata, runmeta_df_outfile),
      list(as.data.frame(targets), targlist_df_outfile),
      out_dir = output_directory, append = FALSE
    )
    if (validate_run) {
      save_good_parm_draws <- copy(good_parm_draws)[, scaled_dist := NULL, ]
      save_good_parm_draws <- save_good_parm_draws[!is.na(draw), ]
      save_good_parm_draws$actual_iter <- main_loop_iter
      save_good_sim_target <- copy(good_sim_target)
      save_good_sim_target <- save_good_sim_target[!is.na(draw), ]
      save_good_sim_target$actual_iter <- main_loop_iter
      save_good_target_dist <- copy(good_target_dist)
      save_good_target_dist <- save_good_target_dist[!is.na(draw), ]
      save_good_target_dist$actual_iter <- main_loop_iter
    } else {
      save_good_parm_draws <- copy(good_parm_draws)[, scaled_dist:= NULL, ]
      save_good_sim_target <- good_sim_target
      save_good_target_dist <- good_target_dist
    }
    save_results(
      list(save_good_parm_draws, parm_df_outfile),
      list(save_good_sim_target, simtarg_df_outfile),
      list(save_good_target_dist, targdist_df_outfile),
      out_dir = output_directory, append = append_iter_outfile
    )
  }

  # Print information
  done_timestamp <- Sys.time()
  calc_time <- done_timestamp - run_timestamp
  if (verbose) {
    header <- "\n---------- Run Completed ----------"
    time_info <- sprintf("Finishing ABC at %s", format(done_timestamp, print_time_fmt))
    duration_info <- sprintf("Calculation took %s %s", round(calc_time, 2), attr(calc_time, "units"))
    seed_info <- sprintf("seed value is: %s", paste0(.Random.seed, collapse = ", "))
    n_info <- sprintf("Number of in range draws was %s", current_good_n)
    bound_info <- lapply(attr(targets, "target_names"), FUN = function(x, targ) {
      tmp <- targ[x]
      paste(sprintf("%s: %s - %s", x, tmp$current_lower_bounds, tmp$current_upper_bounds), collapse = "\n")
    }, targ = targets)
    bound_info <- lapply(unique(attr(targets, "target_groups")), FUN = function(x, groups, bound_info) {
      paste(unlist(bound_info[x == groups]), collapse = "\n")
    }, groups = attr(targets, "target_groups"), bound_info = bound_info)
    bound_info <- paste(sprintf("---- %s ----", unique(attr(targets, "target_groups"))), bound_info, sep = "\n", collapse = "\n")
    cat(header, time_info, duration_info, seed_info, n_info, "Final bounds info:", bound_info, sep = "\n")
  }

  # If user wants all intermediate results saved
  if (validate_run) {
    out_list <- list(
      all_iter_parm_draws = all_iter_parm_draws[, scaled_dist:= NULL],
      all_iter_sim_target = all_iter_sim_target,
      all_iter_target_dist = all_iter_target_dist,
      good_parm_draws = good_parm_draws[, scaled_dist:= NULL],
      good_sim_target = good_sim_target,
      good_target_dist = good_target_dist,
      mean_cov = mean_cov,
      priors = priors,
      targets = targets,
      metaddata = metaddata
    )
  } else {
    out_list <- list(
      good_parm_draws = good_parm_draws[, scaled_dist:= NULL],
      good_sim_target = good_sim_target,
      good_target_dist = good_target_dist,
      mean_cov = mean_cov,
      priors = priors,
      targets = targets,
      metaddata = metaddata
    )
  }

  return(out_list)
}

#########################################################################################################################
#########################################################################################################################
#########################################################################################################################
#########################################################################################################################
#########################################################################################################################
carolyner/imabc documentation built on March 28, 2023, 6:43 a.m.