eval_pred_err_funs: Evaluate and/or summarize prediction errors.

Description

Evaluate various prediction error metrics, given the true responses and the predicted (or estimated) responses. eval_pred_err() evaluates these metrics for each experimental replicate separately, while summarize_pred_err() summarizes them across experimental replicates.

Usage

eval_pred_err(
  fit_results,
  vary_params = NULL,
  nested_cols = NULL,
  truth_col,
  estimate_col,
  prob_cols = NULL,
  group_cols = NULL,
  metrics = NULL,
  na_rm = FALSE
)

summarize_pred_err(
  fit_results,
  vary_params = NULL,
  nested_cols = NULL,
  truth_col,
  estimate_col,
  prob_cols = NULL,
  group_cols = NULL,
  metrics = NULL,
  na_rm = FALSE,
  summary_funs = c("mean", "median", "min", "max", "sd", "raw"),
  custom_summary_funs = NULL,
  eval_id = "pred_err"
)

Arguments

fit_results

A tibble, as returned by fit_experiment().

vary_params

A vector of DGP or Method parameter names that are varied across the Experiment.

nested_cols

(Optional) A character string or vector specifying the name of the column(s) in fit_results that need to be unnested before evaluating results. Default is NULL, meaning no columns in fit_results need to be unnested prior to computation.

truth_col

A character string identifying the column with the true responses. The column should be numeric for a regression problem and a factor for a classification problem.

estimate_col

A character string identifying the column with the estimated or predicted responses. The column should be numeric for a regression problem and a factor (with the predicted classes) for a classification problem.

prob_cols

A character string or vector identifying the column(s) containing class probabilities. If the truth_col column is binary, only one column name should be provided. Otherwise, the length of prob_cols should equal the number of factor levels of the truth_col column. This argument is not used when evaluating numeric metrics.

group_cols

(Optional) A character string or vector specifying the column(s) to group rows by before evaluating metrics. This is useful for assessing within-group metrics.

metrics

A metric_set object indicating the metrics to evaluate. See yardstick::metric_set() for more details. Default NULL will use the default metrics in yardstick::metrics().

na_rm

A logical value indicating whether NA values should be stripped before the computation proceeds.

summary_funs

Character vector specifying how to summarize evaluation metrics. Must choose from a built-in library of summary functions; each element of the vector must be one of "mean", "median", "min", "max", "sd", or "raw".

custom_summary_funs

Named list of custom functions to summarize results. Each name in the list gives the summary function's name, and each value is a function of a single argument: the vector of evaluated metric values.
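
For example, a minimal sketch of such a list (the quantile summaries here are illustrative, not built in):

custom_summary_funs <- list(
  q25_pred_err = function(x) quantile(x, probs = 0.25),
  q75_pred_err = function(x) quantile(x, probs = 0.75)
)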

eval_id

Character string ID to be used as a suffix when naming result columns. Defaults to "pred_err" for summarize_pred_err(); if NULL, no ID is added to the column names.

Value

The output of eval_pred_err() is a tibble with the following columns:

.rep

Replicate ID.

.dgp_name

Name of DGP.

.method_name

Name of Method.

.metric

Name of the evaluation metric.

.estimate

Value of the evaluation metric.

as well as any columns specified by group_cols and vary_params.

The output of summarize_pred_err() is a grouped tibble containing both identifying information and the prediction error results aggregated over experimental replicates. Specifically, the identifier columns include .dgp_name, .method_name, any columns specified by group_cols and vary_params, and .metric. In addition, there are results columns corresponding to the requested statistics in summary_funs and custom_summary_funs. These columns end in the suffix specified by eval_id.
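
For illustration, a minimal sketch of inspecting this structure (assuming the regression fit_results constructed in the Examples below; with the default eval_id = "pred_err", the result columns are expected to end in "_pred_err"):

eval_summary <- summarize_pred_err(fit_results,
                                   truth_col = "y",
                                   estimate_col = "predictions",
                                   summary_funs = c("mean", "sd"))
# identifier columns: .dgp_name, .method_name, .metric
# result columns: expected to include mean_pred_err and sd_pred_err
names(eval_summary)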

See Also

Other prediction_error_funs: eval_pred_curve_funs, plot_pred_curve(), plot_pred_err()

Examples

############################
#### Regression Problem ####
############################

# generate example fit_results data for a regression problem
fit_results <- tibble::tibble(
  .rep = rep(1:2, times = 2),
  .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
  .method_name = c("Method"),
  # true response
  y = lapply(1:4, FUN = function(x) rnorm(100)),
  # predicted response
  predictions = lapply(1:4, FUN = function(x) rnorm(100)),
  group = lapply(1:4, FUN = function(x) rep(c("a", "b"), length.out = 100))
)

# evaluate prediction error (using all default metrics) for each replicate
eval_results <- eval_pred_err(fit_results,
                              truth_col = "y",
                              estimate_col = "predictions")
# summarize prediction error (using all default metrics) across replicates
eval_results_summary <- summarize_pred_err(fit_results,
                                           truth_col = "y",
                                           estimate_col = "predictions")

# evaluate/summarize prediction error within subgroups
eval_results <- eval_pred_err(fit_results,
                              truth_col = "y",
                              estimate_col = "predictions",
                              group_cols = "group")
eval_results_summary <- summarize_pred_err(fit_results,
                                           truth_col = "y",
                                           estimate_col = "predictions",
                                           group_cols = "group")
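
# evaluate/summarize prediction errors across varied parameters
# (a hypothetical sketch: suppose the Experiment varied a DGP parameter n,
# so that fit_results contains an n column)
fit_results_vary <- dplyr::mutate(fit_results, n = rep(c(100, 200), times = 2))
eval_results <- eval_pred_err(fit_results_vary,
                              truth_col = "y",
                              estimate_col = "predictions",
                              vary_params = "n")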

# evaluate/summarize prediction errors using specific yardstick metrics
metrics <- yardstick::metric_set(yardstick::rmse, yardstick::rsq)
eval_results <- eval_pred_err(fit_results,
                              truth_col = "y",
                              estimate_col = "predictions",
                              metrics = metrics)
eval_results_summary <- summarize_pred_err(fit_results,
                                           truth_col = "y",
                                           estimate_col = "predictions",
                                           metrics = metrics)

# summarize prediction errors using a custom summary function
range_fun <- function(x) return(max(x) - min(x))
eval_results_summary <- summarize_pred_err(
  fit_results,
  truth_col = "y",
  estimate_col = "predictions",
  custom_summary_funs = list(range_pred_err = range_fun)
)
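
# summarize prediction errors using only a subset of the built-in summary
# functions (any of "mean", "median", "min", "max", "sd", "raw")
eval_results_summary <- summarize_pred_err(
  fit_results,
  truth_col = "y",
  estimate_col = "predictions",
  summary_funs = c("mean", "sd")
)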

#######################################
#### Binary Classification Problem ####
#######################################
# generate example fit_results data for a binary classification problem
fit_results <- tibble::tibble(
  .rep = rep(1:2, times = 2),
  .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
  .method_name = c("Method"),
  # true response
  y = lapply(1:4,
             FUN = function(x) {
               as.factor(sample(0:1, size = 100, replace = TRUE))
             }),
  # predicted class probabilities
  class_probs = lapply(1:4, FUN = function(x) runif(n = 100, min = 0, max = 1)),
  # predicted class responses
  predictions = lapply(class_probs,
                       FUN = function(x) as.factor(ifelse(x > 0.5, 1, 0)))
)

# evaluate prediction error (using all default metrics) for each replicate
eval_results <- eval_pred_err(fit_results,
                              truth_col = "y",
                              estimate_col = "predictions",
                              prob_cols = "class_probs")
# summarize prediction error (using all default metrics) across replicates
eval_results_summary <- summarize_pred_err(fit_results,
                                           truth_col = "y",
                                           estimate_col = "predictions",
                                           prob_cols = "class_probs")

# can also evaluate results using only class predictions (without class probs.)
eval_results <- eval_pred_err(fit_results,
                              truth_col = "y",
                              estimate_col = "predictions")
eval_results_summary <- summarize_pred_err(fit_results,
                                           truth_col = "y",
                                           estimate_col = "predictions")

############################################
#### Multi-class Classification Problem ####
############################################
# generate example fit_results data for a multi-class classification problem
fit_results <- tibble::tibble(
  .rep = rep(1:2, times = 2),
  .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
  .method_name = c("Method"),
  # true response
  y = lapply(1:4,
             FUN = function(x) {
               as.factor(sample(c("a", "b", "c"), size = 100, replace = TRUE))
             }),
  # predicted class probabilities
  class_probs = lapply(1:4,
                       FUN = function(x) {
                         tibble::tibble(a = runif(n = 100, min = 0, max = 0.5),
                                        b = runif(n = 100, min = 0, max = 0.5),
                                        c = 1 - a - b)
                       }),
  # predicted class responses
  predictions = lapply(class_probs,
                       FUN = function(x) {
                         yhat <- apply(x, 1,
                                       FUN = function(xi) names(which.max(xi)))
                         return(as.factor(yhat))
                       })
)

# evaluate prediction error (using all default metrics) for each replicate
eval_results <- eval_pred_err(fit_results,
                              truth_col = "y",
                              estimate_col = "predictions",
                              prob_cols = c("a", "b", "c"),
                              nested_cols = c("y", "class_probs", "predictions"))
# summarize prediction error (using all default metrics) across replicates
eval_results_summary <- summarize_pred_err(fit_results,
                                           truth_col = "y",
                                           estimate_col = "predictions",
                                           prob_cols = c("a", "b", "c"),
                                           nested_cols = c("y", "class_probs", "predictions"))

# can also evaluate results using only class predictions (without class probs.)
eval_results <- eval_pred_err(fit_results,
                              truth_col = "y",
                              estimate_col = "predictions")
eval_results_summary <- summarize_pred_err(fit_results,
                                           truth_col = "y",
                                           estimate_col = "predictions")

