eval_testing_err_funs | R Documentation |
Evaluate various testing error metrics, given the true feature support and the estimated p-values at pre-specified significance level thresholds. eval_testing_err() evaluates the various testing error metrics for each experimental replicate separately. summarize_testing_err() summarizes the various testing error metrics across experimental replicates.
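To make the description concrete, the base R sketch below shows one way such metrics could be computed for a single replicate from the true support and the p-values. The helper name `testing_err_sketch`, the rule of rejecting when the p-value is at or below the threshold, and the three metrics chosen are assumptions for illustration; this is not the package's implementation.

```r
# Hypothetical helper (not part of the package): compute a few testing error
# metrics for one replicate from the true support and the estimated p-values.
testing_err_sketch <- function(truth, pval, alpha = 0.05) {
  # Assumption: a feature is rejected (called significant) when pval <= alpha.
  rejected <- pval <= alpha
  tp <- sum(rejected & truth)   # true positives
  fp <- sum(rejected & !truth)  # false positives
  c(
    tpr = tp / sum(truth),                        # true positive rate
    fpr = fp / sum(!truth),                       # false positive rate
    fdp = if (tp + fp > 0) fp / (tp + fp) else 0  # false discovery proportion
  )
}

# Made-up inputs for three features.
truth <- c(TRUE, FALSE, TRUE)
pval <- c(0.001, 0.01, 0.1)
res <- testing_err_sketch(truth, pval, alpha = 0.05)
print(res)
```

Calling the helper once per value in a vector of thresholds gives the same metrics at several significance levels.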
eval_testing_err(
  fit_results,
  vary_params = NULL,
  nested_cols = NULL,
  truth_col,
  pval_col = NULL,
  group_cols = NULL,
  metrics = NULL,
  alphas = 0.05,
  na_rm = FALSE
)

summarize_testing_err(
  fit_results,
  vary_params = NULL,
  nested_cols = NULL,
  truth_col,
  pval_col = NULL,
  group_cols = NULL,
  metrics = NULL,
  alphas = 0.05,
  na_rm = FALSE,
  summary_funs = c("mean", "median", "min", "max", "sd", "raw"),
  custom_summary_funs = NULL,
  eval_id = "testing_err"
)
fit_results |
A tibble, as returned by |
vary_params |
A vector of |
nested_cols |
(Optional) A character string or vector specifying the
name of the column(s) in |
truth_col |
A character string identifying the column in
|
pval_col |
A character string identifying the column in
|
group_cols |
(Optional) A character string or vector specifying the column(s) to group rows by before evaluating metrics. This is useful for assessing within-group metrics. |
metrics |
A |
alphas |
Vector of significance levels at which to evaluate the various metrics. Default is 0.05. |
na_rm |
A |
summary_funs |
Character vector specifying how to summarize evaluation metrics. Must choose from a built-in library of summary functions - elements of the vector must be one of "mean", "median", "min", "max", "sd", "raw". |
custom_summary_funs |
Named list of custom functions to summarize results. Names in the list should correspond to the name of the summary function. Values in the list should be a function that takes in one argument, that being the values of the evaluated metrics. |
eval_id |
Character string. ID to be used as a suffix when naming result columns. Default is "testing_err". |
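As a rough illustration of the contract described for custom_summary_funs (a named list whose values are one-argument functions of the metric values), the base R sketch below applies such a list to a vector of made-up metric values. The names `metric_values`, `custom_funs`, `range`, and `q90` are hypothetical and chosen only for this example.

```r
# Hypothetical metric values from four replicates (made-up numbers).
metric_values <- c(0.50, 0.75, 1.00, 0.25)

# A named list in the shape described for custom_summary_funs: each name labels
# the summary and each function takes a single argument.
custom_funs <- list(
  range = function(x) max(x) - min(x),
  q90 = function(x) unname(stats::quantile(x, probs = 0.9))
)

# Apply every function to the same vector of metric values.
summaries <- vapply(custom_funs, function(f) f(metric_values), numeric(1))
print(summaries)
```

Each function returns a single number here, which keeps the result one value per summary name.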
The output of eval_testing_err() is a tibble with the following columns:
Replicate ID.
Name of DGP.
Name of Method.
Level of significance.
Name of the evaluation metric.
Value of the evaluation metric.
as well as any columns specified by group_cols and vary_params.
The output of summarize_testing_err() is a grouped tibble containing both identifying information and the evaluation results aggregated over experimental replicates. Specifically, the identifier columns include .dgp_name, .method_name, any columns specified by group_cols and vary_params, and .metric. In addition, there are results columns corresponding to the requested statistics in summary_funs and custom_summary_funs. These columns end in the suffix specified by eval_id.
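The aggregation described above can be sketched in base R as follows. This is only an illustration of the idea, not the package's code: the per-replicate values are made up, the `value` column name is hypothetical, and joining the statistic name to eval_id with an underscore is an assumption.

```r
# Hypothetical per-replicate results in long format (made-up values; the
# column name `value` is chosen for illustration only).
per_rep <- data.frame(
  .rep = c(1, 2, 1, 2),
  .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
  .method_name = "Method",
  .metric = "tpr",
  value = c(0.5, 1.0, 0.0, 0.5)
)
eval_id <- "testing_err"

# Collect the metric values across replicates for each DGP.
groups <- split(per_rep$value, per_rep$.dgp_name)

# One row per DGP: identifier columns followed by the summary statistics.
summ <- data.frame(
  .dgp_name = names(groups),
  .method_name = "Method",
  .metric = "tpr",
  mean = vapply(groups, mean, numeric(1)),
  sd = vapply(groups, sd, numeric(1)),
  row.names = NULL
)

# Attach the eval_id suffix to the results columns
# (the "_" separator is an assumption).
names(summ)[4:5] <- paste0(names(summ)[4:5], "_", eval_id)
print(summ)
```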
Other inference_funs: eval_reject_prob(), eval_testing_curve_funs, plot_reject_prob(), plot_testing_curve(), plot_testing_err()
# generate example fit_results data for an inference problem
fit_results <- tibble::tibble(
  .rep = rep(1:2, times = 2),
  .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
  .method_name = c("Method"),
  feature_info = lapply(
    1:4,
    FUN = function(i) {
      tibble::tibble(
        # feature names
        feature = c("featureA", "featureB", "featureC"),
        # true feature support
        true_support = c(TRUE, FALSE, TRUE),
        # estimated p-values
        pval = 10^(sample(-3:0, 3, replace = TRUE))
      )
    }
  )
)

# evaluate testing errors (using all default metrics and alpha = 0.05) for each replicate
eval_results <- eval_testing_err(
  fit_results,
  nested_cols = "feature_info",
  truth_col = "true_support",
  pval_col = "pval"
)
# summarize testing errors (using all default metrics and alpha = 0.05) across replicates
eval_results_summary <- summarize_testing_err(
  fit_results,
  nested_cols = "feature_info",
  truth_col = "true_support",
  pval_col = "pval"
)

# evaluate/summarize testing errors (at alpha = 0.05) using specific yardstick metrics
metrics <- yardstick::metric_set(yardstick::sens, yardstick::spec)
eval_results <- eval_testing_err(
  fit_results,
  nested_cols = "feature_info",
  truth_col = "true_support",
  pval_col = "pval",
  metrics = metrics
)
eval_results_summary <- summarize_testing_err(
  fit_results,
  nested_cols = "feature_info",
  truth_col = "true_support",
  pval_col = "pval",
  metrics = metrics
)

# can evaluate/summarize testing errors at multiple values of alpha
eval_results <- eval_testing_err(
  fit_results,
  nested_cols = "feature_info",
  truth_col = "true_support",
  pval_col = "pval",
  alphas = c(0.05, 0.1)
)
eval_results_summary <- summarize_testing_err(
  fit_results,
  nested_cols = "feature_info",
  truth_col = "true_support",
  pval_col = "pval",
  alphas = c(0.05, 0.1)
)

# summarize testing errors (at alpha = 0.05) using a custom summary function
range_fun <- function(x) return(max(x) - min(x))
eval_results_summary <- summarize_testing_err(
  fit_results,
  nested_cols = "feature_info",
  truth_col = "true_support",
  pval_col = "pval",
  custom_summary_funs = list(range_testing_err = range_fun)
)