best_of_family: Select Best Models by Performance Metrics

View source: R/best_of_family.R

best_of_family    R Documentation

Select Best Models by Performance Metrics

Description

Detects all performance metric columns in a data frame, and for each metric, identifies the best model based on whether a higher or lower value is preferred. The function returns a vector of unique model IDs corresponding to the best models across all detected metrics.

Usage

best_of_family(df)

Arguments

df

A data frame containing model performance results. It must include a column named "model_id" and one or more numeric columns for performance metrics.
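
As an illustration of the expected input, the sketch below builds a small data frame that meets these requirements and lists the columns that would count as metrics. The model names and metric values are made up for the example; only the "model_id" column name comes from the documentation.

```r
# Hypothetical leaderboard-style results; all values are invented.
results <- data.frame(
  model_id = c("gbm_1", "glm_1", "drf_1"),
  auc      = c(0.91, 0.88, 0.93),
  logloss  = c(0.30, 0.35, 0.32),
  stringsAsFactors = FALSE
)

# The required identifier column must be present.
has_id <- "model_id" %in% names(results)

# Numeric columns other than model_id are the candidate metric columns.
numeric_cols <- names(results)[vapply(results, is.numeric, logical(1))]
metric_cols  <- setdiff(numeric_cols, "model_id")

print(has_id)
print(metric_cols)
```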

Details

The function first detects numeric columns (other than "model_id") as performance metrics. It then uses a predefined mapping to determine the optimal direction for each metric: for example, higher values of auc and aucpr are better, while lower values of logloss, mean_per_class_error, rmse, and mse are preferred. For any metric not in the mapping, the function assumes that lower values indicate better performance.

For each metric, the function finds the row with the best value in the corresponding direction, using which.max() or which.min(), and extracts the model_id from that row. Because both functions return the first matching index, ties resolve to the earliest row. The final result is the unique set of model IDs that are best on at least one metric.
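
The selection logic described above can be sketched in a few lines of base R. This is a hypothetical re-implementation for illustration, not the package source: the function name best_of_family_sketch, the example data, and the exact form of the direction mapping are assumptions based only on the text in this section.

```r
# Illustrative sketch only; not the HMDA implementation.
best_of_family_sketch <- function(df) {
  stopifnot(is.data.frame(df), "model_id" %in% names(df))

  # Metrics where a higher value is better (as listed in Details);
  # every other numeric column is treated as "lower is better".
  higher_is_better <- c("auc", "aucpr")

  # Numeric columns other than model_id are treated as metrics.
  is_metric <- vapply(df, is.numeric, logical(1)) & names(df) != "model_id"
  metrics <- names(df)[is_metric]

  best_ids <- lapply(metrics, function(m) {
    pick <- if (m %in% higher_is_better) which.max else which.min
    idx <- pick(df[[m]])  # index of the first best row; NA values are skipped
    df$model_id[idx]
  })

  # One model can win several metrics, so drop duplicates.
  unique(unlist(best_ids))
}

# Made-up results for three hypothetical models.
results <- data.frame(
  model_id = c("gbm_1", "glm_1", "drf_1"),
  auc      = c(0.91, 0.88, 0.93),
  logloss  = c(0.30, 0.35, 0.32),
  rmse     = c(0.29, 0.33, 0.31),
  stringsAsFactors = FALSE
)

best <- best_of_family_sketch(results)
print(best)
```

In this example drf_1 has the highest auc, while gbm_1 has the lowest logloss and the lowest rmse, so the result contains two model IDs even though three metrics were evaluated.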

Value

An integer or character vector of unique model_id values, one for each model that is best on at least one performance metric. Because duplicates are removed, the vector can be shorter than the number of metrics.

Author(s)

E. F. Haghish


HMDA documentation built on April 4, 2025, 6:06 a.m.