compare_foot: Compare Football Models using Various Metrics

View source: R/compare_foot.R

compare_foot R Documentation

Compare Football Models using Various Metrics

Description

Compares multiple football models, or directly provided probability matrices, on a test dataset using the specified metrics (accuracy, Brier score, ranked probability score, pseudo R^2, average coverage probability). Optionally computes a confusion matrix for each model. The function returns an object of class compare_foot_output.

Usage

compare_foot(
  source,
  test_data,
  metric = c("accuracy", "brier", "ACP", "pseudoR2", "RPS"),
  conf_matrix = FALSE
)

Arguments

source

A named list containing either:

  • Fitted model objects (of class stanFoot, CmdStanFit, or stanfit), each representing a football model.

  • Matrices where each matrix contains the estimated probabilities for "Home Win," "Draw," and "Away Win" in its columns.

test_data

A data frame containing the test dataset, with columns:

  • home_team: Home team's name (character string).

  • away_team: Away team's name (character string).

  • home_goals: Goals scored by the home team (integer >= 0).

  • away_goals: Goals scored by the away team (integer >= 0).

metric

A character vector specifying the metrics to use for comparison. Options are:

  • "accuracy": Computes the accuracy of each model.

  • "brier": Computes the Brier score of each model.

  • "RPS": Computes the ranked probability score (RPS) for each model.

  • "ACP": Computes the average coverage probability (ACP) for each model.

  • "pseudoR2": Computes the Pseudo R^2, defined as the geometric mean of the probabilities assigned to the actual results.

Default is c("accuracy", "brier", "ACP", "pseudoR2", "RPS"), which computes all available metrics.
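
As an illustration of the definitions above (not the package internals, whose exact conventions may differ), the per-match scores can be computed by hand for a three-outcome forecast:

```r
# Forecast probabilities for one match, ordered Home Win, Draw, Away Win,
# and the observed result encoded as an indicator vector (here: home win).
probs   <- c(home = 0.51, draw = 0.27, away = 0.22)
outcome <- c(home = 1, draw = 0, away = 0)

# Multiclass Brier score: sum of squared probability errors over the outcomes.
brier <- sum((probs - outcome)^2)

# Ranked probability score: squared errors of the cumulative probabilities,
# averaged over the K - 1 = 2 cumulative steps.
rps <- sum((cumsum(probs) - cumsum(outcome))[1:2]^2) / 2

# Pseudo R^2 contribution: the probability assigned to the actual result;
# across matches, the pseudo R^2 is the geometric mean of these values.
p_actual <- probs[which(outcome == 1)]
```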

conf_matrix

A logical value indicating whether to generate a confusion matrix comparing predicted outcomes against actual outcomes for each model or probability matrix. Default is FALSE.

Details

The function extracts predictions from each fitted model, or uses the provided probability matrices directly, and computes the chosen metrics on the test dataset. Confusion matrices can also be computed by setting conf_matrix = TRUE.
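
For the probability-matrix form of source, a quick sanity check (an illustrative sketch, not part of the package) is that each matrix has three columns, ordered Home Win, Draw, Away Win, whose rows are valid probability distributions:

```r
# Two matches' worth of outcome probabilities, columns ordered
# Home Win, Draw, Away Win.
matrix_prob <- matrix(c(0.51, 0.27, 0.22,
                        0.45, 0.25, 0.30),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(NULL, c("pW", "pD", "pL")))

# Each row must be a probability distribution over the three outcomes.
stopifnot(ncol(matrix_prob) == 3,
          all(matrix_prob >= 0),
          all(abs(rowSums(matrix_prob) - 1) < 1e-8))
```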

Value

An object of class compare_foot_output, which is a list containing:

  • metrics: A data frame containing the metric values for each model or probability matrix.

  • confusion_matrix: Confusion matrices for each model or probability matrix (present only when conf_matrix = TRUE).

Author(s)

Roberto Macrì Demartino roberto.macridemartino@deams.units.it

Examples

## Not run: 
if (instantiate::stan_cmdstan_exists()) {
  library(dplyr)

  data("italy")
  italy_2000 <- italy %>%
    dplyr::select(Season, home, visitor, hgoal, vgoal) %>%
    dplyr::filter(Season == "2000")

  colnames(italy_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals")

  # Example with fitted models
  fit_1 <- stan_foot(
    data = italy_2000,
    model = "double_pois", predict = 18
  ) # Double Poisson model
  fit_2 <- stan_foot(
    data = italy_2000,
    model = "biv_pois", predict = 18
  ) # Bivariate Poisson model

  italy_2000_test <- italy_2000[289:306, ]


  compare_results_models <- compare_foot(
    source = list(
      double_poisson = fit_1,
      bivariate_poisson = fit_2
    ),
    test_data = italy_2000_test,
    metric = c("accuracy", "brier", "ACP", "pseudoR2", "RPS"),
    conf_matrix = TRUE
  )

  print(compare_results_models)


  # Example with probability matrices

  home_team <- c(
    "AC Milan", "Inter", "Juventus", "AS Roma", "Napoli",
    "Lazio", "Atalanta", "Fiorentina", "Torino", "Sassuolo", "Udinese"
  )

  away_team <- c(
    "Juventus", "Napoli", "Inter", "Atalanta", "Lazio",
    "AC Milan", "Sassuolo", "Torino", "Fiorentina", "Udinese", "AS Roma"
  )

  # Home and Away goals based on given data
  home_goals <- c(2, 0, 2, 2, 3, 1, 4, 2, 1, 1, 2)
  away_goals <- c(1, 0, 1, 3, 2, 1, 1, 2, 1, 1, 2)

  # Combine into a data frame
  test_data <- data.frame(home_team, away_team, home_goals, away_goals)

  # Define the data for each column
  pW <- c(0.51, 0.45, 0.48, 0.53, 0.56, 0.39, 0.52, 0.55, 0.61, 0.37, 0.35)
  pD <- c(0.27, 0.25, 0.31, 0.18, 0.23, 0.30, 0.24, 0.26, 0.18, 0.19, 0.22)
  pL <- c(0.22, 0.30, 0.21, 0.29, 0.21, 0.31, 0.24, 0.19, 0.21, 0.44, 0.43)

  # Create the data frame table_prob
  table_prob <- data.frame(pW, pD, pL)
  matrix_prob <- as.matrix(table_prob)

  # Use compare_foot function
  compare_results_matrices <- compare_foot(
    source = list(matrix_1 = matrix_prob),
    test_data = test_data,
    metric = c("accuracy", "brier", "pseudoR2", "ACP", "RPS")
  )
  # Print the results
  print(compare_results_matrices)
}

## End(Not run)

LeoEgidi/footBayes documentation built on June 2, 2025, 11:32 a.m.