In jtfeld/EloOptimized: Optimized Elo Rating Method for Obtaining Dominance Ranks

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

A familiar example may be useful for understanding the different fitting options available in the EloOptimized package. We included in the package a data frame with game results from the 1995-1996 NBA regular season. We used NBA data because games can't end in a tie, teams play all other teams over the course of the season, and there's a manageable set of teams for visualization purposes. During the 1995-1996 season, the Chicago Bulls went 72-10 (until recently the best record in NBA history), had a winning record against all but one team (the Pacers, against whom they went 2-2), and went on to win the championship. Therefore we would expect a reasonable ranking method to assign them the highest rank score by the end of the season. First we'll use a traditional Elo ranking approach to assign Elo scores using eloratingfixed(), and then we'll show how maximum likelihood fitting changes ranking.

Traditional Elo scores with default parameter values

First, let's load the packages we'll need (we're using a couple tidyverse packages for post-fitting data manipulation and visualization):

library(EloOptimized)
library(dplyr)
library(ggplot2)

Next, using the nba dataset that's included with the EloOptimized package, let's calculate traditional Elo scores for each team. Note that you don't have to supply pres_data if all individuals (in this case teams) are present for the entire period of interest.

res = eloratingfixed(agon_data = nba)

Because we did not specify a k value, the function used the default value of k =100. We find that the prediction accuracy, the proportion of correctly predicted contests, is 0.636.

We can access the data frame of Elo scores using res$elo, and use ggplot2 to quickly visualize each team's trajectory.

head(res$elo)

res$elo %>%
  rename(Team = Individual) %>%
  ggplot(aes(x = Date, y = Elo, color = Team)) + 
  geom_line()

There's a lot of noise in that plot, but if you look closely you'll notice that Chicago doesn't end the season with the highest Elo score. There were 29 teams in the NBA in the 95-96 season, so we can check final Elo scores directly by sorting the the last 29 rows of the Elo scores data frame (we're removing a couple columns to make the output easier to read. See eloratingfixed() documentation for descriptions of each column):

tail(res$elo, 29) %>% arrange(-Elo) %>% select(-ExpNumBeaten, -JenksEloCardinal)

In fact, Indiana ends up with the highest Elo score on the final day. It turns out that they finished the season 8-1 and beat the Bulls in the second to last game of the season. This suggests that the default K parameter of 100 (which determines the degree to which Elo scores change after each contest) is not the best choice. However, it's not obvious what the K parameter should be set to. Luckily, the EloOptimized package implements a maximum likelihood approach for fitting the K parameter, so you don't have to make an arbitrary decision.

ML optimized K value

To fit the K parameter, we use eloratingopt():

res2 = eloratingopt(agon_data = nba)

This takes longer than calculating traditional Elo scores because we're optimizing the K parameter before generating Elo scores. Note that we're not optimizing K by trying to maximize the prediction accuracy, we're instead maximizing the likelihood of each contest result according to the sigmoid probability function defined in Foerster, Franz et al., 2016 [^1]. It turns out that the optimized value of K is closer to 20, and this value improves our prediction accuracy to 0.663. Let's take a look at our results:

res2$elo %>%
  rename(Team = Individual) %>%
  ggplot(aes(x = Date, y = Elo, color = Team)) + 
  geom_line()

Because eloratingopt() implements a burn-in period of 100 interactions, during which K is fixed at 100, Elo ratings at the start of the season remain volatile, followed by more stable Elo scores over the course of the season. It looks like Chicago's on top this time, but let's double check:

tail(res2$elo, 29) %>% arrange(-Elo) %>% select(-ExpNumBeaten, -JenksEloCardinal)

With K optimized, Chicago ends the season with the highest Elo score. The final Elo scores also end up corresponding more closely to the final standings, although they don't match exactly. For example, the Utah Jazz finished the season with the fifth best record but the eighth best Elo score, which makes sense given their 5-7 record over their last 12 games. But what about that initial period of volatility corresponding to the burn-in period? Can we do better? The EloOptimized package also allows you to fit individual initial Elo scores (in which case no burn-in period is implemented). Let's see what happens when we do that with the NBA data.

ML optimized K and initial Elo scores

To fit both the K parameter and initial Elo scores, we use eloratingopt() with fit_init_elo = TRUE:

res3 = eloratingopt(agon_data = nba, fit_init_elo = TRUE)

This takes quite a bit longer, because we're going from optimizing one parameter (K) to optimizing 30 parameters (K plus the 29 initial Elo scores). Surprisingly, the optimized value of K is 0 once you've fit initial Elo scores, meaning that Elo scores don't change at all in response to wins and losses. We might take this to mean that, on average, teams' competitive ability doesn't change over the course of the season. Note that fitting initial Elo scores with the included male chimpanzee data results in considerably different patterns, so K = 0 is not an inevitable result of fitting initial Elo scores. Note that this time around the prediction accuracy rises to 0.7. Let's take a look at our results:

res3$elo %>%
  rename(Team = Individual) %>%
  ggplot(aes(x = Date, y = Elo, color = Team)) + 
  geom_line()

Now Chicago's dominance is more strongly reflected in their final Elo score relative to others. It's easier to see this in the EloCardinal scores, which essentially represents the cumulative probability of winning against all other individuals (i.e. teams) scaled by the total number of other individuals/teams.

tail(res3$elo, 29) %>% arrange(-Elo) %>% select(-ExpNumBeaten, -JenksEloCardinal)

Because each team plays the same number of games, and schedules are fairly balanced, the Elo scores now reflect each team's overall record. These scores are not useful as dynamic records of team performance, but would be useful for calculating winning probabilities according to the sigmoid equation from Foerster, Franz et al., 2016. And in an animal system where individuals enter and leave the hierarchy through maturation and death, respectively, and where individuals tend to enter towards the bottom of the hierarchy, such a result would reflect a queuing system, as argued in the linked article.

[^1]: Currently the only option available in the eloratingopt() function is to use the sigmoid function from the linked paper, but you can use the the pnorm-based method from the EloRating package in the eloratingfixed() function (to make comparison with that package easier).

jtfeld/EloOptimized documentation built on May 15, 2023, 4:23 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com