In eheinzen/elo: Ranking Teams by Elo Rating and Comparable Methods

library(elo)

Comparison Models

Now that we've explored the Elo setup, we turn our attention to other methodologies implemented in the elo package.

Win/Loss Logistic Regression

The first model computes teams' win percentages, and feeds the differences of percentages into a regression. Including an adjustment using adjust() in the formula also includes that in the model. You could also adjust the intercept for games played on neutral fields by using the neutral() function.

e.winpct <- elo.winpct(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + group(week), data = tournament,
                       subset = points.Home != points.Visitor) # to get rid of ties for now
summary(e.winpct)
rank.teams(e.winpct)
predict(e.winpct, newdata = data.frame(team.Home = "Athletic Armadillos", team.Visitor = "Blundering Baboons", stringsAsFactors = FALSE))

tournament$neutral <- replace(rep(0, nrow(tournament)), 30:35, 1)
summary(elo.winpct(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + neutral(neutral) + group(week),
                   data = tournament, subset = points.Home != points.Visitor))

The models can be built "running", where predictions for the next group of games are made based on past data. Consider using the skip= argument to skip the first few groups (otherwise the model might have trouble converging).

Note that predictions from this object use a model fit on all the data.

e.winpct <- elo.winpct(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + group(week), data = tournament,
                       subset = points.Home != points.Visitor, running = TRUE, skip = 5)
summary(e.winpct)
predict(e.winpct, newdata = data.frame(team.Home = "Athletic Armadillos", team.Visitor = "Blundering Baboons", stringsAsFactors = FALSE)) # the same thing

Logistic Regression

It's also possible to compare teams' skills using logistic regression. This is essentially the Bradley-Terry model. A matrix of dummy variables is constructed, one for each team, where a value of 1 indicates a home team and -1 indicates a visiting team. The intercept then indicates a home-field advantage. To denote games played in a neutral setting (that is, without home-field advantage), use the neutral() function. In short, the intercept will then be set to 1 - neutral(). Including an adjustment using adjust() in the formula also includes that in the model.

results <- elo.glm(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + group(week), data = tournament,
                   subset = points.Home != points.Visitor) # to get rid of ties for now
summary(results)
rank.teams(results)
predict(results, newdata = data.frame(team.Home = "Athletic Armadillos", team.Visitor = "Blundering Baboons", stringsAsFactors = FALSE))

summary(elo.glm(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + neutral(neutral) + group(week),
                data = tournament, subset = points.Home != points.Visitor))

Note that predictions from this object use a model fit on all the data.

results <- elo.glm(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + group(week), data = tournament,
                   subset = points.Home != points.Visitor, running = TRUE, skip = 5)
summary(results)
predict(results, newdata = data.frame(team.Home = "Athletic Armadillos", team.Visitor = "Blundering Baboons", stringsAsFactors = FALSE)) # the same thing

Markov Chain

It's also possible to compare teams' skills using a Markov-chain-based model, as outlined in Kvam and Sokol (2006). In short, imagine a judge who randomly picks one of two teams in a matchup, where the winner gets chosen with probability p (here, for convenience, 'k') and the loser with probability 1-p (1-k). In other words, we assume that the probability that the winning team is better than the losing team given that it won is k, and the probability that the losing team is better than the winning team given that it lost is (1-k). This forms a transition matrix, whose stationary distribution gives a ranking of teams. The differences in ranking are then fed into a logistic regression model to predict win status. Any adjustments made using adjust() are also included in this logistic regression. You could also adjust the intercept for games played on neutral fields by using the neutral() function.

mc <- elo.markovchain(score(points.Home, points.Visitor) ~ team.Home + team.Visitor, data = tournament,
                      subset = points.Home != points.Visitor, k = 0.7)
summary(mc)
rank.teams(mc)
predict(mc, newdata = data.frame(team.Home = "Athletic Armadillos", team.Visitor = "Blundering Baboons", stringsAsFactors = FALSE))
summary(elo.markovchain(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + neutral(neutral),
                        data = tournament, subset = points.Home != points.Visitor, k = 0.7))

These models can also be built "running", where predictions for the next group of games are made based on past data. Consider using the skip= argument to skip the first few groups (otherwise the model might have trouble converging).

Note that predictions from this object use a model fit on all the data.

mc <- elo.markovchain(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + group(week), data = tournament,
                      subset = points.Home != points.Visitor, k = 0.7, running = TRUE, skip = 5)
summary(mc)
predict(mc, newdata = data.frame(team.Home = "Athletic Armadillos", team.Visitor = "Blundering Baboons", stringsAsFactors = FALSE)) # the same thing

A note about LRMC

Note that by assigning probabilities in the right way, this function emits the Logistic Regression Markov Chain model (LRMC). Use the in-formula function k() for this. IMPORTANT: note that k() denotes the probability assigned to the winning team, not the home team (for instance). If rH(x) denotes the probability that the home team is better given that they scored x points more than the visiting team (allowing for x to be negative), then an LRMC model might look something like this:

elo.markovchain(floor(wins.home) ~ team.home + team.visitor + k(ifelse(x > 0, rH(x), 1 - rH(x))))

Why do we use floor() here? This takes care of the odd case where teams tie. In this case, rH(x) < 0.5 because we expected the home team to win by virtue of being home. By default, elo.markovchain() will split any ties down the middle (i.e., 0.5 and 0.5 instead of p and 1-p), which isn't what we want; we want the visiting team to get a larger share than the home team. Telling elo.markovchain() that the visiting team "won" gives the visiting team its whole share of p.

Alternatively, if h denotes a home-field advantage (in terms of score), the model becomes:

elo.markovchain(ifelse(home.points - visitor.points > h, 1, 0) ~ team.home + team.visitor + k(pmax(rH(x), 1 - rH(x))))

In this case, the home team "won" if it scored more than h points more than the visiting team. Since rH(x) > 0.5 if x > h, then pmax() will assign the proper probability to the pseudo-winning team.

Finally, do note that using neutral() isn't sufficient for adjusting for games played on neutral ground, because the adjustment is only taken into account in the logistic regression to produce probabilities, not the building of the transition matrix. Therefore, you'll want to also account for neutral wins/losses in k() as well.

Colley Matrix Method

It's also possible to compare teams' skills using the Colley Matrix method, as outlined in Colley (2002). The coefficients to the Colley matrix formulation gives a ranking of teams. The differences in ranking are then fed into a logistic regession model to predict win status. Here 'k' denotes how convincing a win is; it represents the fraction of the win assigned to the winning team and the fraction of the loss assigned to the losing team. Setting 'k' = 1 emits the bias-free method presented by Colley. Any adjustments made using adjust() are also included in this logistic regression. You could also adjust the intercept for games played on neutral fields by using the neutral() function.

co <- elo.colley(score(points.Home, points.Visitor) ~ team.Home + team.Visitor, data = tournament,
                 subset = points.Home != points.Visitor)
summary(co)
rank.teams(co)
predict(co, newdata = data.frame(team.Home = "Athletic Armadillos", team.Visitor = "Blundering Baboons", stringsAsFactors = FALSE))
summary(elo.colley(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + neutral(neutral),
                   data = tournament, subset = points.Home != points.Visitor))

Note that predictions from this object use a model fit on all the data.

co <- elo.colley(score(points.Home, points.Visitor) ~ team.Home + team.Visitor + group(week), data = tournament,
                      subset = points.Home != points.Visitor, running = TRUE, skip = 5)
summary(co)
predict(co, newdata = data.frame(team.Home = "Athletic Armadillos", team.Visitor = "Blundering Baboons", stringsAsFactors = FALSE)) # the same thing

Modeling Margin of Victory Instead of Wins

elo.glm(), elo.markovchain(), and elo.winpct() all allow for modeling of margins of victory instead of simple win/loss using the mov() function. Note that one must set the family="gaussian" argument to get linear regression instead of logistic regression.

summary(elo.glm(mov(points.Home, points.Visitor) ~ team.Home + team.Visitor, data = tournament,
                family = "gaussian"))

eheinzen/elo documentation built on Oct. 11, 2023, 12:19 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

eheinzen/elo
Ranking Teams by Elo Rating and Comparable Methods

In eheinzen/elo: Ranking Teams by Elo Rating and Comparable Methods

Comparison Models

Win/Loss Logistic Regression

Logistic Regression

Markov Chain

A note about LRMC

Colley Matrix Method

Modeling Margin of Victory Instead of Wins

R Package Documentation

Browse R Packages

We want your feedback!

eheinzen/elo Ranking Teams by Elo Rating and Comparable Methods

In eheinzen/elo: Ranking Teams by Elo Rating and Comparable Methods

Comparison Models

Win/Loss Logistic Regression

Logistic Regression

Markov Chain

A note about LRMC

Colley Matrix Method

Modeling Margin of Victory Instead of Wins

R Package Documentation

Browse R Packages

We want your feedback!

eheinzen/elo
Ranking Teams by Elo Rating and Comparable Methods