knitr::opts_chunk$set(
  collapse = TRUE,
  message = FALSE, 
  warning = FALSE, 
  error = TRUE,
  comment = "#>"
)
library(HockeyModel)

Hockey Model Description

This is the first season for my hockey model, which currently lacks a cool name. The model is very simple: it uses the Dixon-Coles (DC) method for predicting the outcome of sports games. DC predictions were first described for soccer (football) and are commonly used in that domain.

The method uses each team's scoring and defensive strength (plus a home ice advantage term) to predict expected goals for the away and home teams, and then adjusts the results to increase the prevalence of low-scoring games. Using a Poisson distribution, the probabilities of each possible score for home and away can be summed to produce a win percentage. I then apply a factor to increase the number of tie games, as the model under-predicts ties since it doesn't understand the tendency to play for the loser point.
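For reference, the standard Dixon-Coles formulation from the original soccer paper can be written as below. The symbols (alpha for attack, beta for defense, gamma for home advantage, rho for the low-score dependence) follow that paper and are assumptions here; this package's own parameterisation may differ.

```latex
\begin{align}
X_{ij} &\sim \mathrm{Poisson}(\lambda), & \lambda &= \alpha_i \, \beta_j \, \gamma \\
Y_{ij} &\sim \mathrm{Poisson}(\mu),     & \mu     &= \alpha_j \, \beta_i \\
P(X_{ij} = x,\, Y_{ij} = y) &= \tau_{\lambda,\mu}(x, y)\,
  \frac{\lambda^{x} e^{-\lambda}}{x!}\,
  \frac{\mu^{y} e^{-\mu}}{y!}
\end{align}

\begin{equation}
\tau_{\lambda,\mu}(x, y) =
\begin{cases}
1 - \lambda \mu \rho & \text{if } x = 0,\ y = 0 \\
1 + \lambda \rho     & \text{if } x = 0,\ y = 1 \\
1 + \mu \rho         & \text{if } x = 1,\ y = 0 \\
1 - \rho             & \text{if } x = 1,\ y = 1 \\
1                    & \text{otherwise}
\end{cases}
\end{equation}
```

Here team i is at home and team j is away, so X counts home goals and Y counts away goals. Only the four lowest scorelines are reweighted by tau; every other score keeps its independent Poisson probability.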

Model Details

Much of the groundwork for this model was laid by OpisThokonta in his 2014 blog series on Dixon-Coles. For each team, attack and defense parameters are fitted to the goals scored and allowed against every other team in each game. Optimizing these attack and defense parameters produces values that pass the smell test:

knitr::include_graphics("https://github.com/pbulsink/HockeyModel/raw/master/prediction_results/graphics/current_rating.png")

In addition, a home ice advantage parameter and a low-scoring parameter are produced. More information on the code particulars can be found on my blog from a few years ago, which covers topics such as past game weights and low-scoring adjustments.

The attack, defense and home parameters are combined to produce a matrix of outcome probabilities, giving the odds that each specific score (e.g. 3-2 for the home team) will occur. The odds of a home team win, then, are the sum of all the odds where the home team has the higher score. Similar simple sums give the odds of an away win or a 'draw'.
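A minimal sketch of those sums, assuming independent Poisson scoring and made-up expected goals. The names `lambda_home`, `lambda_away` and `score_probs` are illustrative and not part of HockeyModel, and the low-scoring correction is left out for brevity.

```r
# Hypothetical expected goals for one game (illustrative values only)
lambda_home <- 3.1
lambda_away <- 2.7
max_goals <- 10  # truncate the score grid; higher scores are very unlikely here

# Rows = home goals 0..max_goals, columns = away goals 0..max_goals
score_probs <- outer(
  dpois(0:max_goals, lambda_home),
  dpois(0:max_goals, lambda_away)
)
score_probs <- score_probs / sum(score_probs)  # renormalise after truncation

# lower.tri() selects cells where row index > column index,
# i.e. where home goals exceed away goals
home_win <- sum(score_probs[lower.tri(score_probs)])
away_win <- sum(score_probs[upper.tri(score_probs)])
draw     <- sum(diag(score_probs))  # equal scores after regulation

c(home = home_win, draw = draw, away = away_win)
```

The three values sum to 1 because every cell of the matrix falls into exactly one of the three groups.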

Draws

The model inherently tends to under-predict draws. Improving this (by a method similar to the low-scoring adjustments) is planned but not under active development. In the meantime, draws are artificially boosted by a simple ratio adjustment that brings them towards the average draw probability over the last year. Of course, some games will still have higher or lower probabilities, but this adjustment moves the typical odds of a draw from ~0.18 to ~0.23.
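One way such a ratio adjustment could look is sketched below. The function `adjust_draw` and its `draw_ratio` argument are hypothetical, and the package's actual adjustment may differ in detail. The assumption here is that the draw probability is scaled by a fixed ratio and the home and away odds are shrunk proportionally so that the three still sum to 1.

```r
# Hypothetical helper: scale the draw odds by a fixed ratio, then rescale
# home and away odds so that all three probabilities still sum to 1.
adjust_draw <- function(home, draw, away, draw_ratio = 0.23 / 0.18) {
  new_draw <- min(draw * draw_ratio, 1)
  scale <- (1 - new_draw) / (home + away)
  c(home = home * scale, draw = new_draw, away = away * scale)
}

# Made-up raw model output for one game
adjusted <- adjust_draw(home = 0.47, draw = 0.18, away = 0.35)
adjusted
```

Because home and away are scaled by the same factor, their ratio to each other is unchanged by the adjustment.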

Draws are predicted without regard to the eventual winner. However, games don't end in a tie. The daily predictions that are posted instead show the odds that the home or away team will win in OT/SO. These odds are found by splitting the draw odds in proportion to the home and away win odds. For example, if a draw was predicted at 0.2 odds, the home win at 0.5 and the away win at 0.3, then the home OT/SO win odds would be 0.2 x (0.5/0.8) = 0.125, and the away OT/SO odds the remainder, 0.2 x (0.3/0.8) = 0.075.
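The split described above can be sketched as a small function. `split_draw` is a hypothetical name used only for illustration and is not a HockeyModel function.

```r
# Hypothetical helper: divide the draw odds between the two teams in
# proportion to their regulation win odds.
split_draw <- function(home, draw, away) {
  home_share <- home / (home + away)
  c(
    home_ot_so = draw * home_share,
    away_ot_so = draw * (1 - home_share)
  )
}

# The worked example from the text: draw 0.2, home 0.5, away 0.3
ot_so <- split_draw(home = 0.5, draw = 0.2, away = 0.3)
ot_so  # home_ot_so = 0.125, away_ot_so = 0.075
```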

Predictions

Each game is predicted using the model, producing a simple prediction output.

todayOdds(today = getSeasonStartDate())

Each game remaining in the season is fed into a Monte Carlo simulation run 100,000 times, with a result (home win, home OT, home SO, away SO, away OT, away win) drawn by a random number generator for each game in each simulated season. At the end of each simulated season, each team's total points, whether or not it made the playoffs, and the Presidents' Trophy winner are calculated. These are averaged over all of the simulations to produce the daily predictions, which are posted on Twitter.
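A toy version of that simulation loop is sketched below. The three-game schedule, the probabilities, and the helper `simulate_season` are all made up for illustration; the package's own simulation code is not shown here and also handles playoff qualification. Points follow the usual NHL rule of 2 for any win and 1 for an OT/SO loss.

```r
set.seed(42)

# Hypothetical remaining schedule (illustrative teams and values only)
schedule <- data.frame(
  home = c("TOR", "MTL", "OTT"),
  away = c("MTL", "OTT", "TOR"),
  stringsAsFactors = FALSE
)
outcomes <- c("home_win", "home_ot", "home_so", "away_so", "away_ot", "away_win")
# One row of outcome probabilities per game; each row sums to 1
probs <- matrix(
  c(0.42, 0.07, 0.05, 0.04, 0.06, 0.36,
    0.38, 0.06, 0.05, 0.05, 0.06, 0.40,
    0.35, 0.06, 0.05, 0.05, 0.07, 0.42),
  nrow = 3, byrow = TRUE, dimnames = list(NULL, outcomes)
)

# Points awarded to each side for each of the six outcomes
home_points <- c(2, 2, 2, 1, 1, 0)
away_points <- c(0, 1, 1, 2, 2, 2)

simulate_season <- function(schedule, probs) {
  teams <- sort(unique(c(schedule$home, schedule$away)))
  points <- setNames(numeric(length(teams)), teams)
  for (g in seq_len(nrow(schedule))) {
    # Draw one of the six outcomes using this game's probabilities
    result <- sample(length(outcomes), size = 1, prob = probs[g, ])
    points[schedule$home[g]] <- points[schedule$home[g]] + home_points[result]
    points[schedule$away[g]] <- points[schedule$away[g]] + away_points[result]
  }
  points
}

n_sims <- 1000  # kept small here; the text describes 1e5 runs
sims <- replicate(n_sims, simulate_season(schedule, probs))
mean_points <- rowMeans(sims)  # average simulated points per team
mean_points
```

In a full run, the same averaging would apply to indicator columns such as "made the playoffs", turning counts over simulations into probabilities.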

knitr::include_graphics("https://pbs.twimg.com/media/DwEfH3EV4AA6Ct0.jpg:large")

After the daily predictions are posted, the model moves through each game that day, posting the expected goals plots for the two teams. The odds of a team scoring a given number of goals are the sum across the corresponding row or column of the matrix generated by the prediction engine.
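A minimal sketch of those row and column sums, assuming a score matrix with home goals in rows and away goals in columns. The matrix is built from made-up expected goals with independent Poisson scoring; the variable names are illustrative and not from the package.

```r
# Hypothetical expected goals (illustrative values only)
lambda_home <- 3.1
lambda_away <- 2.7
goals <- 0:10

# Rows = home goals, columns = away goals
score_probs <- outer(dpois(goals, lambda_home), dpois(goals, lambda_away))
score_probs <- score_probs / sum(score_probs)  # renormalise after truncation
dimnames(score_probs) <- list(home = as.character(goals),
                              away = as.character(goals))

# Marginal distribution of goals for each team
home_goal_probs <- rowSums(score_probs)  # P(home scores 0, 1, 2, ...)
away_goal_probs <- colSums(score_probs)  # P(away scores 0, 1, 2, ...)

round(home_goal_probs, 3)
round(away_goal_probs, 3)
```

Each of these vectors is what an expected goals plot for one team would display on its vertical axis, with the goal count on the horizontal axis.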

knitr::include_graphics("https://pbs.twimg.com/media/DwZvRrOV4AA7b57.jpg:large")

Periodically, the Twitter bot also posts pace plots, showing how each team is doing relative to its original prediction.

knitr::include_graphics("https://pbs.twimg.com/media/Dv4SOCCUYAAW2Z9.jpg:large")

Performance

As of Jan 03, 2018, the model had a log loss of 0.6888 and an accuracy of 59.0%. These results are from 614 games (approximately half a season).
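For readers unfamiliar with these metrics, the sketch below shows one common way to compute them. The helpers `log_loss` and `accuracy` and the three example games are hypothetical; the text does not say exactly how the reported figures were calculated (for instance, whether draws are scored as a separate outcome).

```r
# pred_probs: matrix with one row per game and one column per outcome
# actual: column index of the outcome that actually happened
log_loss <- function(pred_probs, actual) {
  p <- pred_probs[cbind(seq_len(nrow(pred_probs)), actual)]
  -mean(log(pmax(p, 1e-15)))  # clip to avoid log(0)
}

# Share of games where the most likely predicted outcome occurred
accuracy <- function(pred_probs, actual) {
  mean(max.col(pred_probs, ties.method = "first") == actual)
}

# Made-up predictions for three games: columns are (home win, away win)
preds <- rbind(c(0.60, 0.40),
               c(0.45, 0.55),
               c(0.70, 0.30))
actual <- c(1, 2, 2)  # home won game 1, away won games 2 and 3

ll  <- log_loss(preds, actual)
acc <- accuracy(preds, actual)
c(log_loss = ll, accuracy = acc)
```

For a two-outcome prediction, a coin flip (0.5 for each side) gives a log loss of log(2), about 0.693, which is a useful baseline when reading a figure such as 0.6888.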

Code Availability

The code for all of this is available on GitHub.



pbulsink/HockeyModel documentation built on Dec. 16, 2024, 8:03 a.m.