NFLpredictions

This package provides functions that make it easy to analyze betting lines for NFL games.

Installation

You can install the development version of NFLpredictions from GitHub with:

# install.packages("devtools")
devtools::install_github("przybylee/NFLpredictions")

Example

Suppose we are considering placing a wager on a first-round playoff game between the Cardinals and the Rams.

First, we load the library and collect some data. We will scrape the scores of all games from the 2021 regular season from Pro Football Reference.

library(NFLpredictions)
G <- scrape_games(ssn = 2021, wk_start = 1, wk_stop = 18)
head(G)
tail(G)

Note that since the 2021 regular season had 18 weeks, we set wk_stop = 18. If the value of this parameter is larger than the number of weeks played, the function may produce an error. A nice feature of Pro Football Reference is that the playoff weeks continue the week numbering used in the regular season. For 2021, this means week 19 is the wild card round, week 20 is the divisional round, week 21 is the conference championships, and so on.

Next, we want to use linear regression to estimate the relative strength of each team. We can use a simple model, where $y_j$ is the margin of victory for the home team in game $j$. Then we have

$$ y_j = \tau_{h_j} - \tau_{a_j} + \eta + \varepsilon_j, $$

where $\varepsilon_j$ is a random error term, which we estimate with the residuals of the fitted model. The indices $h_j$ and $a_j$ indicate the home and away teams in game $j$, respectively, and $\tau_i$ is the strength of team $i$. The effect of home field advantage is represented by $\eta$. The matrix equation associated with this is

$$ X\pmb{\beta} = \mathbf{y}, $$

where

$$ \pmb{\beta} = (\eta,\tau_1,\ldots,\tau_{32})'. $$

The design matrix $X$ has a row for each game in the data set. Row $j$ has a 1 in the first column (for $\eta$), a 1 in the column for $\tau_{h_j}$, and a $-1$ in the column for $\tau_{a_j}$; all other entries are 0. Adding the same constant to every $\tau$ leaves $X\pmb{\beta}$ unchanged, so we cannot estimate $\pmb{\beta}$ uniquely, but we can estimate $\eta$ as well as the difference of any two $\tau$'s using ordinary least squares. The estimate of $\tau_{h_j} - \tau_{a_j} + \eta$ serves as a good predictor for the outcome of a game. Given a data frame of games resembling G, we can produce the appropriate $X$ and $\mathbf{y}$ using the function get_design().

data <- get_design(G)
names(data)
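
To make the structure of $X$ concrete, here is a minimal base-R sketch that builds a design matrix by hand for a three-team toy schedule and fits it by least squares. It does not use get_design(); the data frame `games` and its column names (`home`, `away`, `margin`) are made up for illustration and may not match what scrape_games() returns.

```r
# Toy schedule; column names and scores are illustrative assumptions.
games <- data.frame(
  home   = c("A", "B", "C", "A", "B", "C"),
  away   = c("B", "C", "A", "C", "A", "B"),
  margin = c(7, -3, 10, 4, 3, -6),   # home score minus away score
  stringsAsFactors = FALSE
)
teams <- sort(unique(c(games$home, games$away)))
n <- nrow(games)

# One row per game: 1 for eta, +1 for the home team, -1 for the away team.
X <- matrix(0, nrow = n, ncol = length(teams) + 1,
            dimnames = list(NULL, c("eta", teams)))
X[, "eta"] <- 1
X[cbind(seq_len(n), match(games$home, teams) + 1)] <- 1
X[cbind(seq_len(n), match(games$away, teams) + 1)] <- -1
y <- games$margin

# The team columns sum to zero in every row, so X is rank deficient.
# lm.fit() reports NA for the aliased column; setting it to 0 picks a baseline team.
fit <- lm.fit(X, y)
beta <- fit$coefficients
beta[is.na(beta)] <- 0

# Estimable quantities: home field advantage and differences of team effects.
eta_hat <- beta[["eta"]]
spread_A_hosting_B <- beta[["A"]] - beta[["B"]] + eta_hat
```

The individual entries of `beta` depend on which team ends up as the baseline, but `eta_hat` and differences such as `beta[["A"]] - beta[["B"]]` do not.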

The object data is a list that contains $X$ and $\mathbf{y}$, along with a list of the team names appearing in the original data set. Now that the data has been organized, we can analyze the game of interest. For starters, we use point_spread_ols() to estimate the Rams' margin of victory.

point_spread_ols(data, "Rams", "Cardinals", a = 0.05)
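
For reference, a confidence interval for an estimable quantity $\theta = \mathbf{c}'\pmb{\beta}$ in a rank-deficient linear model usually takes the textbook form below. Here $\mathbf{c}$ has a 1 in the position of $\eta$, a 1 in the position of the home team, and a $-1$ in the position of the away team. The symbols $\mathbf{c}$, $n$, $r$, and $\hat{\sigma}$ are introduced only for this sketch, and it is an assumption that point_spread_ols() computes its limits exactly this way.

$$ \mathbf{c}'\hat{\pmb{\beta}} \;\pm\; t_{n-r,\,1-a/2}\;\hat{\sigma}\sqrt{\mathbf{c}'(X'X)^{-}\mathbf{c}}, \qquad \hat{\sigma}^2 = \frac{\lVert \mathbf{y} - X\hat{\pmb{\beta}} \rVert^2}{n-r}, $$

where $n$ is the number of games, $r$ is the rank of $X$, $(X'X)^{-}$ is any generalized inverse of $X'X$, and $a$ is the significance level.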

Notice that since we set a = 0.05, we get the limits of a 95% confidence interval for $\tau_{\text{Rams}} - \tau_{\text{Cardinals}}+\eta$. This is based on the $t$ statistic when we assume $$ \varepsilon_j \sim N(0,\sigma^2). $$ We can also use this assumption to get win probabilities.

probs <- winprob_ols(data, "Rams", "Cardinals")
probs

If we wanted to look at the chances of a team beating the spread, we could use spreadprob_ols().
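
As a rough illustration of how the normality assumption turns a predicted margin into probabilities, the sketch below uses a plug-in normal approximation. The helper names and the values of `mu` and `sigma` are hypothetical, and winprob_ols() and spreadprob_ols() may differ, for example by accounting for the uncertainty in the estimates.

```r
# Plug-in approximation: home margin ~ N(mu, sigma^2).
# Helper names and the numbers below are illustrative, not package output.
home_win_prob <- function(mu, sigma) {
  pnorm(mu / sigma)                  # P(margin > 0)
}
home_cover_prob <- function(mu, sigma, hspread) {
  pnorm((mu + hspread) / sigma)      # P(margin + hspread > 0)
}

mu    <- 3     # hypothetical predicted home margin
sigma <- 13    # hypothetical residual standard deviation
p_win   <- home_win_prob(mu, sigma)
p_cover <- home_cover_prob(mu, sigma, hspread = -3.5)
```

With a home spread of -3.5, the home team has to win by more than 3.5 points to cover, so `p_cover` is smaller than `p_win`.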

It is in the interest of the sportsbook to set the betting lines so that each wager has negative expectation for the bettor, although the book also considers its risk based on where customers are placing their bets. We would like to consider the expected value of each of our wagers. In the long run, we could benefit from only placing wagers with positive expectation. To see the expected value of wagering on either of the moneylines, we use eML_ols().

eML_ols(data, "Rams", "Cardinals", wager = 40, hBL = -170, aBL = 150)
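
The arithmetic behind a moneyline expected value can be sketched as follows, using the standard American odds convention: a line of -170 means risking 170 to win 100, and a line of 150 means risking 100 to win 150. The helper names and the win probability `p_home` are hypothetical; this is not the source of eML_ols().

```r
# Profit on a winning wager at American odds `line` (e.g. -170 or 150).
ml_profit <- function(wager, line) {
  ifelse(line < 0, wager * 100 / abs(line), wager * line / 100)
}

# Expected value: win the profit with probability p, lose the stake otherwise.
ml_ev <- function(p, wager, line) {
  p * ml_profit(wager, line) - (1 - p) * wager
}

p_home  <- 0.6   # hypothetical home win probability
ev_home <- ml_ev(p_home,     wager = 40, line = -170)
ev_away <- ml_ev(1 - p_home, wager = 40, line = 150)
```

A wager is worth considering only when its expected value is positive, which requires the estimated win probability to exceed the break-even probability implied by the line.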

We can also consider the expected value of betting on either team to beat the spread.

eSpread_ols(data, "Rams", "Cardinals", hspread = -3.5, wager = 40)

If aspread, the away team's spread, is left NULL, it is assumed to be the opposite of hspread. The default values of hBL and aBL are -110, since that is usually the standard betting line for something with equal odds after vigorish is applied.
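
To see why a line of -110 carries vigorish, consider the break-even probability $p$ for a wager of size $W$, which wins $\tfrac{100}{110}W$ and otherwise loses $W$. This is a short worked calculation under the standard American odds convention, not output from the package:

$$ p\cdot\frac{100}{110}W - (1-p)\,W = 0 \quad\Longrightarrow\quad p = \frac{110}{210} \approx 0.524. $$

A bettor therefore needs to win about 52.4% of such wagers to break even, even though the two outcomes are priced as equally likely.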

After considering the probabilities, betting on the Cardinals moneyline seems to have the best payout relative to our estimated probability of the Cardinals winning. However, this is still the least likely wager to pay off.

Functions to come



przybylee/NFLpredictions documentation built on Feb. 9, 2025, 9:22 p.m.