prep.data: Prep data on Selection Sunday

View source: R/prep.data.R

prep.data R Documentation

Prep data on Selection Sunday

Description

This function is intended to be a one-stop shop for updating the package data each year. It downloads the prediction file from 538, which includes all of the information needed to construct the bracket. The one step that still requires manual intervention is specifying which regions hold the Nos. 1, 2, 3 and 4 overall seeds (see region.rank). 538 provides ESPN team ID numbers, so those do not need to be mapped, but the team names used in the population distribution data do. The mapper currently runs fully automatically, but it is not guaranteed to be correct in future years, so check the mappings printed when verbose = TRUE. Functionality for a manual override may be added later.

Usage

prep.data(
  year,
  league = c("men", "women"),
  skip.game.results = FALSE,
  verbose = TRUE
)

Arguments

year

integer, the tournament year (for example, 2023)

league

"men" or "women"

skip.game.results

logical, should scraping game results be skipped? (for example, maybe you've already done this)

verbose

logical, should progress be printed to console? (recommended because this shows the team mappings that were automatically made, and they may be incorrect)

region.rank

a named list giving the overall rank (1 to 4) of the No. 1 seed in each region. Names must exactly match the region names in the 538 data file, so inspect that file before calling the function.

skip.population.distribution

logical, should scraping population distribution be skipped? (for example, maybe it's not available yet)
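
Because region.rank names must exactly match the region names in the 538 file, it can help to validate the argument before calling prep.data. The following is a minimal sketch, not part of the package: check.region.rank is a hypothetical helper, and the regions vector is assumed to have been read from the 538 file by hand.

```r
# Hypothetical helper (not part of mRchmadness): check that region.rank
# names match the region names seen in the 538 data, and that the ranks
# are exactly 1, 2, ..., number of regions.
check.region.rank = function(region.rank, regions) {
  regions = unique(regions)
  # Regions in the data that have no entry in region.rank
  unmatched = setdiff(regions, names(region.rank))
  # Names in region.rank that do not appear in the data
  unknown = setdiff(names(region.rank), regions)
  if (length(unmatched) > 0 || length(unknown) > 0) {
    stop(
      "region.rank names do not match the data. Missing: ",
      paste(unmatched, collapse = ", "),
      ". Unknown: ",
      paste(unknown, collapse = ", ")
    )
  }
  ranks = sort(unname(unlist(region.rank)))
  if (!identical(as.integer(ranks), seq_along(regions))) {
    stop("region.rank values must be the integers 1 through ", length(regions))
  }
  invisible(TRUE)
}

# Assumed region names, taken from the men's 2023 example in this page
regions.538 = c("South", "Midwest", "West", "East")
check.region.rank(
  region.rank = c("South" = 1, "Midwest" = 2, "West" = 3, "East" = 4),
  regions = regions.538
)
```

The helper accepts either a named list or a named vector, since the examples on this page pass a named vector.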

Value

a list of data to save into the package (don't forget to update documentation!)

Author(s)

saberpowers

Examples

# Women's tournament: region.rank names must match the 538 region names
data.women.2023 = prep.data(
  year = 2023,
  league = "women",
  region.rank = c("Greenville 1" = 1, "Greenville 2" = 2, "Seattle 3" = 3, "Seattle 4" = 4),
  skip.population.distribution = FALSE,
  skip.game.results = TRUE
)
# Pull each element out of the returned list under its package data name
bracket.women.2023 = data.women.2023$bracket
pred.538.women.2023 = data.women.2023$pred.538
pred.pop.women.2023 = data.women.2023$pred.pop
teams.women = data.women.2023$teams
# Save into the package's data directory
save(bracket.women.2023, file = "data/bracket.women.2023.RData")
save(pred.538.women.2023, file = "data/pred.538.women.2023.RData")
save(pred.pop.women.2023, file = "data/pred.pop.women.2023.RData")
save(teams.women, file = "data/teams.women.RData")

# Men's tournament: same steps with the men's region names
data.men.2023 = prep.data(
  year = 2023,
  league = "men",
  region.rank = c("South" = 1, "Midwest" = 2, "West" = 3, "East" = 4),
  skip.population.distribution = FALSE,
  skip.game.results = TRUE
)
# Pull each element out of the returned list under its package data name
bracket.men.2023 = data.men.2023$bracket
pred.538.men.2023 = data.men.2023$pred.538
pred.pop.men.2023 = data.men.2023$pred.pop
teams.men = data.men.2023$teams
# Save into the package's data directory
save(bracket.men.2023, file = "data/bracket.men.2023.RData")
save(pred.538.men.2023, file = "data/pred.538.men.2023.RData")
save(pred.pop.men.2023, file = "data/pred.pop.men.2023.RData")
save(teams.men, file = "data/teams.men.RData")


elishayer/mRchmadness documentation built on March 27, 2024, 2:11 p.m.