README.md

README for football-data (fdata) repo

2021-01-26 20:11:14

Most recent dates - summary

# # A tibble: 4 x 3
#   datee      Time  Div  
#   <date>     <fct> <fct>
# 1 2021-01-25 20:15 P1   
# 2 2021-01-25 20:00 I2   
# 3 2021-01-25 20:00 SP1  
# 4 2021-01-25 20:00 SP2

Paged table

| datee | Time | Div | season_start | season | urll | HomeTeam | AwayTeam | FTHG | FTAG |
| ---------- | ----- | --- | --- | ---- | ----------------------------------------------------- | ---------- | ----------- | --- | --- |
| 2021-01-25 | 20:15 | P1 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Farense | Porto | 0 | 1 |
| 2021-01-25 | 20:00 | I2 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Brescia | Monza | 0 | 1 |
| 2021-01-25 | 20:00 | SP1 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Ath Bilbao | Getafe | 5 | 1 |
| 2021-01-25 | 20:00 | SP2 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Cartagena | Mirandes | 0 | 2 |
| 2021-01-25 | 19:45 | F2 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Sochaux | Toulouse | 0 | 1 |
| 2021-01-25 | 17:00 | P1 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Benfica | Nacional | 1 | 1 |
| 2021-01-25 | 16:30 | P1 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Rio Ave | Santa Clara | 1 | 2 |
| 2021-01-25 | 16:00 | T1 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Alanyaspor | Ankaragucu | 4 | 3 |
| 2021-01-25 | 16:00 | T1 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Fenerbahce | Kayserispor | 3 | 0 |
| 2021-01-25 | 14:30 | P1 | 20 | 2021 | https://www.football-data.co.uk/mmz4281/2021/data.zip | Belenenses | Tondela | 2 | 0 |

Table is in reverse chronological order.

Missing data

(sampled) Missing football-data data
(fdata)

Most recent dates - details

# Most recent match date: 2021-01-25
# Most recent match dates (desc) for the top 5 leagues:
# # A tibble: 5 x 2
#   Div   most_recent_match
#   <fct> <date>           
# 1 SP1   2021-01-25       
# 2 D1    2021-01-24       
# 3 F1    2021-01-24       
# 4 I1    2021-01-24       
# 5 E0    2021-01-23
# Most recent match dates (asc) for all leagues:
# # A tibble: 22 x 2
#    Div   datee     
#    <fct> <date>    
#  1 SC2   2021-01-02
#  2 SC3   2021-01-02
#  3 E0    2021-01-23
#  4 E2    2021-01-23
#  5 E3    2021-01-23
#  6 EC    2021-01-23
#  7 SC0   2021-01-23
#  8 SC1   2021-01-23
#  9 B1    2021-01-24
# 10 D1    2021-01-24
# # … with 12 more rows
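
The per-division summaries above amount to keeping the latest match date within each `Div`. A minimal base-R sketch of that step follows. The data frame `matches` and its dates are toy values made up for illustration (they are not read from the cache), and the sketch is not necessarily how the repo's pipeline computes its tables.

```r
# Toy input: one row per match. Dates are illustrative only.
matches <- data.frame(
  Div   = c("SP1", "SP1", "D1", "D1", "E0"),
  datee = as.Date(c("2021-01-25", "2021-01-20",
                    "2021-01-24", "2021-01-17",
                    "2021-01-23"))
)

# Sort by date, newest first, then keep the first row seen for each division.
ordered     <- matches[order(matches$datee, decreasing = TRUE), ]
most_recent <- ordered[!duplicated(ordered$Div), ]

# The result is already in descending date order; reverse it for the
# ascending view used in the "all leagues" table.
most_recent_asc <- most_recent[rev(seq_len(nrow(most_recent))), ]

print(most_recent)
```

Sorting first and then dropping duplicates keeps the `Date` class intact, which some grouping helpers do not guarantee.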

Objectives

The goal of this repo is

Details

File structure

The raw data files from football-data, stored in a local cache (as pins), are organized as follows:

# ./data/pins
# └── local
#     ├── 1415
#     ├── 1516
#     ├── 1617
#     ├── 1718
#     ├── 1819
#     ├── 1920
#     ├── 2021
#     ├── data.txt
#     └── data.txt.lock
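
Each season folder name joins two two-digit years, so `1415` is the 2014-15 season. The sketch below lists the season folders and the CSV files inside each one, using base R only. It assumes the layout shown above, with one CSV per league in each season folder (for example `B1.csv`). The helper `season_start_year` is a hypothetical name, not a function from this repo, and it assumes every season starts in the 2000s.

```r
# Hypothetical helper: map a season folder name such as "1415"
# to the calendar year in which that season starts (2014).
# Assumes every season starts in the 2000s.
season_start_year <- function(code) {
  2000L + as.integer(substr(code, 1, 2))
}

# Season folders under the local pins cache (path taken from the tree above).
cache_dir <- file.path("data", "pins", "local")
seasons   <- list.dirs(cache_dir, full.names = FALSE, recursive = FALSE)

# CSV files per season; returns an empty list if the cache is absent.
csvs_by_season <- lapply(
  setNames(seasons, seasons),
  function(s) list.files(file.path(cache_dir, s), pattern = "\\.csv$")
)

season_start_year(c("1415", "2021"))
```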

Files purpose

| File | Purpose |
| ------------- | ------------------------------------------------------------------------------------------ |
| run.sh | Shell script to run run.R in a persistent background process. Works on Unix-like systems. |
| run.R | R script to run tar_make() or tar_make_clustermq() (uncomment the function of your choice). |
| _targets.R | The special R script that declares the targets pipeline. See tar_script() for details. |
| R/functions.R | An R script with user-defined functions. |
| README.Rmd | An R Markdown report summarizing the results of the analysis. For more information on how to include R Markdown reports as reproducible components of the pipeline, see the tar_render() function from the tarchetypes package and the literate programming chapter of the manual. |

How to run

  1. Run the targets pipeline by running either run.R or run.sh (the latter works on Unix-like systems only).
  2. View the validation results in the output README.md file.
  3. Make changes to the R code, rerun the pipeline, and watch targets skip steps that are already up to date.
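
The skipping behaviour in step 3 can be seen with a toy pipeline. The sketch below writes a throwaway `_targets.R` into a temporary directory and runs it twice. The targets `x` and `y` and the directory name `targets_demo` are illustrative and are not part of this repo's pipeline. It assumes a reasonably recent release of the targets package is installed.

```r
library(targets)

# Work in a scratch directory so the repo's own _targets.R is untouched.
demo_dir <- file.path(tempdir(), "targets_demo")
dir.create(demo_dir, showWarnings = FALSE)
old_wd <- setwd(demo_dir)

# Toy pipeline (NOT the repo's): y depends on x.
writeLines(c(
  "library(targets)",
  "list(",
  "  tar_target(x, 1 + 1),",
  "  tar_target(y, x * 2)",
  ")"
), "_targets.R")

tar_make()                                  # first run builds x and y
outdated_after_first_run <- tar_outdated()  # names of targets still out of date
tar_make()                                  # second run skips both targets
y_value <- tar_read(y)                      # read the stored value of y

setwd(old_wd)
```

Editing the command of `x` in the toy `_targets.R` and calling `tar_make()` again would rebuild `x` and, if its value changed, `y` as well.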

Pipeline

How to access

Python

R

# pins/0.4.5/092094df6204c08a37248b1d5202a306/pins/views/data/index.html
# Registering
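library(pins)
library(reticulate)
# The pipeline below also needs %>% (magrittr), map_dfr() (purrr),
# and read_csv() / cols() (readr).
library(magrittr)
library(purrr)
library(readr)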
# get a _local_ set of csvs for one season (2014-15) and all leagues
pin_get("1415", board = "local") %>% str(max.level = 1)
#  chr [1:22] "/home/runner/.cache/pins/local/1415/B1.csv" ...

# get a _remote_ set of csvs for one season (2014-15) and all leagues
board_register("github", 
  repo = "JohnGavin/fdata", 
  branch = 'master', 
  # TODO: revert from GITHUB_TOKEN to GITHUB_PAT
  token = Sys.getenv(c('GITHUB_PAT', 'GITHUB_TOKEN')[2])
)
# get the 2014-15 season for all leagues.
pin_get("data/pins/local/1415", board = "github") %>% 
  map_dfr(read_csv, col_types = cols()) %>% 
  type.convert() %>% 
  head(c(5, 5))
# # A tibble: 5 x 5
#   Div   Date     HomeTeam         AwayTeam     FTHG
#   <fct> <fct>    <fct>            <fct>       <int>
# 1 B1    25/07/14 Standard         Charleroi       3
# 2 B1    26/07/14 Cercle Brugge    Gent            0
# 3 B1    26/07/14 Lierse           Oostende        2
# 4 B1    26/07/14 Waasland-Beveren Club Brugge     0
# 5 B1    26/07/14 Westerlo         Lokeren         1
# pin_info("data/pins/local/1415", board = "github")
# https://raw.githubusercontent.com/JohnGavin/fdata/master/data/pins/1415/data.txt
# pin_find("1415", board = "github")

# Sharing
# Once your collaborators gain access to the repo, they can follow the same steps to register the same GitHub board to allow them to upload and download pins with ease.

# Pinning
# pin(iris, description = "The iris data set", board = "github")
# pin(mtcars, description = "The motor trend cars data set", board = "github")

# Discovering
# pin_get("iris", board = "github")
pin_find("football", board = "github")
# # A tibble: 0 x 4
# # … with 4 variables: name <chr>, description <chr>, type <chr>, board <chr>
pin_find("odds", board = "github", extended = TRUE)
# # A tibble: 0 x 4
# # … with 4 variables: name <chr>, description <chr>, type <chr>, board <chr>
# pin_info("mtcars", board = "github")

# A GitHub repo only supports files under 25MB in size
#   (100MB in theory, but
#   there is additional overhead when using the GitHub API).
# To support larger files, pins makes use of GitHub release files:
#   pins will create a new GitHub release file for that particular pin.
#   The only noticeable change is new releases being created in your repo.

# board_register("rsconnect", server = "{{server_name}}")
# Retrieve Pin
# {{retrieve_pin}}
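
In the output above, `Date` is read as a factor in `dd/mm/yy` form, so it needs converting before any date arithmetic. A minimal sketch of that conversion, using two values copied from the printed tibble (the variable names `raw_date` and `match_date` are illustrative):

```r
# Values copied from the tibble printed above.
raw_date <- factor(c("25/07/14", "26/07/14"))

# Convert factor -> character -> Date.
# "%d/%m/%y" means day/month/two-digit year.
match_date <- as.Date(as.character(raw_date), format = "%d/%m/%y")

format(match_date)
```

Going through `as.character()` first avoids converting the factor's integer codes instead of its labels.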

Notes

References

R information

# [1] "Tue Jan 26 20:11:26 2021"

Session info

# R version 4.0.3 (2020-10-10)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 18.04.5 LTS
# 
# Matrix products: default
# BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
# LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
# 
# locale:
#  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
# [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
# 
# attached base packages:
# [1] stats     graphics  grDevices datasets  utils     methods   base     
# 
# other attached packages:
#  [1] magrittr_2.0.1     glue_1.4.2         pins_0.4.5         gt_0.2.2          
#  [5] lubridate_1.7.9.2  reticulate_1.18    details_0.2.1      visdat_0.5.3      
#  [9] fs_1.5.0           ggplot2_3.3.3      tidyr_1.1.2        purrr_0.3.4       
# [13] stringr_1.4.0      rmarkdown_2.6      tibble_3.0.5       readr_1.4.0       
# [17] dplyr_1.0.3        renv_0.12.5        tarchetypes_0.0.1  targets_0.0.2.9000
# 
# loaded via a namespace (and not attached):
#  [1] Rcpp_1.0.6        lattice_0.20-41   png_0.1-7         ps_1.5.0         
#  [5] assertthat_0.2.1  rprojroot_2.0.2   digest_0.6.27     utf8_1.1.4       
#  [9] R6_2.5.0          backports_1.2.1   evaluate_0.14     highr_0.8        
# [13] httr_1.4.2        pillar_1.4.7      rlang_0.4.10      curl_4.3         
# [17] data.table_1.13.6 callr_3.5.1       Matrix_1.3-2      desc_1.2.0       
# [21] labeling_0.4.2    igraph_1.2.6      munsell_0.5.0     compiler_4.0.3   
# [25] xfun_0.20         pkgconfig_2.0.3   clipr_0.7.1       htmltools_0.5.1  
# [29] tidyselect_1.1.0  codetools_0.2-18  fansi_0.4.2       crayon_1.3.4     
# [33] withr_2.4.0       rappdirs_0.3.1    grid_4.0.3        jsonlite_1.7.2   
# [37] gtable_0.3.0      lifecycle_0.2.0   scales_1.1.1      zip_2.1.1        
# [41] cli_2.2.0         stringi_1.5.3     farver_2.0.3      xml2_1.3.2       
# [45] ellipsis_0.3.1    filelock_1.0.2    generics_0.1.0    vctrs_0.3.6      
# [49] tools_4.0.3       hms_1.0.0         processx_3.4.5    yaml_2.2.1       
# [53] colorspace_2.0-0  knitr_1.30

TODOs

README parameters

Rmarkdown yaml script parameters

Passed in via command line.

| Rmarkdown parameters | value |
| ------------------- | ------------------------------------------------------------------------------ |
| read cached results | TRUE |
| trade offset days | 1d 0H 0M 0S |
| title | README for football-data (`fdata`) repo |
| date | 2021-01-26 20:11:14 |
| subtitle | {{ date \| date('MMMM Do') }} - {{ date \| date('add', 5, 'days') \| date('Do') }} |
| labels | Meeting 💬 |
| pins path | data/pins |

TODO: Remove params moved to drake plan.

Code metrics

Outdated

Validate

Glimpse

# tar_glimpse needs visnetwork package
# tar_glimpse() # (allow = starts_with('h'))
# alt cmd g how targets co-depend - relationships via static code analysis
# details(summary = 'tar_glimpse plots', imgur = FALSE)

Network

# tar_visnetwork() 
# %>% print() %>% details(summary = 'tar_glimpse plots', imgur = FALSE)

Minor metrics

Targets list

#  [1] "board_name"            "chk_data_recent"       "chk_datee_na"         
#  [4] "create_board"          "divs_dates_recent"     "fdata"                
#  [7] "fdata_change"          "fp_cache"              "gg_dat_miss"          
# [10] "pin_fdata"             "pins_path"             "raw_csv_list"         
# [13] "README"                "season_starts"         "top_divs_dates_recent"

Manifest

Meta

Relationships



JohnGavin/fdata documentation built on Jan. 29, 2021, 1:38 p.m.