This package analyzes Serie A soccer (Calcio) data. It builds an accessible R data frame with match results, team stats, Elo ratings, and overall standings. The data frame is used to generate visualizations in a Shiny app.

Source Data

The source data contains the results of all Serie A matches since the 2013/14 season. It is extracted using Ruby with the sportdb gem; running the extraction creates a local SQLite database, sport.db, which can then be read into R.

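    library(dplyr)  # provides %>%, select() and one_of()

    source_data <- dao$new()
    # Keep only the columns that are not entirely NA
    filter_na_cols <- function(df) df[, purrr::map_lgl(df, ~ !all(is.na(.x))), drop = FALSE]
    # Drop the bookkeeping timestamp columns
    filter_at_cols <- function(df) df %>% select(-one_of("created_at", "updated_at"))
    source_data$tables %>%
        purrr::map(filter_na_cols) %>%
        purrr::map(filter_at_cols)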

Processed Data

The source data is transformed from a set of relational tables into a single data frame, serie_a. It contains list columns of data frames that preserve the relationship of teams and matches to match_days (rounds) and seasons. Summary data and Elo ratings are also calculated (details below).
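As a rough illustration of the list-column layout, the sketch below nests a small invented table twice (by round, then by season) and flattens it again with two `unnest()` calls. The `matches` tibble and its values are made up for the example, and the way the package actually builds serie_a may differ.

```r
library(dplyr)
library(tidyr)

# Toy data, invented for illustration only
matches <- tibble::tibble(
  season    = c(rep("2015/16", 4), rep("2016/17", 2)),
  match_day = c(1L, 1L, 2L, 2L, 1L, 1L),
  p_team    = rep(c("Juventus", "Roma"), 3),
  p_score   = c(2L, 0L, 1L, 1L, 3L, 2L)
)

# First level: one row per season and match_day, team rows kept in `data`
by_round <- matches %>% nest(data = c(p_team, p_score))

# Second level: one row per season, rounds kept in the list column `results`
by_season <- by_round %>% nest(results = c(match_day, data))

# Two unnest() calls recover the flat table
flat <- by_season %>% unnest(results) %>% unnest(data)
```

Keeping one row per season makes it possible to store several differently shaped tables (teams, results, ratings, standings) side by side in the same data frame.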


Serie A seasons from 2013/14 to 2016/17.


The number of matches completed so far for each season.


The teams included in Serie A for each season. The line-up changes every season: the bottom three teams are relegated to Serie B and the top three teams from Serie B are promoted.

    serie_a %>% select(season, teams) %>% tidyr::unnest(teams) %>% glimpse()

For every season, match_day and team (p_team, for primary team), the results show the team's score (p_score), its opponent's score (o_score), whether it played at home (p_home), and how many points the p_team earned from the result.

    serie_a %>% select(season, results) %>% tidyr::unnest(results) %>% tidyr::unnest(data) %>% glimpse()
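One way to arrive at this per-team view is to list every match twice, once from each side, and then award 3 points for a win, 1 for a draw and 0 for a loss. The sketch below assumes a hypothetical one-row-per-match input with `home_team`, `away_team`, `home_score` and `away_score` columns; those names and the sample scores are invented, and the package may do this differently.

```r
library(dplyr)

# Hypothetical input: one row per match (column names are assumptions)
matches <- tibble::tibble(
  match_day  = c(1L, 1L),
  home_team  = c("Juventus", "Roma"),
  away_team  = c("Milan", "Napoli"),
  home_score = c(2L, 1L),
  away_score = c(0L, 1L)
)

# The match as seen by the home side
home_view <- matches %>%
  transmute(match_day, p_team = home_team, p_score = home_score,
            o_score = away_score, p_home = TRUE)

# The same match as seen by the away side
away_view <- matches %>%
  transmute(match_day, p_team = away_team, p_score = away_score,
            o_score = home_score, p_home = FALSE)

# Stack both views and award league points: win 3, draw 1, loss 0
results <- bind_rows(home_view, away_view) %>%
  mutate(points = case_when(
    p_score > o_score  ~ 3L,
    p_score == o_score ~ 1L,
    TRUE               ~ 0L
  ))
```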

For every season, match_day and team (p_team), the ratings show the team's Elo rating (r).

The Elo calculations are mostly based on an external site, using k = 20 and a season-reversion factor of 0.25.

    serie_a %>% select(season, ratings) %>% tidyr::unnest(ratings) %>% tidyr::unnest(data) %>% glimpse()
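For readers unfamiliar with Elo, the following is a minimal sketch of the textbook update rule with k = 20, plus one plausible reading of a 0.25 season reversion. The function names, the 400-point scale and the 1500 baseline are assumptions for illustration; the package's own calculation may include other adjustments (for example home advantage) that are not shown here.

```r
# Expected score of the primary team against its opponent (standard logistic curve)
elo_expected <- function(r_p, r_o) {
  1 / (1 + 10^((r_o - r_p) / 400))
}

# Rating after one match; `result` is 1 for a win, 0.5 for a draw, 0 for a loss
elo_update <- function(r_p, r_o, result, k = 20) {
  r_p + k * (result - elo_expected(r_p, r_o))
}

# Between seasons, move each rating a fraction of the way back to a baseline.
# The baseline of 1500 is an assumed value, not taken from the package.
elo_revert <- function(r, factor = 0.25, baseline = 1500) {
  r + factor * (baseline - r)
}

elo_update(1500, 1500, result = 1)  # equal ratings, win: 1500 + 20 * (1 - 0.5) = 1510
elo_revert(1600)                    # 1600 + 0.25 * (1500 - 1600) = 1575
```

With equal ratings the expected score is 0.5, so a win moves the rating by half of k; an upset against a stronger opponent moves it by more.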

For every season, match_day and team (p_team), the standings show the team's cumulative points, goals_for, goals_against and goal_diff, along with its position relative to the other teams.

    serie_a %>% select(season, standings) %>% tidyr::unnest(standings) %>% tidyr::unnest(data) %>% glimpse()
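Cumulative standings of this kind can be derived from per-team results with running sums. The sketch below uses a small invented `results` table for a single season and ranks teams by points only; real league tie-breakers are ignored, and this is not necessarily how the package computes position.

```r
library(dplyr)

# Toy per-team results for one season (values invented for illustration)
results <- tibble::tibble(
  match_day = c(1L, 1L, 2L, 2L),
  p_team    = c("Juventus", "Roma", "Juventus", "Roma"),
  p_score   = c(2L, 0L, 1L, 1L),
  o_score   = c(0L, 2L, 1L, 1L),
  points    = c(3L, 0L, 1L, 1L)
)

standings <- results %>%
  arrange(p_team, match_day) %>%
  group_by(p_team) %>%
  mutate(
    points        = cumsum(points),    # running points total per team
    goals_for     = cumsum(p_score),
    goals_against = cumsum(o_score),
    goal_diff     = goals_for - goals_against
  ) %>%
  group_by(match_day) %>%
  mutate(position = min_rank(desc(points))) %>%  # rank within each round, by points only
  ungroup()
```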

lromeo/CalcioR documentation built on May 21, 2019, 7:52 a.m.