knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
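This vignette describes comperes functionality for manipulating (summarising and transforming) competition results (hereafter, results): computing item summaries, computing and converting Head-to-Head values, and converting results to pair games.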
We will need the following packages:
library(comperes)
library(dplyr)
library(rlang)
Example results in long format:
cr_long <- tibble(
  game   = c("a1", "a1", "a1", "a2", "a2", "b1", "b1", "b2"),
  player = c(1, NA, NA, 1, 2, 2, 1, 2),
  score  = 1:8,
  season = c(rep("A", 5), rep("B", 3))
) %>%
  as_longcr()
Functions discussed in these topics leverage dplyr's grammar of data manipulation. Basic knowledge of it is enough to use them. Familiarity with rlang's quotation mechanism is also helpful.
An item summary is understood as a set of summary measurements (of arbitrary nature) of an item (one or more columns) present in the data. To compute them, comperes offers the summarise_*() family of functions, in which summary functions should be provided as in dplyr::summarise(). Basically, they are wrappers for a grouped summarise with forced ungrouping, conversion to tibble, and optional adding of a prefix to the summaries. Note that if one of the columns in the item is a factor with implicit NAs (present in the vector but not in the levels), there will be a warning suggesting to add NA to the levels. This is due to group_by() functionality in dplyr after version 0.8.0.
A couple of examples:
cr_long %>% summarise_player(mean_score = mean(score))

cr_long %>% summarise_game(min_score = min(score), max_score = max(score))

cr_long %>% summarise_item("season", sd_score = sd(score))
For convenient transformation of results there is the join_*_summary() family of functions, which compute the respective summaries and join them to the original data:
cr_long %>%
  join_item_summary("season", season_mean_score = mean(score)) %>%
  mutate(score = score - season_mean_score)
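To make the "compute a summary, then join it back" idea concrete, here is a minimal sketch in plain dplyr. It is an illustration under assumptions, not the comperes implementation: the data frame `results` and the intermediate `season_summary` are hypothetical names used only for this example.

```r
library(dplyr)

# Hypothetical toy results: two scores in season "A", one in season "B"
results <- tibble(
  season = c("A", "A", "B"),
  score  = c(1, 3, 10)
)

# Step 1: grouped summary, one row per season
season_summary <- results %>%
  group_by(season) %>%
  summarise(season_mean_score = mean(score))

# Step 2: join the summary back to the original rows and centre the scores
centered <- results %>%
  left_join(season_summary, by = "season") %>%
  mutate(score = score - season_mean_score)

centered
```

Every original row is kept, and each row gains the summary of the season it belongs to, which is what makes within-group transformations such as centring a one-liner.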
For common summary functions comperes has a list, summary_funs, of quoted expressions to be used with rlang's unquoting mechanism:
# Use .prefix to add prefix to summary columns
cr_long %>%
  join_player_summary(!!!summary_funs[1:2], .prefix = "player_") %>%
  join_item_summary("season", !!!summary_funs[1:2], .prefix = "season_")
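For readers new to rlang's unquoting, the sketch below shows the mechanism on its own, using a small hand-made list of quoted expressions instead of summary_funs. The names `my_funs` and `results` are hypothetical and exist only for this illustration.

```r
library(dplyr)

# A named list of quoted (unevaluated) summary expressions
my_funs <- rlang::exprs(
  min_score = min(score),
  max_score = max(score)
)

# Hypothetical toy results
results <- tibble(
  player = c(1, 1, 2),
  score  = c(3, 5, 4)
)

# `!!!` splices the list so that each element becomes a named argument,
# as if they had been typed directly inside summarise()
player_summary <- results %>%
  group_by(player) %>%
  summarise(!!!my_funs)

player_summary
```

Subsetting the list before splicing (for example `!!!my_funs[1]`) selects which summaries are computed, which is the same pattern as `!!!summary_funs[1:2]` above.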
A Head-to-Head value is a summary statistic of direct confrontation between two players. It is assumed that this value can be computed based only on the players' matchups, that is, the data on actual participation of an ordered pair of players in one game.
To compute matchups, comperes has get_matchups(), which returns a widecr object with all matchups actually present in results (including matchups of players with themselves). Note that missing values in the player column are treated as separate players. This allows operating with games where multiple players' identifiers are not known. However, when computing Head-to-Head values they are treated as a single player. Example:
get_matchups(cr_long)
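Conceptually, matchups are all ordered pairs of rows that share a game. The base R sketch below builds them with a self-merge on a hypothetical data frame `results`. It only illustrates the idea; it does not reproduce the exact output format or the missing-value handling of get_matchups().

```r
# Hypothetical toy results: two games, each played by players 1 and 2
results <- data.frame(
  game   = c("a", "a", "b", "b"),
  player = c(1, 2, 1, 2),
  score  = c(3, 1, 2, 5)
)

# Self-merge by game: every row is paired with every row of the same game,
# including itself. Suffixes "1" and "2" mark the first and second player.
matchups <- merge(results, results, by = "game", suffixes = c("1", "2"))

matchups
```

Each game with k players contributes k * k ordered pairs, so the two 2-player games here give 8 rows with columns game, player1, score1, player2 and score2.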
Head-to-Head values can be stored in two ways:

In long format: a tibble with columns player1 and player2, which identify an ordered pair of players, and columns corresponding to Head-to-Head values. Computation is done with h2h_long(), which returns an object of class h2h_long. Head-to-Head functions are specified as in dplyr's grammar for results matchups:

cr_long %>%
  h2h_long(
    abs_diff = mean(abs(score1 - score2)),
    num_wins = sum(score1 > score2)
  )

In matrix format: computation is done with h2h_mat(), which returns an object of class h2h_mat. Head-to-Head functions are specified as in h2h_long():

cr_long %>% h2h_mat(sum_score = sum(score1 + score2))
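As a rough picture of what a long-format Head-to-Head computation involves, the following sketch groups hand-built matchups by ordered pair of players and summarises each group. It uses plain dplyr and base R on a hypothetical `results` data frame and is an illustration under those assumptions, not the comperes implementation.

```r
library(dplyr)

# Hypothetical toy results: two games, each played by players 1 and 2
results <- data.frame(
  game   = c("a", "a", "b", "b"),
  player = c(1, 2, 1, 2),
  score  = c(3, 1, 2, 5)
)

# Matchups: all ordered pairs of rows within the same game
matchups <- merge(results, results, by = "game", suffixes = c("1", "2"))

# Head-to-Head values: one row per ordered pair (player1, player2)
h2h <- matchups %>%
  group_by(player1, player2) %>%
  summarise(
    abs_diff = mean(abs(score1 - score2)),
    num_wins = sum(score1 > score2),
    .groups = "drop"
  )

h2h
```

Because pairs are ordered, the row for (1, 2) counts wins of player 1 over player 2, while the row for (2, 1) counts wins in the opposite direction.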
comperes also offers a list, h2h_funs, of common Head-to-Head functions as quoted expressions to be used with rlang's unquoting mechanism:
cr_long %>% h2h_long(!!!h2h_funs)
To compute Head-to-Head values for only a subset of players, or to include values for players that are not in the results, use a factor player column. Notes:

Use the fill argument to replace NAs in certain columns after computing Head-to-Head values.

As with summarise_item(), there will be a warning in case of implicit NAs in factor columns.

cr_long_fac <- cr_long %>%
  mutate(player = factor(player, levels = c(1, 2, 3)))

cr_long_fac %>%
  h2h_long(abs_diff = mean(abs(score1 - score2)), fill = list(abs_diff = -100))

cr_long_fac %>%
  h2h_mat(mean(abs(score1 - score2)), fill = -100)
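The effect of factor levels plus fill can be pictured as "expand to all pairs of levels, then replace the gaps". The base R sketch below shows that idea on hypothetical objects (`h2h`, `players`, `all_pairs`, `h2h_full`); it is an assumption about the general shape of the operation, not a description of how comperes does it internally.

```r
# Hypothetical Head-to-Head values observed for players "1" and "2" only
h2h <- data.frame(
  player1  = c("1", "1", "2", "2"),
  player2  = c("1", "2", "1", "2"),
  abs_diff = c(0, 2.5, 2.5, 0),
  stringsAsFactors = FALSE
)

# Desired set of players (plays the role of the factor levels); "3" never played
players <- c("1", "2", "3")

# All ordered pairs of the desired players
all_pairs <- expand.grid(
  player1 = players,
  player2 = players,
  stringsAsFactors = FALSE
)

# Keep every pair; pairs without observed matchups get NA ...
h2h_full <- merge(all_pairs, h2h, by = c("player1", "player2"), all.x = TRUE)

# ... which is then replaced by the chosen fill value
h2h_full$abs_diff[is.na(h2h_full$abs_diff)] <- -100

h2h_full
```

Restricting `players` to fewer values than appear in the data would instead require filtering the observed pairs first, which corresponds to computing Head-to-Head values for a subset of players.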
To convert between long and matrix formats of Head-to-Head values, comperes has to_h2h_long() and to_h2h_mat(), which convert from matrix to long and from long to matrix respectively. Note that the output of to_h2h_long() has player1 and player2 columns as characters. Examples:
cr_long %>%
  h2h_mat(mean(score1)) %>%
  to_h2h_long()

cr_long %>%
  h2h_long(mean_score1 = mean(score1), mean_score2 = mean(score2)) %>%
  to_h2h_mat()
All this functionality is powered by the functions long_to_mat() and mat_to_long(), which are also useful outside of comperes. They convert general pair-value data between long and matrix formats:
pair_value_long <- tibble(
  key_1 = c(1, 1, 2),
  key_2 = c(2, 3, 3),
  val = 1:3
)

pair_value_mat <- pair_value_long %>%
  long_to_mat(row_key = "key_1", col_key = "key_2", value = "val")

pair_value_mat

pair_value_mat %>%
  mat_to_long(
    row_key = "key_1", col_key = "key_2", value = "val",
    drop = TRUE
  )
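For intuition about the long-to-matrix round trip, here is a base R sketch on the same kind of pair-value data. It is a simplified illustration, not the code behind long_to_mat() or mat_to_long(); the objects `pairs_long`, `mat`, and `long_again` are hypothetical names.

```r
# Pair-value data in long format
pairs_long <- data.frame(
  key_1 = c(1, 1, 2),
  key_2 = c(2, 3, 3),
  val   = 1:3
)

# Long -> matrix: rows are values of key_1, columns are values of key_2
row_names <- sort(unique(as.character(pairs_long$key_1)))
col_names <- sort(unique(as.character(pairs_long$key_2)))
mat <- matrix(
  NA_integer_,
  nrow = length(row_names), ncol = length(col_names),
  dimnames = list(row_names, col_names)
)
# A two-column character matrix indexes cells by (row name, column name)
mat[cbind(as.character(pairs_long$key_1), as.character(pairs_long$key_2))] <-
  pairs_long$val

# Matrix -> long: keep only cells that hold a value
idx <- which(!is.na(mat), arr.ind = TRUE)
long_again <- data.frame(
  key_1 = rownames(mat)[idx[, "row"]],
  key_2 = colnames(mat)[idx[, "col"]],
  val   = mat[idx]
)

mat
long_again
```

The keys come back as characters after the round trip because they were stored as dimnames, which mirrors the note above that to_h2h_long() returns player1 and player2 as characters.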
For some ranking algorithms it is crucial that games are played only between two players. comperes has the function to_pairgames() for this. It removes games with one player. Games with three or more players are split by to_pairgames() into separate games between unordered pairs of different players. Note that game identifiers are changed to integers, but the order of the initial games is preserved. Example:
to_pairgames(cr_long)
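The splitting rule for a single multi-player game can be sketched with combn(), which enumerates unordered pairs. This is a hedged illustration on a hypothetical three-player game (`game` and `pairgames` are made-up names); it does not claim to reproduce the exact output of to_pairgames().

```r
# Hypothetical game with three players
game <- data.frame(
  player = c("x", "y", "z"),
  score  = c(10, 7, 4)
)

# Each column of pair_idx holds the row indices of one unordered pair
pair_idx <- combn(nrow(game), 2)

# One new two-player game per pair, with fresh integer game identifiers
pairgames <- data.frame(
  game    = seq_len(ncol(pair_idx)),
  player1 = game$player[pair_idx[1, ]],
  score1  = game$score[pair_idx[1, ]],
  player2 = game$player[pair_idx[2, ]],
  score2  = game$score[pair_idx[2, ]]
)

pairgames
```

A game with k players yields choose(k, 2) pair games, so three players give three games, and a game with a single player yields none.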