set.seed(0) knitr::opts_chunk$set(echo = TRUE) library("hyper2") library("magrittr") library("igraph") options("digits" = 5)
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))
To cite the hyper2
package in publications, please use @hankin2017_rmd.
The skeleton is a
winter sliding sport. This short document discusses a dataset taken
from the 2018 Winter
Olympics
and analyses the order statistic for the competitors.
(skeleton_table <- read.table("skeleton2022.txt",header=TRUE))
Thus the top line shows that Grotheer came first in runs 1-3 and sixth in run 4. The first column shows the results for run 1: we see that Grotheer came first, Jungk came fifth, and so on. There is in fact no dominated set: Timmings [place 25] beats Enzie [place 19] (in run 4), Enzie [19] beats Crumpton [22] (run 1), Crumpton [17] beats Wyatt [18] (2), Wyatt [16] beats Maier [15] (1), Maier [11] beats Bagnis [12] (4), Bagnis [10] beasts Seunggi [11] (1), Seunggi [9] beats T. Dukurs [10] (3 and 4), Dukurs [7] beats Tretyakov [8] (2), Tretyakov [2] beats Wengang [3] (1), and Wengang beats everybody (4).
We can try to automate the above cycle as follows. The first step
is to convert skeleton_table
to a numeric matrix, which proves to be
a very awkward process. NB: data.matrix()
buggers things up:
head(data.matrix(skeleton_table))
Above, we see that Grotheer apparently came 17th in run 4, clearly some weird infelicity. OK, how to coerce our dataframe to a numeric matrix? Possible but extremely klunky:
st <- matrix(as.numeric(c(as.matrix(skeleton_table))),ncol=4) rownames(st) <- rownames(skeleton_table) colnames(st) <- colnames(skeleton_table) st st[is.na(st)] <- 999
Taking Maier as an example (a mid-range competitor) we want to identify competitors who Maier beats at least once:
bm1 <- st[,1] > st["Maier",1] # lost to Maier in run 1 bm2 <- st[,2] > st["Maier",2] # lost to Maier in run 2 bm3 <- st[,3] > st["Maier",3] # lost to Maier in run 3 bm4 <- st[,4] > st["Maier",4] # lsot to Maier in run 4 st[bm1 | bm2 | bm3 | bm4,]
Above we see a subset of rows of skeleton_table
corresponding to
competitors who lost to Maier.
rownames(st[bm1 | bm2 | bm3 | bm4,]) jj <- cbind(won="Maier",lost=rownames(st[bm1 | bm2 | bm3 | bm4,])) noquote(jj) plot(graph(c(t(jj))))
Above we see a directed graph, the arrow means "beats", so for example
Maier beats Timmings (2 o'clock). We can functionify the calculation
and define beats()
:
beats <- function(p){ # returns everyone whom competitor 'p' beats bm1 <- st[,1] > st[p,1] bm2 <- st[,2] > st[p,2] bm3 <- st[,3] > st[p,3] bm4 <- st[,4] > st[p,4] jj <- cbind(p,rownames(st[bm1 | bm2 | bm3 | bm4,])) return(c(t(jj))) } bb <- c() for(i in rownames(st)){bb <- c(bb,beats(i))} graph(bb) plot(graph(bb))
Above we see a directed graph. The aim was to detect dominated sets,
that is, a subset of competitors who are always beaten by everyone
outside the group. But although there is no such group in
skeleton_table
, this fact is not immmediately obvious from the
graph. We can however consider just the top and bottom few
competitors and see if that makes a difference.
(st2 <- st[-(4:19),]) # exclude a bunch of mid-range competitors
Above, it is pretty clear that there is a dominated set of everyone except the top 3.
bb <- c() for(i in rownames(st2)){bb <- c(bb,beats(i))} graph(bb) plot(graph(bb),layout = layout.fruchterman.reingold)
Above we see that the existence of the dominated set is not obvious without close scrutiny. Observing that Grotheer, Jungk, and Wengang dominate the other competitors, we can see that these three have arrows pointing to each other within the group, but noone outside the group has an arrow pointing to any member of the set. But this is not obvious from inspection.
We can present the result table in another way:
wikitable_to_ranktable(skeleton_table)
Above we see, in the first column, that Grotheer came first ("c1
")
in runs 1-3 and Wengang came first in run 4. From the first column we
see that in run 1, Grotheer came fist, Tretyakov came second, Wengang
came third, and so on.
skeleton <- ordertable2supp(skeleton_table) skeleton_maxp <- maxp(skeleton) skeleton_maxp pie(skeleton_maxp) dotchart(skeleton_maxp) dotchart(log(skeleton_maxp))
And we can test the null that the competitors all have the same Plackett-Luce strengths:
equalp.test(skeleton)
We see very strong evidence for different competitor strengths, at $p=2\times 10^{-33}$. How strong is the evidence that Grotheer is better than Jungk?
samep.test(skeleton,c("Grotheer","Jungk"))
Not very strong.
samep.test(skeleton,c("Grotheer","Jungk"))$p.value
p <- rownames(st) pvals <- seq_along(p) for(i in seq(from=2,to=length(p))){ pvals[i] <- samep.test(skeleton,c(p[1],p[i]))$p.value } names(pvals) <- p pvals
(NB the above chunk takes about an hour to run, even with n=1
).
plot(pvals,log='y',pch=16) for(i in seq_along(pvals)){text(i,pvals[i],names(pvals)[i],pos=4,col='gray')}
Following lines create skeleton.rda
, which is not part of the CRAN version of the package.
save(skeleton_table,skeleton,skeleton_maxp,file="skeleton.rda")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.