set.seed(0)
knitr::opts_chunk$set(echo = TRUE)
library("hyper2")
library("magrittr")
library("igraph")
options("digits" = 5)
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))

To cite the hyper2 package in publications, please use @hankin2017_rmd. The skeleton is a winter sliding sport. This short document discusses a dataset taken from the 2018 Winter Olympics and analyses the order statistic for the competitors.

(skeleton_table <- read.table("skeleton2022.txt",header=TRUE))

Thus the top line shows that Grotheer came first in runs 1-3 and sixth in run 4. The first column shows the results for run 1: we see that Grotheer came first, Jungk came fifth, and so on. There is in fact no dominated set: Timmings [place 25] beats Enzie [place 19] (in run 4), Enzie [19] beats Crumpton [22] (run 1), Crumpton [17] beats Wyatt [18] (2), Wyatt [16] beats Maier [15] (1), Maier [11] beats Bagnis [12] (4), Bagnis [10] beasts Seunggi [11] (1), Seunggi [9] beats T. Dukurs [10] (3 and 4), Dukurs [7] beats Tretyakov [8] (2), Tretyakov [2] beats Wengang [3] (1), and Wengang beats everybody (4).

We can try to automate the above cycle as follows. The first step is to convert skeleton_table to a numeric matrix, which proves to be a very awkward process. NB: data.matrix() buggers things up:

head(data.matrix(skeleton_table))

Above, we see that Grotheer apparently came 17th in run 4, clearly some weird infelicity. OK, how to coerce our dataframe to a numeric matrix? Possible but extremely klunky:

st <- matrix(as.numeric(c(as.matrix(skeleton_table))),ncol=4)
rownames(st) <- rownames(skeleton_table)
colnames(st) <- colnames(skeleton_table)
st
st[is.na(st)] <- 999

Taking Maier as an example (a mid-range competitor) we want to identify competitors who Maier beats at least once:

bm1 <- st[,1] > st["Maier",1]  # lost to Maier in run 1
bm2 <- st[,2] > st["Maier",2]  # lost to Maier in run 2
bm3 <- st[,3] > st["Maier",3]  # lost to Maier in run 3
bm4 <- st[,4] > st["Maier",4]  # lsot to Maier in run 4
st[bm1 | bm2 | bm3 | bm4,]

Above we see a subset of rows of skeleton_table corresponding to competitors who lost to Maier.

rownames(st[bm1 | bm2 | bm3 | bm4,])
jj <- cbind(won="Maier",lost=rownames(st[bm1 | bm2 | bm3 | bm4,]))
noquote(jj)
plot(graph(c(t(jj))))

Above we see a directed graph, the arrow means "beats", so for example Maier beats Timmings (2 o'clock). We can functionify the calculation and define beats():

beats <- function(p){   # returns everyone whom competitor 'p' beats
  bm1 <- st[,1] > st[p,1]  
  bm2 <- st[,2] > st[p,2]  
  bm3 <- st[,3] > st[p,3] 
  bm4 <- st[,4] > st[p,4] 
  jj <- cbind(p,rownames(st[bm1 | bm2 | bm3 | bm4,]))
  return(c(t(jj)))
}
bb <- c()
for(i in rownames(st)){bb <- c(bb,beats(i))}
graph(bb)
plot(graph(bb))

Above we see a directed graph. The aim was to detect dominated sets, that is, a subset of competitors who are always beaten by everyone outside the group. But although there is no such group in skeleton_table, this fact is not immmediately obvious from the graph. We can however consider just the top and bottom few competitors and see if that makes a difference.

(st2 <- st[-(4:19),])  # exclude a bunch of mid-range competitors

Above, it is pretty clear that there is a dominated set of everyone except the top 3.

bb <- c()
for(i in rownames(st2)){bb <- c(bb,beats(i))}
graph(bb)
plot(graph(bb),layout = layout.fruchterman.reingold)

Above we see that the existence of the dominated set is not obvious without close scrutiny. Observing that Grotheer, Jungk, and Wengang dominate the other competitors, we can see that these three have arrows pointing to each other within the group, but noone outside the group has an arrow pointing to any member of the set. But this is not obvious from inspection.

Rank table analysis {-}

We can present the result table in another way:

wikitable_to_ranktable(skeleton_table)

Above we see, in the first column, that Grotheer came first ("c1") in runs 1-3 and Wengang came first in run 4. From the first column we see that in run 1, Grotheer came fist, Tretyakov came second, Wengang came third, and so on.

skeleton <- ordertable2supp(skeleton_table)
skeleton_maxp <- maxp(skeleton)
skeleton_maxp
pie(skeleton_maxp)
dotchart(skeleton_maxp)
dotchart(log(skeleton_maxp))

And we can test the null that the competitors all have the same Plackett-Luce strengths:

equalp.test(skeleton)

We see very strong evidence for different competitor strengths, at $p=2\times 10^{-33}$. How strong is the evidence that Grotheer is better than Jungk?

samep.test(skeleton,c("Grotheer","Jungk"))

Not very strong.

samep.test(skeleton,c("Grotheer","Jungk"))$p.value
p <- rownames(st)
pvals <- seq_along(p)
for(i in seq(from=2,to=length(p))){ pvals[i] <- samep.test(skeleton,c(p[1],p[i]))$p.value }
names(pvals) <- p
pvals

(NB the above chunk takes about an hour to run, even with n=1).

plot(pvals,log='y',pch=16)
for(i in seq_along(pvals)){text(i,pvals[i],names(pvals)[i],pos=4,col='gray')}

Package dataset {-}

Following lines create skeleton.rda, which is not part of the CRAN version of the package.

save(skeleton_table,skeleton,skeleton_maxp,file="skeleton.rda")

References {-}



RobinHankin/hyper2 documentation built on May 6, 2024, 4:25 p.m.