set.seed(0) knitr::opts_chunk$set(echo = TRUE) library("hyper2") library("magrittr") options("digits" = 5)

This document analyses strengths of the competitors in the Ladies'
Figure Skating at the 2002 Winter Olympics. In the package, dataframe
`skating_table`

is copied from Lock and Lock (see skating.Rd for more
details). It also creates file `data/skating.rda`

which contains R
objects `skating`

, `skating_table`

, and `skating_maxp`

.

set.seed(0) skating_table <- read.table("skating.txt",header=TRUE) skating_table

Object `skating_table`

is an order table. It is structured so that
each competitor is a row, and each judge is a column. Function
`rank_likelihood()`

considers each row to be a race [or a judge], so
we need to take a transpose. The following R idiom is plausible but
**incorrect**:

skating_incorrect <- rank_likelihood(t(skating_table)) # incorrect

The above is incorrect because we are applying a rank technique to an order table. Correct analysis follows. One good thing to do is to present the dataset as an order table;

```
ordertable_to_ranktable(skating_table)
```

In the above, the column names should be read "came first" (`c1`

),
"came second" (`c2`

) and so on to "came 23rd" (`c23`

). From the first
column, we see that Hughes came first 5 times [according to judges
1,5,7,8,9] and Slutskya came first four times [judges 2,3,4,6]. From
the last column we see that all judges except `J3`

awarded Luca last
place. File `man/rrank.Rd`

has a detailed discussion of the
differences between rank and order.

We can convert an order table to a support function:

head(ordertable2supp(skating_table))

Further, we might ask how many judges ranked each competitor first, second and so on:

oo <- sapply(seq_len(23),function(i){rowSums(skating_table==i)}) colnames(oo) <- rep(" ",23) # nicer print oo

In the above, each row is a competitor and each column corresponds to a ranking. Thus the first column corresponds to "first place" and shows that five judges ranked Hughes first and four judges ranked Slutskaya first. The second column corresponds to "second place" and shows that one judge gave Hughes second place, two judges placed Slutskaya second, five gave Kawn second, and one gave Cohen second. The first row corresponds to Hughes and we see that, of $5+1+1+2=9$ judges, 5 placed Hughes first, one placed her second, one placed her third, and two placed her fourth.

The formal ordering used in competition is, according to Lock and Lock, given by the median ordinal:

apply(skating_table,1,median)

but with ties broken using a complicated hierarchy, the first of which is the "size of the majority":

rowSums(sweep(skating_table,1,seq_len(23))>=0)

which would suggest that Slutskaya beats Kwan (5-4), Butyrskaya draws
with Robinson (7-7), and so on. The rows of `skating_table`

are in
the order given by this system.

Now some data visualization. First the MLE for the strengths:

skating <- ordertable2supp(skating_table) skating_maxp <- maxp(skating) m <- skating_maxp # for ease of typing

dotchart(m) dotchart(log(m))

Looking at the dotcharts, it seems that the medallists---Hughes (gold), Slutskya (silver), Kwan (bronze)---were considerably higher in strength than the rest of the field. Here I will test the hypothesis that the medallists were in fact the strongest three competitors. Technically you need to optimize over the union of the possibilities that one of the three medallists did not come in the top three; but this is hard. We will do something much easier but numerically equivalent: optimize over the union of outcomes where either Cohen or Suguri (who placed fourth and fifth respectively) had higher strength than any of the medallists.

jj <- matrix(c( -1, 0, 0, 1,0, # Hughes < Cohen 0,-1, 0, 1,0, # Slutskya < Cohen 0, 0,-1, 1,0, # Kwan < Cohen -1, 0, 0, 0,1, # Hughes < Suguri 0,-1, 0, 0,1, # Slutskya < Suguri 0, 0,-1, 0,1), # Kwan < Suguri byrow=TRUE,ncol=5) problem_constraints <- # fill with zeros for other competitors cbind(jj,matrix(0,nrow(jj),size(skating)-ncol(jj)-1)) small <- 1e-4 # need a sensible start value satisfying the constraints start <- c(rep(2*small,3),rep(3*small,2),rep(small,17)) out <- rep(0,nrow(problem_constraints)) fullout <- list() for(i in seq_len(nrow(problem_constraints))){ jj <- maxp(skating, startp=start, give=TRUE,fcm=problem_constraints[i,], fcv=0,n=1,SMALL=1e-5) fullout[[i]] <- jj out[i] <- jj$value } out

Now compare these values with the unconstrained maximum likelihoods:

mgv <- maxp(skating, give=TRUE, n=10)$value mgv - out mgv - max(out)

that is, a little over 1 unit of support, falling short of the two
units suggested by Edwards. Observe that the maximum likelihood among
the six alternative hypotheses is that of number 3, in which the
maximization was constrained to obey `Kwan < Cohen`

.

If, instead, we ask whether there is evidence that Suguri should not have been a medallist, we find

mgv - max(out[4:6])

We can be more specific with the likelihoods:

plot(out,ylab="log likelihood",axes=FALSE) axis(2) axis(side=1,at=1:6,srt=45,labels=c( "Hug<Coh", "Slu<Coh", "Kwa<Coh", "Hug<Sug", "Slu<Sug", "Kwa<Sug" ))

In the plot above, the vertical axis shows the support. The six
points on the x-axis correspond to the six rows of
`problem_constraints`

; names have been abbreviated to the first three
letters. Thus the first three points are maximum likelihoods for
`Hughes < Cohen`

, `Slutskya < Cohen`

, and `Kwan < Cohen`

respectively.

maximum likelihood estimated strengths furnish an ordering for the competitors. We can compare ranking by strengths with the official point-tallying method:

rL <- sort(skating_maxp,decreasing=TRUE) rL[] <- seq_along(rL) rO <- seq_len(nrow(skating_table)) names(rO) <- rownames(skating_table) ordertransplot(rO,rL,xlab="offical rank",ylab="likelihood rank",main="Ladies free skating, 2002 Winter Olympics")

In figure \@ref(fig:rLrO), there is close agreement between the two methods in the large, but differences in detail. For example, the official ordering is hughes first, then slutskaya second, then kwan third; likelihood winner is slutskya, followed by hughes and then kwan. We can gain some understanding of this result by looking at the raw judging results for the three medallists:

head(skating_table,2)

and then calculating a table of the results:

hughes <- unlist(skating_table[1,]) slutskaya <- unlist(skating_table[2,]) table(hughes) table(slutskaya)

Looking at the above, we see that Hughes has five judges who gave her
"1"s, while Slutskaya has only 4, which is why she was considered to
be the top. It seems that Hughes's two "4"s (compared with
Slutskaya's one) have cost her more likelihood-based strength than her
extra "1" gave her; note also that the two judges who placed Highes
fourth (viz `J2`

and `J4`

) placed Slutskaya first. Also:

table(hughes < slutskaya)

(note that equality is disallowed) showing that five judges preferred Hughes to Slutskaya and four preferred Slutsakaya to Hughes.

skating_table o <- ordertable_to_ranktable(skating_table) o

class(o) <- "matrix" o

Select just the top seven:

o <- o[,1:7] o

and now replace ranks with internal ranks (that is, rank among the top seven):

o <- t(apply(o,1,rank)) colnames(o) <- rownames(skating_table)[1:7] o

Now coece to a `ranktable`

:

class(o) <- "ranktable" o

Thence to an order table:

o <- ranktable_to_ordertable(o) o

Now create a likelihood function:

o <- ordertable2supp(o) o

and have some fun with it:

(o <- maxp(o)) pie(o)

Following lines create `skating.rda`

, residing in the `data/`

directory of the package.

save(skating_table,skating,skating_maxp,file="skating.rda")

**Any scripts or data that you put into this service are public.**

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.