set.seed(1) knitr::opts_chunk$set(echo = TRUE) library("hyper2")
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))
To cite the hyper2
package in publications, please use @hankin2017_rmd;
takes about five minutes without cache. Suppose there are five people
in a family. Each day for five days, one person cooks a meal. After
the fifth day, each person considers the four meals cooked by other
family members, and puts them in order of preference. This process is
repeated and here I consider how to interpret such data.
library(hyper2,quietly=TRUE)
cooking_table <- as.matrix(read.table("cooking_weekly.txt", header=TRUE)) head(cooking_table) tail(cooking_table)
Thus from the first row we find that one judge considered Lesley to be
fourth best, Alice third, Zac second and Annabel best (this must have
been Robin: no-one is allowed to vote for themselves and a zero means
no vote). This is the transpose of an order table such as
skating.txt
. In this context a zero contains no information (unlike
the formula 1 tables in which zeros signify did not finish). Document
curling.Rmd
contains a discussion and here we will use the same
method. First we will consider the support function for the first row only:
(day1 <- cooking_table[1,]) H_row1 <- rank_likelihood(day1[day1>0])
In the above, we calculate the Plackett-Luce likelihood function; observe that it is uninformative about competitor number 1, which I show in two ways. Firstly:
p <- c(0.3,0.2,0.1,0.3) # Now create p2 which is the same but competitor 1 is stronger: p2 <- p alpha <- 1.3 beta <- (1-alpha*p[1])/(1-p[1]) p2[1] <- p2[1]*alpha # person 1 becomes stronger... p2[-1] <- p2[-1]*beta # ...and everyone else becomes weaker loglik(indep(p),H_row1) loglik(indep(p2),H_row1)
It is easy to iterate over the rows of the dataset:
H <- hyper2() for(i in seq_len(nrow(cooking_table))){ # iterate over rows jj <- cooking_table[i,] H <- H + rank_likelihood(jj[jj>0]) } H
NOTE: converting cooking_table
to an order table is not correct,
for the zero is interpreted as "did not finish", but the correct
interpretation is "did not take part": the difference is that DNF
implies lower trength, but the likelihood function is uninformative
about DNTP players. Even so, we may use ordertable2supp()
:
tc <- t(cooking_table) tc[,1:20]
(compare file curling.Rmd
in which we iterate over columns not rows).
Standard techniques may be used:
equalp.test(H) mH <- maxp(H) mH pie(mH)
It seems that the boys tend to prefer the boys' cooking and the girls prefer the girls' cooking. But simple things first:
a <- cooking_table # saves typing a <- a[a[,4]>0,] # just lines where zac does not vote jj <- table(robin_votes=a[,1]==0,zac_last=a[,4]==4) jj
The table above shows that Robin cast a total of r sum(jj[2,])
votes
for Zac, and everyone else combined cast a total of r sum(jj[1,])
votes for Zac. Everyone else voted Zac last a total of r jj[1,2]
times and higher than last r jj[1,1]
times:
fisher.test(jj,alternative="less")
So it looks like Robin is less likely to vote Zac last than everyone else. Does he do likewise for me?
a <- cooking_table # saves typing a <- a[a[,1]>0,] # just lines where robin does not vote jj <- table(zac_votes=a[,4]==0,robin_last=a[,1]==4) jj fisher.test(jj,alternative="less")
No evidence for that.
We can do slightly better, and test a more general hypothesis,
specifically that the girls systematically vote other girls higher,
and boys systematically vote boys higher, than warranted. To test
this hypothesis, we posit two reified entities, girl
and boy
.
Entity girl
helps a girl when another girl is voting for her, and
entity boy
helps a boy when another boy is voting for him.
cooking_table <- as.matrix(read.table("cooking_weekly.txt", header=TRUE)) cooking_table <- cooking_table[rowSums(cooking_table==0)==1,] boys <-c("robin","zac") girls <- c("lesley","alice","annabel") cooks <- colnames(cooking_table) P <- hyper2() P0 <- hyper2() for(i in seq_len(nrow(cooking_table))){ # iterate over rows x <- cooking_table[i,] if(sum(x==0)>1){break} names(x) <- cooks if(names(x[x==0]) %in% boys){ # if the voter is a boy x <- x[x>0] # remove the voter Pjj <- pwa(rank_likelihood(x),names(x)[names(x) %in% boys], chameleon="boy") } else if (names(x[x==0]) %in% girls){ # voter a girl x <- x[x>0] Pjj <- pwa(rank_likelihood(x),names(x)[names(x) %in% girls], chameleon="girl") } else { stop() } P <- P + Pjj } mP <- maxp(P)
mP
pie(mP)
Now test whether the girls preferentially vote:
specificp.gt.test(P,"girl",0)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.