knitr::opts_chunk$set(echo = TRUE) library("hyper2") library("magrittr") abbreviated <- TRUE # change to FALSE for full names
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))
To cite the hyper2
package in publications, please use @hankin2017_rmd;
takes about an hour to run without cache. Here I use the hyper2
package to quantify the strength of a circular rock-paper-scissors
effect using the hyper2
package, improving on previous results
[@west2008] which merely established its existence. I use two
datasets, a three-player scoreline and a synthetic dataset referring
to wear on a seven-tooth gear wheel.
Consider the following dataset:
chess_table <- as.matrix(read.table("chess.txt")) dimnames(chess_table) <- list( match=c("T_vs_A","A_vs_K","T_vs_K"), player=c("Topalov","Anand","Karpov")) chess_table
This is documented in chess.Rd
and shows that Topalov played Anand
$22+13=35$ times, winning 22 and losing 13 matches. The corresponding
likelihood function is chess
:
chess <- saffy(chess_table) chess chess_maxp <- maxp(chess) chess_maxp
(we refer to this as the "TAK" model). We might ask whether there is evidence for the players having differing strengths:
equalp.test(chess)
Thus we fail to reject $H_0\colon p_T=p_A=p_K$ under T-A-K as the p-value is about 0.39.
The existence of a rock-paper-scissors relationship between the players was discussed in @hankin2010 and the null of no RPS phenomena could be rejected using the Alymer test.
Here we improve on that work by introducing a monster, M
, that
accounts for the rock-paper-scissors nature of the players. The RPS
monster helps players that benefit from the RPS nature of the result.
The simplest nontrivial probability model is that all three real
players have the same strength: $p_T=p_A=p_K$, which we write $T=A=K$.
Under $T=A=K$, writing $p_T=p_A=p_K=\alpha$ we can use a binomial test
to calculate the p-value for the hypothesis that $p_M=0$:
binom.test(sum(diag(chess_table)),sum(chess_table,na.rm=TRUE),alternative="greater")
Estimating $p_M$ is also straightforward:
$$ Prob(\mathrm{procyclic}\ \mathrm{win})=\frac{\alpha+M}{2\alpha+M} $$
(a "procyclic win" is a win that would be expected under a strong monster). Here we also have $3\alpha+M=1$, so to estimate the strength of the monster we would solve
$$ \frac{\alpha+M}{2\alpha+M}=p,\qquad{3\alpha + M}=1 $$
for $\alpha$ and $M$, where $p$ is the probability of a procyclic win. This gives us
$$ \alpha=\frac{1-p}{2-p},\qquad M=\frac{2p-1}{2-p}. $$
If we use the maximum likelihood estimate value for a procyclic win of $\frac{55}{88}=0.625$ we find that $\hat{M}=0.18$ approximately. Further, we may express $p$, the probability of a pro-cyclic win, as $p=\frac{1+2M}{2+M}$. If we observe $a$ procyclic wins and $b$ countercyclic wins a likelihood function for $M$ would then be
$$ \mathcal{L}(M)= p^a(1-p)^b=\left(\frac{1+2M}{2+M}\right)^a\cdot \left(\frac{1-M}{2+M}\right)^b $$
and here we have $a=22+23+10=55$ and $b=13+12+8=33$ (a graph is given
below). It is straightforward but not trivial to translate this into
hyper3
idiom. Noting that $3\alpha+M=1$ we define $\beta=3\alpha$
and then the unit sum constraint is $\beta+M=1$ and our probabilities
are
$$ \frac{\alpha+M;\alpha}{2\alpha+M } = \frac{\beta/3+M;\beta/3}{2\beta/3+M} = \frac{1+2M;1-M}{2+M} $$
(where the last expression is obtained using $\beta+M=1$). We may set
up a log-likelihood function of $a,b$ using hyper3
idiom as follows:
TqAqK <- function(a,b){ H3 <- hyper3(pnames=c("beta","M")) H3[c(beta=1/3,M=1)] %<>% inc(a ) H3[c(beta=1/3 )] %<>% inc( b) H3[c(beta=2/3,M=1)] %<>% dec(a+b) return(H3) } H3 <- TqAqK(55,33) loglik(c(M=0.1,beta=0.9),H3) loglik(c(beta=0.9,M=0.1),H3) # NB: identical!
[above, we note that $\beta+M=3\alpha+M=1$, so the monster is the fillup value with respect to $\beta$]. This is easy to maximize:
optimize(function(M){loglik(c(M=M,beta=1-M),H3)},c(0.01,0.99),maximum=TRUE)
This gives maximum likelihood estimates of $M=0.18$ as would be expected from the simple binomial model; the probability of a procyclic win is about $0.625$.
We may relax the assumption of equal strength of the real players by using a more sophisticated hyper2 construction:
H <- hyper2() H[c("Topalov", "M")] %<>% inc(22 ) H[c( "Anand" )] %<>% inc( 13) H[c("Topalov","Anand", "M")] %<>% dec(22+13) H[c( "Anand", "M")] %<>% inc( 23) H[c( "Karpov" )] %<>% inc( 12) H[c( "Anand","Karpov","M")] %<>% dec(23+12) H[c("Topalov" )] %<>% inc( 8 ) H[c( "Karpov","M")] %<>% inc( 10) H[c("Topalov", "Karpov","M")] %<>% dec( 8+10) H
(in the above, a win by a player helped by the RPS effect has M
added to his strength). We can call this the TAKM model, and find the
evaluate:
maxp(H)
We see that, under TAKM, the monster has strength of about 0.16, slightly less than the 0.18 obtained for $M$ under the equality assumption $p_T=p_A=p_K$. Further, under TAKM, the three "real" players are again all of approximately equal strength, and we may test this:
samep.test(H,c("Topalov","Anand","Karpov"))
(note the estimated monster strengths of 0.16 and 0.18 agree with the previous estimate). Thus the p-value of about 0.77 indicates failure to reject the null (compare 0.39 under TAK). Now we can test $H_0\colon p_M=0$ under TAKM:
specificp.gt.test(H,"M",0)
(in the test above, note that the players may appear in a different order from previously). The test shows that we may reject the hypothesis that $p_M=0$ under TAKM with a p-value of $0.04$ (compare 0.012 from the binomial test).
We may plot a profile likelihood curve for $M$:
M <- seq(from=0.001,to=0.4,len=40) L1 <- profsupp(H,"M",M) L1 <- L1-max(L1) L2 <- 55*log(1+2*M)+33*log(1-M)-88*log(2+M) L2 <- L2-max(L2) plot(M,L1,type="l",col="black",ylab="log likelihood",xlab="Monster strength") points(M,L2,type="l",col="red") abline(h=c(0,-2),lty=2) legend("topright",col=c("black","red"),lty=1,legend=c("TAKM","T=A=K"))
Above we see support curves for RPS monster strength using TAKM and
$T=A=K$, showing very similar results. Note that the T=A=K curve
could have been obtained more nicely with function TqAqK
but I
haven't got round to changing it yet.
West and Hankin (2008) consider a synthetic dataset referring to seven teeth of a gear wheel. Consecutive pairs are repeatedly compared and the most worn tooth is nominated by independent judges. Here a dataset of the same form will be used, but the entries have been changed for pedagogical reasons:
t1 <- "t1" t2 <- "t2" t3 <- "t3" t4 <- "t4" t5 <- "t5" t6 <- "t6" t7 <- "t7" M <- "M"
hw <- matrix(c( 16,05,NA,NA,NA,NA,NA, NA,06,09,NA,NA,NA,NA, NA,NA,05,11,NA,NA,NA, NA,NA,NA,06,12,NA,NA, NA,NA,NA,NA,05,09,NA, NA,NA,NA,NA,NA,04,07, 13,NA,NA,NA,NA,NA,05), 7,7,byrow=TRUE) colnames(hw) <- paste("t",1:7,sep="") hw
Thus, considering the top row, we see that in a total of $16+5=21$
independent comparisons between t1
and t2
, tooth t1
was judged
to be the more worn one 16 times, and t2
was judged to be the more
worn one $5$ times (NB: in the above, I have changed the dataset in
order to make t1
stronger than the other players while maintaining
the RPS phenomenon). We may convert this dataset to a generalized
hyper2
object. First a standard Bradley-Terry:
HW <- hyper2() rs <- -rowSums(hw,na.rm=TRUE) cs <- colSums(hw,na.rm=TRUE) HW[t1] <- cs[1] HW[t2] <- cs[2] HW[t3] <- cs[3] HW[t4] <- cs[4] HW[t5] <- cs[5] HW[t6] <- cs[6] HW[t7] <- cs[7] HW[c(t1,t2)] <- rs[1] HW[c(t2,t3)] <- rs[2] HW[c(t3,t4)] <- rs[3] HW[c(t4,t5)] <- rs[4] HW[c(t5,t6)] <- rs[5] HW[c(t6,t7)] <- rs[6] HW[c(t7,t1)] <- rs[7] HW
(likelihood function HW
has no monster). We can then analyse this:
maxp(HW) equalp.test(HW)
Thus according to this model we may reject the hypothesis of equal strength. We now incorporate the monster:
HWM <- hyper2() tt <- c("t1","t2","t3","t4","t5","t6","t7") for(i in 1:6){ HWM[c(tt[i] )] <- hw[i,i ] HWM[c( tt[i+1],M)] <- hw[i,i+1] HWM[c(tt[i],tt[i+1],M)] <- rs[i] } HWM[c(tt[7] )] <- hw[7,7] HWM[c( tt[1],M)] <- hw[7,1] HWM[c(tt[7],tt[1],M)] <- rs[7] HWM
Then the same analysis applies:
maxHW <- maxp(HWM) maxHW
and we can test the hypothesis that $p_M=0$:
specificp.gt.test(HWM,"M",0)
suggesting that we may reject the hypothesis of zero monster strength with a p-value of about 4.7\%, even after accounting for the different strengths of the teeth.
hyper3
We can now use hyper3
to analyse the same dataset. Here the
likelihood function for pairwise competition between $p_1$ and $p_2$,
with $a$ procyclic wins and $b$ anticyclic wins, is
$$ \left(\frac{\lambda p_1}{\lambda p_1 + p_2}\right)^a \cdot \left(\frac{p_2}{\lambda p_1 + p_2}\right)^b $$
where $\lambda$ is a quantification of the asymmetry. We can modify
the hyper2
idiom to make a hyper3
object instead:
H3seven <- function(lambda){ H3 <- hyper3() for(i in 1:6){ jj <- 1 # diagonal players do not have lambda-advantage names(jj) <- tt[i] H3[jj] %<>% inc(hw[i,i]) H3[setNames(1,tt[i])] %<>% inc(hw[i,i]) jj <- lambda names(jj) <- tt[i+1] H3[jj] %<>% inc(hw[i,i+1]) jj <- c(1,lambda) names(jj) <- c(tt[i],tt[i+1]) H3[jj] %<>% dec(hw[i,i] + hw[i,i+1]) } # i loop closes ## Now do final one: jj <- 1 names(jj) <- tt[7] H3[jj] %<>% inc(hw[7,7]) jj <- lambda names(jj) <- tt[1] H3[jj] %<>% inc(hw[7,1]) jj <- c(1,lambda) names(jj) <- c(tt[7],tt[1]) H3[jj] %<>% dec(hw[7,7] + hw[7,1]) return(H3) } # function H3seven() defn closes
Now we can execute it:
H3 <- H3seven(lambda=1.66) H3
profl <- function(lambda){ H <- H3seven(lambda) return(loglik(maxp(H),H)) }
First optimize the likelihood:
o <- optimize(profl,c(1.4,1.6),maximum=TRUE)
o
So now plot a profile likelihood curve:
lam <- seq(from=0.8,to=2.6,len=30) like3 <- sapply(lam,profl)
plot(lam,like3-o$objective,type='b',ylab="likelihood",xlab="lambda") abline(h=c(0,-2)) abline(v=c(1,o$maximum))
Following lines create chess.rda
, residing in the data/
directory of the package.
save(chess_table,chess,chess_maxp,file="chess.rda")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.