In RobinHankin/hyper2: The Hyperdirichlet Distribution, Mark 2

set.seed(0)
knitr::opts_chunk$set(echo = TRUE)
library("hyper2")
library("magrittr")
library("prefmod")
library("PlackettLuce")
options("digits" = 5)

knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))

To cite the hyper2 package in publications, please use @hankin2017_rmd. This short document shows how to apply hyper2 idiom to the salad dataset of the prefmod package [@hatzinger2012]. From salad.Rd:

The dataset contains the rankings of four salad dressings
concerning tartness by 32 judges, with values ranging
from 1 (most tart) to 4 (least tart).

head(salad)
nrow(salad)

From row 3, for example, we see that salad A was ranked as second most tart, B the most tart, C the third most tart, and D the least tart of the four. We may process this using two methods, explicit and slick. First, explicit:

H1 <- hyper2()
for(i in seq_len(nrow(salad))){
   H1 <-  H1 + suppfun(as.matrix(salad)[i,])
}
H1

And second, a slick method. We take the transpose of salad (things are clearer if we name the rows):

rownames(salad) <-
   paste("j",formatC(seq_len(nrow(salad)),width=2,format="d",flag="0"),sep="")
ts <- ordertable(t(salad))

Above, we see that object ts is an order table. It is just like the formula 1 result table F1_table_2016, except that venues are judges and drivers are the salads [which are "competing" for the "who has the most tart dressing" prize]. Converting this to a support function is accomplished with ordertable2supp():

(H2 <- suppfun(ts))

Just to check:

H1 == H2

equalp.test(H1)

The null estimate agrees to six places of decimals with that presented by @turner2020_nomarkup:

standardPL_PlackettLuce <- PlackettLuce(salad, npseudo = 0)
(p1 <- itempar(standardPL_PlackettLuce))
(p2 <- maxp(H1))
p1-p2

We may use hyper2 to test some hypotheses. Firstly, $H_0\colon p_B=\frac{1}{4}$, which we test with specificp.test():

specificp.test(H2,"B",1/4)

Above we see a $p$-value of about $10^{-7}$, clearly significant. However, we may also test the hypothesis that $p_B$ is less than the second strongest, $p_C$:

samep.test(H2,c("B","C"))

We see a $p$-value of about $7\times 10^{-4}$, much higher than the value from specificp.test() of $H_0\colon p_B=\frac{1}{4}$. This is because samep.test() allows one to change the value of $p_C$ and we would expect a less significant result.

Acid analysis

Turner goes on to assess whether a model of tartness using acetic and gluconic acids is appropriate. This is possible using the hyper2 formalism as follows.

features <- data.frame(salad = LETTERS[1:4],
                       acetic = c(0.5, 0.5, 1, 0),
                       gluconic = c(0, 10, 0, 10))
features
(M <- as.matrix(features[,-1]))
makestrengths <- function(v){
    out <- exp(M %*% v)
    out <- c(out/sum(out))
    out <- pmax(out,0)
    names(out) <- LETTERS[1:4]
    return(out)
}
makestrengths(c(2,1))
f <- function(v){loglik(makestrengths(v),H2)}
(maxo <- optim(par=c(3, 0.3), fn=f,control=list(fnscale = -1)))
makestrengths(maxo$par)
loglik(makestrengths(maxo$pa),H1)
maxp(H1,give=1)