set.seed(1) knitr::opts_chunk$set(echo = TRUE) library("hyper2")
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))
To cite the hyper2
package in publications, please use @hankin2017_rmd;
takes about an hour to process without cache. Here I consider
pairwise comparisons with a home-ground advantage and wish to compare
two likelihood functions. First, the monster parametrization:
$$ \mathcal{L_M}\left(p_1,p_2,M\right)= \left( \frac{p_1+M;p_2}{p_1+p_2+M} \right)^{a;b}\cdot \left( \frac{p_1;p_2+M}{p_1+p_2+M} \right)^{c;d} $$
[here the first expression is for $p_1$ at home and $p_2$ away; the second is $p_1$ away and $p_2$ at home; $a$ is number of home wins for $p_1$, $b$ number of away wins for $p_2$, $c$ number of away wins for $p_1$ and $d$ number of home wins for $p_2$]. Secondly the monster parameterization:
$$ \mathcal{L_\lambda}\left(p_1,p_2,\lambda\right)= \left( \frac{\lambda p_1;p_2}{\lambda p_1+p_2} \right)^{a;b}\cdot \left( \frac{p_1;\lambda p_2}{p_1+\lambda p_2} \right)^{c;d} $$
A suitable generalized function would be
$$ \mathcal{L}\left(p_1,p_2,\lambda,M\right)= \left( \frac{\lambda p_1+M;p_2}{\lambda p_1+p_2+M} \right)^{a;b}\cdot \left( \frac{p_1;\lambda p_2+M}{p_1+\lambda p_2+M} \right)^{c;d} $$
jj <- matrix(c(NA, 6+2i ,6+5i, NA),2,2,byrow=TRUE) teams <- LETTERS[1:2] dimnames(jj) <- list("@home" = teams,"@away"=teams) dimnames(jj) <- list("@home" = teams,"@away"=teams) print(jj)
Above we see that A played at home 3 times, winning 1 and losing 2 [row 1 column 2 is $1+2i$]; B played at home 9 times, winning 4 and losing 5 [row 2 column 1 is $4+5i$]. In its most general form, the $2\times 2$ result matrix would be
$$ \begin{bmatrix} -&\alpha+\beta i \ \gamma + \delta i&-\ \end{bmatrix} $$
and the log-likelihood function would be, algebraically,
$$ \mathcal{S}=\log\mathcal{L}= \log\left(\frac{A^\delta\cdot(A + M)^\alpha\cdot B^\beta (B + M)^\gamma }{ (A + B + M)^{\alpha+\beta+\gamma+\delta} }\right) = \delta\log A + \alpha\log(1-B) + \beta\log B + \gamma\log(1-A) $$
where $M$ is the strength of the home monster home
and $A+B+M=1$, so
we understand that $M=1-A-B$.
$$ \frac{\partial\mathcal{S}}{\partial A}=\frac{\delta}{A}-\frac{\gamma}{1-A}\qquad \frac{\partial\mathcal{S}}{\partial B}=\frac{\beta }{B}-\frac{\alpha}{1-A} $$
At a maximum, therefore,
$$ \hat{A}=\frac{\delta}{\delta + \gamma}\qquad \hat{B}=\frac{\beta }{\alpha + \beta} $$
Just to check, we evaluate the Hessian matrix and find
$$H=- \left(\begin{array}{cc} \frac{\partial^2\mathcal{S}}{\partial A\partial A} & \frac{\partial^2\mathcal{S}}{\partial A\partial B} \ \frac{\partial^2\mathcal{S}}{\partial B\partial A} & \frac{\partial^2\mathcal{S}}{\partial b\partial B} \end{array}\right) = \left(\begin{array}{cc} \delta A^{-2} + \gamma(1-A)^{-2}&0\ 0&\beta B^{-2} + \alpha(1-B)^{-2} \end{array}\right) $$
which is clearly strictly positive-definite if $00$, $\gamma+\delta>0$.
We will calculate numerical estimates using the monster-parameterization for a simple case:
M <- matrix(c( NA, 5+1i, 9+2i, NA), nrow=2,ncol=2,byrow=TRUE) teams <- LETTERS[seq_len(2)] dimnames(M) <- list("@home" = teams,"@away"=teams) dimnames(M) <- list("@home" = teams,"@away"=teams) M (Hsimp <- home_away(M)) maxp(Hsimp)
Compare the theoretical value of
theoretical <- c(A=2/(2+9),B=1/(1+5)) fillup(theoretical,Hsimp)
showing good agreement. We note that the model predicts a probability of $\hat{A}/(\hat{A}+\hat{B})=\frac{12}{23}\simeq 0.52$ for $A$ winning in a match between $A$ and $B$ where the home team advantage is not operating (on neutral ground, for example).
Now the $\lambda$ parameterization. The likelihood function for the same dataset is
$$ \mathcal{S}=\log\mathcal{L}= \log\left( \frac{(\lambda A)^\alpha B^\beta}{(\lambda A+B)^{\alpha+\beta}} \cdot \frac{A^\gamma (\lambda B)^\delta}{(A+\lambda B)^{\gamma+\delta}} \right)$$
This is equal to $(\alpha+\delta)\log\lambda +(\alpha+\gamma)\log A+(\beta+\delta)\log B -(\alpha+\beta)\log(\lambda A+B) -(\gamma+\delta)\log(A+\lambda B)$ and even after simplifying [we may assume $A+B=1$] the result does not appear to have a simple analytical solution; we must proceed numerically. To do so,
we need to define a
slight generalization of home_away3()
:
`ha3` <- function(home_games_won,away_games_won,lambda){ # based on home_away3() if(is.complex(home_games_won)){ if(missing(lambda)){lambda <- away_games_won} away_games_won <- Im(home_games_won) home_games_won <- Re(home_games_won) } teams <- rownames(home_games_won) stopifnot(identical(teams,colnames(home_games_won))) stopifnot(identical(teams,rownames(away_games_won))) stopifnot(identical(teams,colnames(away_games_won))) H <- hyper3(pnames=c(teams,"lambda","M")) for(i in seq_len(nrow(home_games_won))){ for(j in seq_len(ncol(home_games_won))){ if(i != j){ home_team <- teams[i] away_team <- teams[j] home_wins <- home_games_won[i,j] away_wins <- away_games_won[i,j] no_of_matches <- home_wins + away_wins ## home wins: jj <- c(lambda,1) names(jj) <- c(home_team,"M") H[jj] %<>% inc(home_wins) ## away wins: jj <- 1 names(jj) <- away_team H[jj] %<>% inc(away_wins) ## denominator jj <- c(lambda,1,1) names(jj) <- c(home_team,away_team,"M") H[jj] %<>% dec(no_of_matches) } # if(i != j) closes } # j loop closes } # i loop closes return(H) }
(maybe one day I will include function ha3()
in the hyper2
package).
Now a more complicated example, a
modified version of the 4x4 matrix on home_advantage.Rmd
:
home_games_won <- matrix(c( NA, 16, 12, 11, 19, NA, 19, 16, 17, 12, NA, 11, 11, 12, 12, NA), nrow=4,ncol=4,byrow=TRUE) away_games_won <- matrix(c( NA, 05, 02, 02, 9, NA, 10, 02, 3, 04, NA, 07, 8, 06, 04, NA), nrow=4,ncol=4,byrow=TRUE) teams <- LETTERS[1:4] dimnames(home_games_won) <- list("@home" = teams,"@away"=teams) dimnames(away_games_won) <- list("@home" = teams,"@away"=teams) D <- home_games_won + 1i*away_games_won D
Above we see, from row 1 column 2 [36+10i
], that A played at home
against B a total of $36+10=46$ times, winning $36$ and losing $10$;
From row 2 column 1 [11+10i
] we see that B played at home against A
a total of 21 times, winning 11 and losing 10. We can use the Monster
likelihood function and test $H_0\colon M=0$:
H <- ha3(D,lambda=1.01) print(H)
maxp(H)
s <- specificp.gt.test(H,"M",0,n=10) # NB lambda = 1.88 s
We see strong evidence that $M>0$.
s$null_support
We want to plot a profile log-likelihood function, that is, a likelihood function corresponding to the evaluate constrained to having a particular value of $M$:
write(date(),file="marginallambdaprofilelike",append=FALSE) f <- function(lambda){ write(date(),file="marginallambdaprofilelike",append=TRUE) # so we can check its progress return(maxp(ha3(D,lambda=lambda),n=1,give=TRUE)$value) } lam <- seq(from=0.1,to=4.2,len=5) supp_M_eq_0 <- sapply(lam,f)
plot(lam,supp_M_eq_0,type="b",xlab="lambda") plot(lam,supp_M_eq_0-max(supp_M_eq_0),type="b",xlab="lambda") grid() abline(h=c(0,-2))
Now enforce the constraint $\lambda=1$ [and allow $M$ to vary],
thereby reducing the likelihood function from a hyper3
to a hyper2
object:
(H <- home_away(D)) # observe that the monster is 'home' home <- seq(from=0.15,to=0.4,len=15) supp_lambda_eq_0 <- profsupp(H,"home",home,relative=FALSE) plot(home,supp_lambda_eq_0 - max(supp_lambda_eq_0),type='b') abline(h=c(0,-2))
maxp(H)
plot(lam,supp_M_eq_0-max(supp_M_eq_0),type='b',main='Profile likelihood with M=0') grid() plot(home,supp_lambda_eq_0 - max(supp_lambda_eq_0),type='b')
both <- c(supp_lambda_eq_0,NA,supp_M_eq_0) plot(both-max(both,na.rm=TRUE),type='b',col=c(rep("black",length(supp_lambda_eq_0)),"green",rep("red",length(supp_M_eq_0)))) abline(h=c(0,-2))
Support difference:
max(supp_M_eq_0)-max(supp_lambda_eq_0)
exceeding Edwards's two-units-of-support criterion by a comfortable margin. Note that both have equal numbers of degrees of freedom.
likef <- function(M,lambda,...){specificp.test(ha3(D,lambda=lambda),"M",M,...)$null_support} like2 <- function(theta,...){likef(M=theta[1],lambda=theta[2],n=1,...)}
Use it:
Mval <- seq(from=0.01,to=0.65,len=15) lval <- seq(from=0.2,to=5,by = 0.4) # we need lambda=1 (this is the null) jj <- as.matrix(expand.grid(M=Mval,lam=lval)) options(use_alabama = FALSE) # setting to TRUE gives a "initial value not feasible" error, go figure L <- apply(jj,1,like2)
options(digits=2) options(width=150) L <- matrix(L,length(Mval),length(lval)) L <- L-max(L) rownames(L) <- paste("M = ",round(Mval,2)) colnames(L) <- paste("lam=",lval,sep="") L L <- pmax(L,-9) L Mval lval contour(Mval,lval,L,xlab='M',ylab='lambda') abline(h=1,lwd=5) abline(v=0,lwd=5) filled.contour(Mval,lval,L,xlab='M',ylab='lambda') matplot(Mval,L,type='b') matplot(lval,t(L),type='b')
set.seed(0) P <- matrix(0,4,4) teams <- LETTERS[1:4] dimnames(P) <- list("@home" = teams,"@away"=teams) n <- 90 # number of matches between a pair M <- 0.15 # home monster strengths <- (4:1)/10 strengths <- strengths/sum(strengths) names(strengths) <- teams strengths for(home_team in 1:4){ for(away_team in 1:4){ if(home_team != away_team){ prob_home_win <- (strengths[home_team] + M)/(strengths[home_team] + strengths[away_team] + M) home_wins <- rbinom(1,n,prob_home_win) home_losses <- n - home_wins P[home_team,away_team] <- home_wins + 1i*home_losses } } } diag(P) <- NA P HP <- home_away(P)
HP
mH <- maxp(HP,n=3,give=T) mH
home <- seq(from=0.12,to=0.16,len=15) supp_lambda_eq_1 <- profsupp(HP,"home",home,relative=FALSE)
jjH <- ha3(P,lambda=1) # should match HP above jjH as.hyper3(HP) - jjH
plot(home,supp_lambda_eq_1,type='b') plot(home,supp_lambda_eq_1 - max(supp_lambda_eq_1),type='b') abline(h=c(0,-2))
specificp.gt.test(ha3(P,lambda=1.4),"M",0) specificp.gt.test(ha3(P,lambda=1.4),"M",1e-4)
f <- function(lambda){ o <- ha3(P,lambda=lambda) return(specificp.gt.test(o,"M",0)$null_support) } lam <- seq(from=1.9,to=2.4,len=5) supp_M_eq_0 <- sapply(lam,f)
plot(lam,supp_M_eq_0,type='b') plot(lam,supp_M_eq_0-max(supp_M_eq_0),type='b') abline(h=c(0,-2))
jj <- c(supp_M_eq_0,NA,supp_lambda_eq_1) plot(jj-max(jj,na.rm=TRUE),type="b") abline(h=c(0,-2))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.