set.seed(0)
knitr::opts_chunk$set(echo = TRUE)
library("hyper2")
library("magrittr")
options("digits" = 5)
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))

To cite the hyper2 package in publications, please use @hankin2017_rmd. This short document is based on @altham2010, which discussed an interesting dataset originally presented by Lin et. al. (2009). It uses more sophisticated idiom furnished by the hyper2 package, which is an improved and efficient upgrade of the primitive hyperdirichlet package. The dataset arose from 69 medical malpractice claims and refers to the two surgeon–reviewers' answers to the question "was there a communication breakdown in the hand-off between physicians caring for the patient?". The rows of Table 1 correspond to the answers that were given by reviewer 1, and the columns to the answers given by reviewer 2.

handover_table <- as.matrix(read.table("handover.txt"))
names(dimnames(handover_table)) <- c("reviewer1","reviewer2")
handover_table[3,3] <- NA # structural zero
handover_table

Thus we see that in 26 cases, both reviewers answered "yes", and in 18 both said "no". The table also records partial responses that include observations missed by one or both reviewers. Responses missed by both reviewers are uninformative and automatically rejected: structural zeros, in aylmer terminology. We first apply the Aylmer test to assess whether the two reviewers give the same proportion of "yes" responses:

> aylmer.test(a,alternative=function(a){a[1,2]-a[2,1]})

    Aylmer functional test for count data

data:  a
p-value = 0.169
alternative hypothesis: test function exceeds observed

> 

(Compare the McNemar test which gives an exact $p$-value of $7/64\simeq 0.1094$). We follow the previous workers and assume that cases are missing at random and seek a likelihood function on $\theta_{11},\theta_{10},\theta_{01},\theta_{00}$ subject to $\theta_{ij}\geq 0,\sum_{i,j \in\left\lbrace 0,1\right\rbrace}\theta_{ij}=1$. Here "1" means "yes" and "0" means "no", so $\theta_{01}$ is the probability of reviewer 1 responding "no" and reviewer 2 responding "yes".

handover <- hyper2()
handover["t11"] %<>% inc(handover_table[1,1])
handover["t10"] %<>% inc(handover_table[1,2])
handover["t01"] %<>% inc(handover_table[2,1])
handover["t00"] %<>% inc(handover_table[2,2])

handover[c("t11","t10")] %<>% inc(handover_table[1,3])  # yes-yes plus yes-no  = 2 
handover[c("t01","t00")] %<>% inc(handover_table[2,3])  #  no-yes plus  no-no  = 9 
handover[c("t11","t01")] %<>% inc(handover_table[3,1])  # yes-yes plus  no-yes = 4 
handover[c("t10","t00")] %<>% inc(handover_table[3,2])  # yes-no  plus  no-no  = 4 
handover <- balance(handover)
handover
handover_maxp <- maxp(handover)
handover_maxp

We may simply test the hypothesis that $\theta_{01} = \theta_{10}$:

samep.test(handover,c("t01","t10"))

Alternatively we might use the Bayesian paradigm and calculate the probability of $\theta_{01} > \theta_{10}$:

probability(handover,disallowed=function(p){p[2]<p[3]},tol=0.05)

the result differing slightly from the previous value, probably due to numerical issues.

We can now consider the distribution of $\psi=\theta_{10}/(\theta_{10}+\theta_{01})$ under the posterior distribution induced by the complete cases:

jj <- rp(300000,H=handover)
head(jj)
psifun <- function(x){x[3]/(x[2]+x[3])}
logoddsfun <- function(x){log(x[2]/x[3])}
psi <- apply(jj,1,psifun)
log_odds <- apply(jj,1,logoddsfun)
psi_complete <- rbeta(length(psi),2,6)

We can now reproduce the four graphics given by Altham and Hankin:

Sample of $\psi=\theta_{10}/(\theta_{10}+\theta_{01})$ under the posterior induced by the whole dataset (histogram), together with the posterior distribution of $\psi$ induced by the complete cases only (a beta distribution with parameters 2 and 6).

```r/(\theta_{10}+\theta_{01})$ under the posterior induced by the whole dataset (histogram), together with the posterior distribution of $\psi$ induced by the complete cases only (a beta distribution with parameters 2 and 6)"} hist(psi,nclass=70,probability=TRUE,main=expression(paste("PDF of ",psi))) p <- seq(from=0,to=1,len=100) points(p,dbeta(p,2,6),type="l") abline(v=psifun(handover_maxp),col="red",lwd=5) segments(x0=0.5,y0=0,y1=1.5,col="blue",lwd=5) legend("topright",lty=1,lwd=c(1,5,5),col=c("black","red","blue"), legend=c( "complete cases", expression(hat(psi) * " (complete cases)"), expression(psi * " = 0.5") ))

```r/(\\theta_{10}+\\theta_{01})$ under the posterior induced  by the whole dataset against the posterior distribution of $\\psi$ induced by the complete cases only (a beta distribution with parameters 2 and 6)",cache=TRUE}
par(pty='s')
qqplot(psi,psi_complete,asp=1,xlim=0:1,ylim=0:1)
abline(0,1)

```r/(\theta_{10}+\theta_{01})$ under the posterior induced by the whole dataset, and that of $\psi$ induced by the complete cases only (a beta distribution with parameters 2 and 6)",cache=TRUE} plot(psi,seq_along(psi)/length(psi),type="l",col="black",xlab=expression(psi),ylab="quantile") points(psi_complete,seq_along(psi_complete)/length(psi_complete),type="l",col="red") legend("topleft",lty=1,col=c("black","red"),legend=c("whole dataset","complete cases"))

```r/\\theta_{10})$",cache=TRUE}
qqnorm(log_odds,main="QQ plot of log-odds")
abline(mean(log_odds),sd(log_odds))

Figures \@ref(fig:histpsi) to \@ref(fig:logoddsqqplot) show different aspects of the dataset.

Package dataset {-}

Following lines create handover.rda, residing in the data/ directory of the package.

save(handover_table,handover_maxp,handover,file="handover.rda")

References {-}



RobinHankin/hyper2 documentation built on May 6, 2024, 4:25 p.m.