set.seed(0) knitr::opts_chunk$set(echo = TRUE) library("hyper2") library("magrittr") options("digits" = 5)
knitr::include_graphics(system.file("help/figures/hyper2.png", package = "hyper2"))
To cite the hyper2
package in publications, please use @hankin2017_rmd.
This short document is based on @altham2010, which
discussed an interesting dataset originally presented by Lin
et. al. (2009). It uses more sophisticated idiom furnished by the
hyper2
package, which is an improved and efficient upgrade of the
primitive hyperdirichlet
package. The dataset arose from 69 medical
malpractice claims and refers to the two surgeon–reviewers' answers to
the question "was there a communication breakdown in the hand-off
between physicians caring for the patient?". The rows of Table 1
correspond to the answers that were given by reviewer 1, and the
columns to the answers given by reviewer 2.
handover_table <- as.matrix(read.table("handover.txt")) names(dimnames(handover_table)) <- c("reviewer1","reviewer2") handover_table[3,3] <- NA # structural zero handover_table
Thus we see that in 26 cases, both reviewers answered "yes", and in 18
both said "no". The table also records partial responses that include
observations missed by one or both reviewers. Responses missed by
both reviewers are uninformative and automatically rejected:
structural zeros, in aylmer
terminology. We first apply the Aylmer
test to assess whether the two reviewers give the same proportion of
"yes" responses:
> aylmer.test(a,alternative=function(a){a[1,2]-a[2,1]}) Aylmer functional test for count data data: a p-value = 0.169 alternative hypothesis: test function exceeds observed >
(Compare the McNemar test which gives an exact $p$-value of $7/64\simeq 0.1094$). We follow the previous workers and assume that cases are missing at random and seek a likelihood function on $\theta_{11},\theta_{10},\theta_{01},\theta_{00}$ subject to $\theta_{ij}\geq 0,\sum_{i,j \in\left\lbrace 0,1\right\rbrace}\theta_{ij}=1$. Here "1" means "yes" and "0" means "no", so $\theta_{01}$ is the probability of reviewer 1 responding "no" and reviewer 2 responding "yes".
handover <- hyper2() handover["t11"] %<>% inc(handover_table[1,1]) handover["t10"] %<>% inc(handover_table[1,2]) handover["t01"] %<>% inc(handover_table[2,1]) handover["t00"] %<>% inc(handover_table[2,2]) handover[c("t11","t10")] %<>% inc(handover_table[1,3]) # yes-yes plus yes-no = 2 handover[c("t01","t00")] %<>% inc(handover_table[2,3]) # no-yes plus no-no = 9 handover[c("t11","t01")] %<>% inc(handover_table[3,1]) # yes-yes plus no-yes = 4 handover[c("t10","t00")] %<>% inc(handover_table[3,2]) # yes-no plus no-no = 4 handover <- balance(handover) handover handover_maxp <- maxp(handover) handover_maxp
We may simply test the hypothesis that $\theta_{01} = \theta_{10}$:
samep.test(handover,c("t01","t10"))
Alternatively we might use the Bayesian paradigm and calculate the probability of $\theta_{01} > \theta_{10}$:
probability(handover,disallowed=function(p){p[2]<p[3]},tol=0.05)
the result differing slightly from the previous value, probably due to numerical issues.
We can now consider the distribution of $\psi=\theta_{10}/(\theta_{10}+\theta_{01})$ under the posterior distribution induced by the complete cases:
jj <- rp(300000,H=handover) head(jj) psifun <- function(x){x[3]/(x[2]+x[3])} logoddsfun <- function(x){log(x[2]/x[3])} psi <- apply(jj,1,psifun) log_odds <- apply(jj,1,logoddsfun) psi_complete <- rbeta(length(psi),2,6)
We can now reproduce the four graphics given by Altham and Hankin:
Sample of $\psi=\theta_{10}/(\theta_{10}+\theta_{01})$ under the posterior induced by the whole dataset (histogram), together with the posterior distribution of $\psi$ induced by the complete cases only (a beta distribution with parameters 2 and 6).
```r/(\theta_{10}+\theta_{01})$ under the posterior induced by the whole dataset (histogram), together with the posterior distribution of $\psi$ induced by the complete cases only (a beta distribution with parameters 2 and 6)"} hist(psi,nclass=70,probability=TRUE,main=expression(paste("PDF of ",psi))) p <- seq(from=0,to=1,len=100) points(p,dbeta(p,2,6),type="l") abline(v=psifun(handover_maxp),col="red",lwd=5) segments(x0=0.5,y0=0,y1=1.5,col="blue",lwd=5) legend("topright",lty=1,lwd=c(1,5,5),col=c("black","red","blue"), legend=c( "complete cases", expression(hat(psi) * " (complete cases)"), expression(psi * " = 0.5") ))
```r/(\\theta_{10}+\\theta_{01})$ under the posterior induced by the whole dataset against the posterior distribution of $\\psi$ induced by the complete cases only (a beta distribution with parameters 2 and 6)",cache=TRUE} par(pty='s') qqplot(psi,psi_complete,asp=1,xlim=0:1,ylim=0:1) abline(0,1)
```r/(\theta_{10}+\theta_{01})$ under the posterior induced by the whole dataset, and that of $\psi$ induced by the complete cases only (a beta distribution with parameters 2 and 6)",cache=TRUE} plot(psi,seq_along(psi)/length(psi),type="l",col="black",xlab=expression(psi),ylab="quantile") points(psi_complete,seq_along(psi_complete)/length(psi_complete),type="l",col="red") legend("topleft",lty=1,col=c("black","red"),legend=c("whole dataset","complete cases"))
```r/\\theta_{10})$",cache=TRUE} qqnorm(log_odds,main="QQ plot of log-odds") abline(mean(log_odds),sd(log_odds))
Figures \@ref(fig:histpsi) to \@ref(fig:logoddsqqplot) show different aspects of the dataset.
Following lines create handover.rda
, residing in the data/
directory of the package.
save(handover_table,handover_maxp,handover,file="handover.rda")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.