MCMCpaircompare R Documentation

Markov Chain Monte Carlo for a Pairwise Comparisons Model with Probit Link

Description

This function generates a sample from the posterior distribution of a model for pairwise comparisons data with a probit link. Thurstone's model is a special case of this model when the α parameter is fixed at 1.

Usage

MCMCpaircompare(
  pwc.data,
  theta.constraints = list(),
  alpha.fixed = FALSE,
  burnin = 1000,
  mcmc = 20000,
  thin = 1,
  verbose = 0,
  seed = NA,
  alpha.start = NA,
  a = 0,
  A = 0.25,
  store.theta = TRUE,
  store.alpha = FALSE,
  ...
)

Arguments

pwc.data

A data.frame containing the pairwise comparisons data. Each row of pwc.data corresponds to a single pairwise comparison. pwc.data needs to have exactly four columns. The first column contains a unique identifier for the rater. Column two contains the unique identifier for the first item being compared. Column three contains the unique identifier for the second item being compared. Column four contains the unique identifier of the item selected from the two items being compared. If a tie occurred, the entry in the fourth column should be NA. For applications without raters (such as sports competitions) all entries in the first column should be set to a single value and alpha.fixed (see below) should be set to TRUE. The identifiers in columns 2 through 4 must start with a letter. Examples are provided below.
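As a rough illustration of the layout described above, the following sketch builds a small data frame by hand and checks it against the stated requirements. The rater identifiers, item identifiers, and column names are hypothetical and chosen only for this example; it assumes that only the column order matters, since the description refers to columns by position.

```r
## Hypothetical toy data: two raters comparing three items.
## Column order follows the description above:
## rater, first item, second item, chosen item (NA for a tie).
pwc.example <- data.frame(
    rater  = c("r1", "r1", "r2", "r2"),
    item.1 = c("itemA", "itemB", "itemA", "itemB"),
    item.2 = c("itemB", "itemC", "itemC", "itemC"),
    choice = c("itemA", NA, "itemC", "itemB"),
    stringsAsFactors = FALSE
)

## exactly four columns
stopifnot(ncol(pwc.example) == 4)

## every non-missing choice must be one of the two items compared
ok <- is.na(pwc.example$choice) |
    pwc.example$choice == pwc.example$item.1 |
    pwc.example$choice == pwc.example$item.2
stopifnot(all(ok))

## identifiers in columns 2 through 4 must start with a letter
ids <- na.omit(unlist(pwc.example[, 2:4]))
stopifnot(all(grepl("^[A-Za-z]", ids)))
```

Running checks of this kind before calling the sampler is one way to catch layout problems early; they are a convenience for the user, not part of MCMCpaircompare itself.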

theta.constraints

A list specifying possible simple equality or inequality constraints on the item parameters. A typical entry in the list has one of three forms: itemname=c which will constrain the item parameter for the item named itemname to be equal to c, itemname="+" which will constrain the item parameter for the item named itemname to be positive, and itemname="-" which will constrain the item parameter for the item named itemname to be negative.
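The three forms of constraint can be combined in one list. The sketch below uses hypothetical item names (itemA, itemB, itemC) that would have to match identifiers appearing in pwc.data.

```r
## Hypothetical item names, for illustration only
constraints.example <- list(
    itemA = 0,    # fix the item parameter for itemA at 0
    itemB = "+",  # constrain the item parameter for itemB to be positive
    itemC = "-"   # constrain the item parameter for itemC to be negative
)
str(constraints.example)
```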

alpha.fixed

Should alpha be fixed to a constant value of 1 for all raters? Default is FALSE. If set to FALSE, an alpha value is estimated for each rater.

burnin

The number of burn-in iterations for the sampler.

mcmc

The number of Gibbs iterations for the sampler.

thin

The thinning interval used in the simulation. The number of Gibbs iterations must be divisible by this value.
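A small sketch of the divisibility requirement, using the mcmc and thin values that appear in the Examples section. That the number of stored draws equals mcmc / thin is an assumption based on the usual thinning convention; it is not stated explicitly on this page.

```r
mcmc <- 500000
thin <- 100

## mcmc must be divisible by thin
stopifnot(mcmc %% thin == 0)

## assumed number of draws kept after thinning
n.stored <- mcmc / thin
n.stored  ## 5000
```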

verbose

A switch which determines whether or not the progress of the sampler is printed to the screen. If verbose is greater than 0, output is printed to the screen every verbose-th iteration.

seed

The seed for the random number generator. If NA, the Mersenne Twister generator is used with default seed 12345; if an integer is passed, it is used to seed the Mersenne Twister. The user can also pass a list of length two to use the L'Ecuyer random number generator, which is suitable for parallel computation. The first element of the list is the L'Ecuyer seed, which is a vector of length six or NA (if NA, a default seed of rep(12345,6) is used). The second element of the list is a positive substream number. See the MCMCpack specification for more details.
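The list form of seed can be sketched as follows. The variable names are hypothetical, and the choice of substream numbers is only an example of giving each parallel chain its own substream.

```r
## L'Ecuyer generator: a list of length two (seed, substream)
seed.default  <- list(NA, 1)             # NA: default seed rep(12345, 6); substream 1
seed.explicit <- list(rep(12345, 6), 2)  # explicit length-six seed; substream 2

## both objects have the shape described above
stopifnot(length(seed.default) == 2, length(seed.explicit) == 2)
stopifnot(length(seed.explicit[[1]]) == 6, seed.explicit[[2]] > 0)
```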

alpha.start

The starting value for the alpha vector. This can either be a scalar or a column vector with dimension equal to the number of alphas. If this takes a scalar value, then that value will serve as the starting value for all of the alphas. The default value of NA will set the starting value of each alpha parameter to 1.

a

The prior mean of alpha. Must be a scalar. Default is 0.

A

The prior precision of alpha. Must be a positive scalar. Default is 0.25 (prior variance is 4).
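Because A is a precision rather than a variance, it can help to convert it before choosing a value. The sketch below does this for the defaults a = 0 and A = 0.25; the variable names prior.var, prior.sd, and prior.interval are hypothetical.

```r
a <- 0     # prior mean of alpha (default)
A <- 0.25  # prior precision of alpha (default)

prior.var <- 1 / A           # variance is the inverse of the precision
prior.sd  <- sqrt(prior.var)
c(variance = prior.var, sd = prior.sd)

## central 95% prior interval for alpha under this Gaussian prior
prior.interval <- qnorm(c(0.025, 0.975), mean = a, sd = prior.sd)
prior.interval
```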

store.theta

Should the theta draws be returned? Default is TRUE.

store.alpha

Should the alpha draws be returned? Default is FALSE.

...

Further arguments to be passed.

Details

MCMCpaircompare uses the data augmentation approach of Albert and Chib (1993). The user supplies data and priors, and a sample from the posterior is returned as an mcmc object, which can be subsequently analyzed in the coda package.

The simulation is done in compiled C++ code to maximize efficiency.

Please consult the coda package documentation for a comprehensive list of functions that can be used to analyze the posterior sample.

The model takes the following form:

i = 1, \ldots, I \quad \text{(raters)}

j = 1, \ldots, J \quad \text{(items)}

Y_{ijj'} = 1 \quad \text{if } i \text{ chooses } j \text{ over } j'

Y_{ijj'} = 0 \quad \text{if } i \text{ chooses } j' \text{ over } j

Y_{ijj'} = \text{NA} \quad \text{if } i \text{ chooses neither}

\Pr(Y_{ijj'} = 1) = \Phi(\alpha_i [\theta_j - \theta_{j'}])

The following Gaussian priors are assumed:

\alpha_i \sim \mathcal{N}(a, A^{-1})

\theta_j \sim \mathcal{N}(0, 1)

For identification, some of the θ_j parameters are truncated above or below 0, or fixed to constants.
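The choice probability in the model can be evaluated directly with pnorm(). The sketch below uses made-up values for alpha and theta (none come from a fitted model) to show how a smaller alpha pulls the probability toward 0.5. It then checks the probit link by simulating a latent-variable representation of the kind used in data augmentation. The sampler implemented in the compiled code is not shown on this page, so the simulation only illustrates the link function.

```r
set.seed(1)

## Hypothetical parameter values, for illustration only
theta.j      <- 0.5   # item j
theta.jprime <- -0.5  # item j'
alpha.sens   <- 1.0   # rater with alpha = 1
alpha.insens <- 0.2   # rater who is less sensitive to item differences

## Pr(Y = 1) = Phi(alpha * (theta_j - theta_j'))
p.sens   <- pnorm(alpha.sens   * (theta.j - theta.jprime))
p.insens <- pnorm(alpha.insens * (theta.j - theta.jprime))
round(c(p.sens, p.insens), 3)

## Latent-variable view: Y = 1 when eta + e > 0, with e ~ N(0, 1)
n.sim <- 100000
eta   <- alpha.sens * (theta.j - theta.jprime)
y.sim <- as.integer(eta + rnorm(n.sim) > 0)
mean(y.sim)  ## should be close to p.sens
```

With the same gap between the two items, the rater with the smaller alpha chooses item j only slightly more often than chance. This is the sense in which alpha measures a rater's sensitivity.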

Value

An mcmc object that contains the posterior sample. This object can be summarized by functions provided by the coda package.

References

Albert, J. H. and S. Chib. 1993. “Bayesian Analysis of Binary and Polychotomous Response Data.” Journal of the American Statistical Association. 88: 669-679.

Yu, Qiushi and Kevin M. Quinn. 2021. “A Multidimensional Pairwise Comparison Model for Heterogeneous Perception with an Application to Modeling the Perceived Truthfulness of Public Statements on COVID-19.” University of Michigan Working Paper.

Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park. 2011. “MCMCpack: Markov Chain Monte Carlo in R.” Journal of Statistical Software. 42(9): 1-21. doi: 10.18637/jss.v042.i09.

Daniel Pemstein, Kevin M. Quinn, and Andrew D. Martin. 2007. Scythe Statistical Library 1.0. http://scythe.lsa.umich.edu.

Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines. 2006. “Output Analysis and Diagnostics for MCMC (CODA).” R News. 6(1): 7-11. https://CRAN.R-project.org/doc/Rnews/Rnews_2006-1.pdf.

See Also

plot.mcmc, summary.mcmc, MCMCpaircompare2d, MCMCpaircompare2dDP

Examples


## Not run:
## Euro 2016 example
data(Euro2016)

posterior1 <- MCMCpaircompare(pwc.data=Euro2016,
                              theta.constraints=list(Ukraine="-",
                                                     Portugal="+"),
                              alpha.fixed=TRUE,
                              verbose=10000,
                              burnin=10000, mcmc=500000, thin=100,
                              store.theta=TRUE, store.alpha=FALSE)

## alternative identification constraints
posterior2 <- MCMCpaircompare(pwc.data=Euro2016,
                              theta.constraints=list(Ukraine="-",
                                                     Portugal=1),
                              alpha.fixed=TRUE,
                              verbose=10000,
                              burnin=10000, mcmc=500000, thin=100,
                              store.theta=TRUE, store.alpha=FALSE)

## a synthetic data example with estimated rater-specific parameters
set.seed(123)

I <- 65  ## number of raters
J <- 50 ## number of items to be compared


## raters 1 to 5 have less sensitivity to stimuli than raters 6 through I
alpha.true <- c(rnorm(5, mean=0.2, sd=0.05), rnorm(I - 5, mean=1, sd=0.1))
theta.true <- sort(rnorm(J, mean=0, sd=1))

n.comparisons <- 125 ## number of pairwise comparisons for each rater

## generate synthetic data according to the assumed model
rater.id <- NULL
item.1.id <- NULL
item.2.id <- NULL
choice.id <- NULL
for (i in 1:I){
    for (k in 1:n.comparisons){  ## k indexes comparisons (avoids masking c())
        rater.id <- c(rater.id, i+100)
        ## draw two distinct items to compare
        item.numbers <- sample(1:J, size=2, replace=FALSE)
        item.1 <- item.numbers[1]
        item.2 <- item.numbers[2]
        item.1.id <- c(item.1.id, item.1)
        item.2.id <- c(item.2.id, item.2)
        ## probit probability that item.1 is chosen over item.2
        eta <- alpha.true[i] * (theta.true[item.1] - theta.true[item.2])
        prob.item.1.chosen <- pnorm(eta)
        u <- runif(1)
        if (u <= prob.item.1.chosen){
            choice.id <- c(choice.id, item.1)
        } else {
            choice.id <- c(choice.id, item.2)
        }
    }
}
item.1.id <- paste("item", item.1.id+100, sep=".")
item.2.id <- paste("item", item.2.id+100, sep=".")
choice.id <- paste("item", choice.id+100, sep=".")

sim.data <- data.frame(rater.id, item.1.id, item.2.id, choice.id)


## fit the model
posterior <- MCMCpaircompare(pwc.data=sim.data,
                             theta.constraints=list(item.101=-2,
                                                    item.150=2),
                             alpha.fixed=FALSE,
                             verbose=10000,
                             a=0, A=0.5,
                             burnin=10000, mcmc=200000, thin=100,
                             store.theta=TRUE, store.alpha=TRUE)

theta.draws <- posterior[, grep("theta", colnames(posterior))]
alpha.draws <- posterior[, grep("alpha", colnames(posterior))]

theta.post.med <- apply(theta.draws, 2, median)
alpha.post.med <- apply(alpha.draws, 2, median)

theta.post.025 <- apply(theta.draws, 2, quantile, probs=0.025)
theta.post.975 <- apply(theta.draws, 2, quantile, probs=0.975)
alpha.post.025 <- apply(alpha.draws, 2, quantile, probs=0.025)
alpha.post.975 <- apply(alpha.draws, 2, quantile, probs=0.975)

## compare estimates to truth
par(mfrow=c(1,2))
plot(theta.true, theta.post.med, xlim=c(-2.5, 2.5), ylim=c(-2.5, 2.5),
     col=rgb(0,0,0,0.3))
segments(x0=theta.true, x1=theta.true,
         y0=theta.post.025, y1=theta.post.975,
         col=rgb(0,0,0,0.3)) 
abline(0, 1, col=rgb(1,0,0,0.5))

plot(alpha.true, alpha.post.med, xlim=c(0, 1.2), ylim=c(0, 3),
     col=rgb(0,0,0,0.3))
segments(x0=alpha.true, x1=alpha.true,
         y0=alpha.post.025, y1=alpha.post.975,
         col=rgb(0,0,0,0.3)) 
abline(0, 1, col=rgb(1,0,0,0.5))


## End(Not run)


MCMCpack documentation built on April 13, 2022, 5:16 p.m.