
Calculate model-based kappa of agreement and association, with their standard errors, for multiple raters each assessing multiple cases, as described in:

- Mitani, A. A., & Nelson, K. P. (2017). Modeling Agreement between Binary Classifications of Multiple Raters in R and SAS. Journal of Modern Applied Statistical Methods, 16(2), 277-309. doi: 10.22237/jmasm/1509495300
- Mitani, A. A., Freer, P. E., & Nelson, K. P. (2017). Summary measures of agreement and association between many raters' ordinal classifications. Annals of Epidemiology, 27(10), 677-685. doi: 10.1016/j.annepidem.2017.09.001
- Nelson, K. P., & Edwards, D. (2015). Measures of agreement between many raters for ordinal classifications. Statistics in Medicine, 34, 3116–3132. doi: 10.1002/sim.6546
- Nelson, K. P., & Edwards, D. (2008). On population-based measures of agreement for binary classifications. Canadian Journal of Statistics, 36, 411-426. doi: 10.1002/cjs.5550360306

R installation instructions

Copy and paste the following code to install the modelkappa package in R. If this is your first time installing the devtools package, you may need to restart R after executing the first line.
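A minimal sketch of such an installation, assuming the package is hosted on GitHub under AyaMitani/modelkappa (the repository name given in the citation entry) and installed through devtools:

```r
# First line: install devtools from CRAN
# (restart R afterwards if this is a first-time install).
install.packages("devtools")

# Second line: install modelkappa from GitHub.
# The repository path is an assumption taken from the citation entry.
devtools::install_github("AyaMitani/modelkappa")

# Load the package for the current session.
library(modelkappa)
```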


About the data sets

'bcdata': Data from Bladder Cancer Screening Study

This dataset contains evaluations by eight genitourinary pathologists who each reviewed 25 bladder cancer specimens. Each pathologist gave a binary classification for each specimen, indicating whether they considered the sample to be non-invasive or invasive bladder cancer. More details of the study are available from: Compérat, E. et al. (2013) An interobserver reproducibility study on invasiveness of bladder cancer using virtual microscopy and heatmaps. Histopathology, 63, 756–766. doi: 10.1111/his.12214

To view the data dictionary, open the dataset's help page in R, for example by typing ?bcdata after loading the package.


'holmdata': Classification of carcinoma in situ of the uterine cervix

This dataset contains evaluations by seven pathologists who each independently classified 118 histologic slides into one of five ordinal categories of increasing disease severity. It is regarded as a classic example for evaluating agreement between multiple raters who each classify a sample of subjects' test results on an ordinal scale. More details of the study are available from: Holmquist, N. D., McMahan, C. A., and Williams, O. D. (1967) Variability in classification of carcinoma in situ of the uterine cervix. Arch. Pathol., 84, 334-345. PMID: 6045446

To view the data dictionary, open the dataset's help page in R, for example by typing ?holmdata after loading the package.


Use the modelkappa function to calculate model-based agreement (and association)

modelkappa(data=holmdata, category=Cat, item=Item, rater=Rater)

The output includes the number of observations, number of categories, number of items, and number of raters, followed by the model-based kappa for agreement with its standard error and 95% confidence interval. If the number of categories is greater than 2, the output also includes the model-based kappa for association with its standard error and 95% confidence interval.
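To make the input layout and the reported counts concrete, here is a small base-R sketch. It assumes the data are in long format, with one row per item-rater pair, which is what the category, item, and rater arguments in the call above suggest. The column names are borrowed from that call; the data frame toy and all of its values are invented for illustration and are not taken from holmdata.

```r
# Toy long-format data: 4 items, each rated by 3 raters on a 5-level ordinal scale.
# All values are invented for illustration only.
toy <- expand.grid(Item = 1:4, Rater = 1:3)  # Item varies fastest
toy$Cat <- c(1, 2, 4, 5,   # rater 1, items 1-4
             1, 3, 4, 5,   # rater 2, items 1-4
             2, 2, 3, 5)   # rater 3, items 1-4

# Check the assumed layout: every item-rater pair appears exactly once.
counts <- table(toy$Item, toy$Rater)
stopifnot(all(counts == 1))

# The four counts that the output description lists first.
n_obs        <- nrow(toy)
n_categories <- length(unique(toy$Cat))
n_items      <- length(unique(toy$Item))
n_raters     <- length(unique(toy$Rater))

print(c(observations = n_obs, categories = n_categories,
        items = n_items, raters = n_raters))
```

This table only shows the layout; it is likely far too small for the model to be fitted reliably, so for a real analysis pass a full dataset such as holmdata to modelkappa as in the call above.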



To cite this repository, please use the following BibTeX entry:

  author       = {Aya Mitani},
  title        = {AyaMitani/modelkappa: v1},
  month        = nov,
  year         = 2019,
  publisher    = {Zenodo},
  version      = {v1.0},
  doi          = {10.5281/zenodo.3546381},
  url          = {}


AyaMitani/ModelKappa documentation built on Nov. 20, 2019, 6:21 a.m.