unalike: Coefficient of unalikeability


View source: R/agree_stats.R

Description

Function to calculate the unalikeability coefficient to quantify the amount of variability in categorical data.

Usage

unalike(x, ...)

## Default S3 method:
unalike(x, ..., method = 1L)

## S3 method for class 'matrix'
unalike(x, ...)

## S3 method for class 'data.frame'
unalike(x, id = "id", rater = "rater",
  score = "score", summary = TRUE, plot = FALSE, ...)

Arguments

x

a vector of categorical data

alternatively, an n x r matrix of n subjects by r raters or an n x m data frame of n subjects; when x is a data frame, the user must specify which columns correspond to id, rater, and score

...

additional arguments passed to or from other methods

method

the method for calculating the unalikeability coefficient; see details

id, rater, score

column names corresponding to IDs, raters, and scores

summary

logical; if TRUE, prints summary statistics for the unalikeability coefficients

plot

logical; if TRUE, prints a heat map of unalikeability coefficients

Details

The coefficient of unalikeability describes a concept of variability for categorical variables and provides a quantitative method for its measurement. Smaller coefficients are better: they correspond to less variation in the scores.

For a finite number of observations (n), a finite number of categories (m), and a finite number of objects, k_i, within category i, the coefficient of unalikeability can be expressed as:

u = 1 - ∑ p_i ^ 2

where p_i = k_i / n.
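
As a rough illustration of this formula, the sketch below computes u for a single vector using base R only. The helper name unalike_sketch is hypothetical and not part of the package; it mirrors the formula above rather than the package's implementation, and it assumes x contains no missing values.

```r
# Hypothetical helper (not part of the package): u = 1 - sum(p_i^2)
# Assumes x has no missing values.
unalike_sketch <- function(x) {
  k <- table(x)        # k_i: number of observations in category i
  p <- k / length(x)   # p_i = k_i / n
  1 - sum(p ^ 2)
}

# 7 'A' and 3 'B': p = (0.7, 0.3), so u = 1 - (0.49 + 0.09)
u_7_3 <- unalike_sketch(rep(c('A', 'B'), c(7, 3)))

# a single category: p = 1, so u = 0 (no variability)
u_same <- unalike_sketch(rep(1, 10))
```

Under these assumptions, u_7_3 should come out near 0.42 and u_same should be 0.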

The interpretation of u is that it represents the proportion of possible comparisons (pairings) which are unalike. Note that u includes comparisons of each response with itself.
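
The pairwise interpretation can be checked directly on a small vector. The following base R sketch is illustrative only (toy data, no package functions): it compares every response with every response, including itself, and takes the proportion of comparisons that differ.

```r
# Toy data: 2 'A', 3 'B', 5 'C' (n = 10)
x <- rep(c('A', 'B', 'C'), c(2, 3, 5))

# n x n logical matrix: TRUE where the two responses differ;
# the diagonal (each response compared with itself) is always FALSE
differs <- outer(x, x, FUN = `!=`)

# proportion of all n^2 comparisons that are unalike
u_pairs <- mean(differs)

# the same quantity from the closed form 1 - sum(p_i^2)
p <- table(x) / length(x)
u_formula <- 1 - sum(p ^ 2)

all.equal(u_pairs, u_formula)
```

Because the n self-comparisons are counted in the denominator but can never be unalike, u stays strictly below 1 for any finite sample.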

Currently, two methods for calculating the coefficient are implemented. If method = 1, the formula described above is used. If method = 2, the formula described in Perry and Kader (2005) is used.

Value

A list containing the following:

$method agreement method
$ragree.name method type
$subjects number of subjects
$raters number of raters
$categories number of categories
$value median of all unalikeability coefficients
$summary a data frame with the summary information printed when summary = TRUE
$data a long data frame with all coefficients

Author(s)

Robert Redd rredd@jimmy.harvard.edu

References

Kader, GD. Variability for Categorical Variables. Journal of Statistics Education, Vol. 15, No. 2 (2007).

Perry, M. and Kader, G. Variation as Unalikeability. Teaching Statistics, Vol. 27, No. 2 (2005), pp. 58-60.

See Also

RcmdrPlugin.ISCSS::unalike

Examples

unalike(1, 2)
unalike(rep(1, 10))


## examples in Kader (2007):
l <- list(
  group1 = rep(c('A', 'B'), c(7, 3)),
  group2 = rep(c('A', 'B'), c(5, 5)),
  group3 = rep(c('A', 'B'), c(1, 9)),
  group4 = rep(c('A', 'B', 'C'), c(2, 3, 5))
)

sapply(l, unalike)


## matrix/data frames are assumed to be subjects x raters
mat <- do.call('cbind', l[1:3])
unalike(mat) ## see Kader


library('irr')
data(diagnoses)

kappam.fleiss(diagnoses)
unalike(as.matrix(diagnoses))

library('ggplot2')
unalike(as.matrix(diagnoses), plot = TRUE)


dat <- data.frame(
  id    = rep(seq.int(nrow(diagnoses)), ncol(diagnoses)),
  rater = rep(names(diagnoses), each = nrow(diagnoses)),
  score = unlist(diagnoses)
)
unalike(dat)

raredd/ragree documentation built on March 25, 2021, 1:42 p.m.