similarity: Measures of Similarity

similarity    R Documentation

Measures of Similarity

Description

Computes several measures of similarity for binarized data (see Choi, Cha, & Tappert, 2010, for additional measures).

Usage

similarity(
  data,
  method = c("angular", "cor", "cosine", "euclid", "faith", "jaccard", "phi", "rr")
)

Arguments

data

Matrix or data frame. A binarized dataset of verbal fluency or linguistic data

method

Character. Type of similarity measure to compute.

Below are the definitions for each bin:

           1       0
    1      a       b       a+b (R1)
    0      c       d       c+d (R2)
           a+c     b+d     a+b+c+d (N)
           (C1)    (C2)    (N)
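
As a minimal sketch of how these cell counts arise, the base R snippet below tallies a, b, c, and d for two binary vectors. The vectors `x` and `y` and the object `cells` are made-up illustration names and data, not part of the package, and this is not necessarily how the function computes the counts internally.

```r
# Two hypothetical binary vectors (1 = response present, 0 = absent)
x <- c(1, 1, 0, 1, 0, 0)
y <- c(1, 0, 0, 1, 1, 0)

# Cell counts of the 2 x 2 table (rows index x, columns index y)
cells <- c(
  a = sum(x == 1 & y == 1),  # both are 1
  b = sum(x == 1 & y == 0),  # x is 1, y is 0
  c = sum(x == 0 & y == 1),  # x is 0, y is 1
  d = sum(x == 0 & y == 0)   # both are 0
)

# Marginal totals as labelled in the table
R1 <- cells[["a"]] + cells[["b"]]
R2 <- cells[["c"]] + cells[["d"]]
C1 <- cells[["a"]] + cells[["c"]]
C2 <- cells[["b"]] + cells[["d"]]
N  <- sum(cells)

print(cells)
```

The row totals and the column totals each sum to N, which is a quick check that every observation landed in exactly one cell.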

Options include:

  • "angular" = 1 - (2 * acos(cosine similarity) / \pi)

  • "cosine" = a / \sqrt{(a + b)(a + c)}

  • "faith" = (a + 0.5d) / (a + b + c + d)

  • "jaccard" = a / (a + b + c)

  • "phi" and "cor" = (ad - bc) / \sqrt{R1 x R2 x C1 x C2}

  • "rr" = a / (a + b + c + d)
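
The listed formulas can be evaluated directly from the four cell counts. The following is a rough sketch in base R, not the package's implementation: the helper `binary_similarity()` and its argument names (`n11` = a, `n10` = b, `n01` = c, `n00` = d) are hypothetical, and the "euclid" option is left out because no formula is given for it here.

```r
# Hypothetical helper: evaluates the listed measures from 2 x 2 cell counts
# n11 = a, n10 = b, n01 = c, n00 = d
binary_similarity <- function(n11, n10, n01, n00) {
  total  <- n11 + n10 + n01 + n00
  cosine <- n11 / sqrt((n11 + n10) * (n11 + n01))
  list(
    angular = 1 - (2 * acos(cosine) / pi),
    cosine  = cosine,
    faith   = (n11 + 0.5 * n00) / total,
    jaccard = n11 / (n11 + n10 + n01),
    # R1 x R2 x C1 x C2 written out in terms of the cells
    phi     = (n11 * n00 - n10 * n01) /
      sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)),
    rr      = n11 / total
  )
}

# Made-up counts for illustration: a = 2, b = 1, c = 1, d = 2
res <- binary_similarity(2, 1, 1, 2)
str(res)
```

With these counts, "jaccard" and "faith" both come out to 0.5, while "rr" and "phi" both come out to 1/3, which shows that the measures differ mainly in how they treat the joint absences in cell d.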

Value

A symmetric similarity matrix
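
To illustrate why the result is symmetric, here is a small base R sketch that builds a pairwise cosine similarity matrix by hand. It assumes that similarity is computed between the columns of the data (rows being participants); the toy matrix `dat` and its column names are invented for the example, and the package's own output (for instance its diagonal) may differ.

```r
# Hypothetical binarized data: 5 rows (participants) by 3 columns (responses)
dat <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    0, 1, 1,
    1, 1, 0,
    0, 0, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("cat", "dog", "fish"))
)

# Cell a for every pair of columns: rows where both columns are 1
co_occur <- crossprod(dat)

# Column totals (a + b and a + c) sit on the diagonal of co_occur
totals <- diag(co_occur)

# Cosine similarity a / sqrt((a + b)(a + c)) for all pairs at once
cos_mat <- co_occur / sqrt(outer(totals, totals))

print(round(cos_mat, 3))
isSymmetric(cos_mat)
```

Swapping the two columns in a pair exchanges b and c but leaves a unchanged, so the cosine formula gives the same value in both orders and the matrix equals its transpose.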

Author(s)

Alexander Christensen <alexpaulchristensen@gmail.com>

References

Choi, S. S., Cha, S. H., & Tappert, C. C. (2010). A survey of binary similarity and distance measures. Journal of Systemics, Cybernetics and Informatics, 8, 43-48.

Examples

# Load the package
library(SemNeT)

# Simulate a dataset
one <- sim.fluency(10)

# Compute the cosine similarity matrix
# (named cos_sim to avoid masking base::cos)
cos_sim <- similarity(one, method = "cosine")


SemNeT documentation built on Aug. 12, 2023, 5:06 p.m.