calcOverlap: Calculate Overlap

View source: R/calcOverlap.R

calcOverlap    R Documentation

Description

Calculate the Jaccard index of two sets. If a global set is supplied, also calculate BUB.

Usage

calcOverlap(set1_v, set2_v, total_v = NA, digits_v = 4, fullOut_v = T)

Arguments

set1_v

Character vector of values to compare (first set).

set2_v

Character vector of values to compare (second set).

total_v

Optional global set to compare set1_v and set2_v against. Only used for BUB. If NA (default), only the Jaccard index is calculated.

digits_v

Number of digits to round the output to. Default is 4.

fullOut_v

Logical. If TRUE (default), output the number of elements in set1_v, in set2_v, in their intersection, in their union, and in total_v, in addition to the Jaccard and BUB values. If FALSE, output only the Jaccard and BUB values.

Details

This function calculates the overlap between two sets by comparing the identities of the elements in set1_v and set2_v. The Jaccard index is the size of the intersection of the two sets divided by the size of their union. BUB (Baroni Urbani Binary Index) is similar, but compares against a global set (rather than just the union of set1_v and set2_v). This allows two samples to be weighted as more similar if they also cover a greater amount of the global set.
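The two indices described above can be sketched in base R. This is a minimal illustration, not the package's code: jaccardSketch and bubSketch are hypothetical helper names, and the BUB formula used here is the Baroni-Urbani and Buser coefficient as it is commonly published, (sqrt(a*d) + a) / (sqrt(a*d) + a + b + c), which is an assumption about what calcOverlap computes internally.

```r
# Hypothetical helpers for illustration; not part of wrh.rUtils.

# Jaccard index: size of the intersection divided by size of the union.
jaccardSketch <- function(set1_v, set2_v) {
  length(intersect(set1_v, set2_v)) / length(union(set1_v, set2_v))
}

# BUB, assuming the commonly published Baroni-Urbani and Buser form.
# a = elements in both sets
# u = elements in either set (a + b + c)
# d = elements of the global set found in neither set
bubSketch <- function(set1_v, set2_v, total_v) {
  a <- length(intersect(set1_v, set2_v))
  u <- length(union(set1_v, set2_v))
  d <- length(setdiff(total_v, union(set1_v, set2_v)))
  (sqrt(a * d) + a) / (sqrt(a * d) + u)
}

A <- 1:100
B <- 25:75
TOT <- 1:150
round(jaccardSketch(A, B), 4)   # 51 / 100 = 0.51
round(bubSketch(A, B, TOT), 4)
```

Under this formula, BUB reduces to the Jaccard index when the global set contains nothing outside the union of the two sets (d = 0). The actual values returned by calcOverlap may differ if the package uses a different variant.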

Examples

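# Three overlapping sets: B lies inside A; C overlaps the upper end of A
A <- 1:100
B <- 25:75
C <- 60:150
# Global set for BUB: every value seen in A, B, or C
TOT <- union(union(A, B), C)
# Jaccard index only (total_v left at its default of NA)
calcOverlap(A, B)
calcOverlap(A, C)
# Jaccard index and BUB, compared against the global set
calcOverlap(A, B, TOT)
calcOverlap(A, C, TOT)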
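As a cross-check, the Jaccard values for these example sets can be worked out by hand with base R set functions. This sketch does not call calcOverlap; it assumes only the definition given in Details (intersection size divided by union size), and the variable names jacAB and jacAC are illustrative.

```r
A <- 1:100
B <- 25:75
C <- 60:150

# B lies entirely inside A, so the intersection is B (51 values)
# and the union is A (100 values).
jacAB <- length(intersect(A, B)) / length(union(A, B))  # 51 / 100

# A and C share 60:100 (41 values); their union is 1:150 (150 values).
jacAC <- length(intersect(A, C)) / length(union(A, C))  # 41 / 150

round(c(AB = jacAB, AC = jacAC), 4)
# AB is 0.51, AC is 0.2733
```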

weshorton/wrh.rUtils documentation built on Oct. 28, 2024, 7:24 a.m.