This vignette demonstrates functions in the data range.

library(shrt)

ji

ji computes the Jaccard index of two sets. The Jaccard index is a commonly used metrics in data science. It is defined as the size of the set intersection, normalized by the size of the union.

setA = c("alpha", "beta")
setB = c("beta", "gamma", "delta")
ji(setA, setB)

oi

oi computes the overlap index of two sets. This index normalizes the size of set intersection by the size of the first set only. (This is in contrast with the Jaccard index, which normalizes the size of the union of the two sets).

setA = c("alpha", "beta")
setB = c("beta", "gamma", "delta")
oi(setA, setB)

Note that the overlap index is not symmetrical.

oi(setB, setA)

setsummary

setsummary computes an overview of set membership. It takes two vectors that represent sets of objects. It outputs a list with summary statistics: the set of private members in each of the two inputs, the set of common objects, and a vector with the element counts.

setA = c("alpha", "beta")
setB = c("beta", "gamma", "delta")
setsummary(setA, setB)


tkonopka/shrt documentation built on March 5, 2020, 2:51 p.m.