jaccard.test: Test for Jaccard/Tanimoto similarity coefficients

View source: R/jaccard.test.R

Description

Compute statistical significance of Jaccard/Tanimoto similarity coefficients between binary vectors, using four different methods.

Usage

jaccard.test(x, y, method = "mca", px = NULL, py = NULL, verbose = TRUE,
  ...)

Arguments

x

a binary vector (e.g., fingerprint)

y

a binary vector (e.g., fingerprint)

method

a method to compute a p-value ("mca", "bootstrap", "asymptotic", or "exact")

px

probability of success (i.e., of observing a 1) in x (optional)

py

probability of success (i.e., of observing a 1) in y (optional)

verbose

whether to print progress messages

...

optional arguments for specific computational methods

Details

Four methods are available for computing p-values of Jaccard/Tanimoto similarity coefficients: mca, bootstrap, asymptotic, and exact. This function is simply a wrapper for the four corresponding functions in this package: jaccard.test.mca, jaccard.test.bootstrap, jaccard.test.asymptotic, and jaccard.test.exact.

We recommend using either the mca or the bootstrap method, since the exact solution is slow for moderately large vectors and the asymptotic approximation may be inaccurate depending on the size of the input vectors. The bootstrap method resamples the binary vectors with replacement to compute a p-value (see optional arguments). The mca method uses the measure concentration algorithm, which estimates the multinomial distribution with a known error bound (specified by the optional argument accuracy).
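To make the resampling idea concrete, the following is a minimal base-R sketch of a centered Jaccard coefficient and a resampling-based p-value. It is not the package's implementation: the helper names (jaccard_coef, jaccard_expected, jaccard_boot_pvalue) are hypothetical, and the expectation formula px*py / (px + py - px*py) and the two-sided p-value are assumptions made for illustration.

```r
# Plain Jaccard/Tanimoto coefficient of two binary vectors
jaccard_coef <- function(x, y) {
  both   <- sum(x == 1 & y == 1)
  either <- sum(x == 1 | y == 1)
  if (either == 0) return(0)  # convention chosen for this sketch
  both / either
}

# Expected coefficient when x and y are independent (assumed formula)
jaccard_expected <- function(px, py) {
  denom <- px + py - px * py
  if (denom == 0) return(0)
  (px * py) / denom
}

# Resampling-based p-value for the centered coefficient.
# x and y are resampled independently, which breaks their pairing
# and approximates the null hypothesis of independence.
jaccard_boot_pvalue <- function(x, y, B = 1000, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  observed <- jaccard_coef(x, y) - jaccard_expected(mean(x), mean(y))
  null_stats <- replicate(B, {
    xb <- sample(x, replace = TRUE)
    yb <- sample(y, replace = TRUE)
    jaccard_coef(xb, yb) - jaccard_expected(mean(xb), mean(yb))
  })
  # Two-sided p-value with a +1 correction so it is never exactly zero
  (1 + sum(abs(null_stats) >= abs(observed))) / (B + 1)
}

set.seed(1234)
x <- rbinom(100, 1, 0.5)
y <- rbinom(100, 1, 0.5)
p <- jaccard_boot_pvalue(x, y, B = 500, seed = 1)
```

For real analyses, jaccard.test(x, y, method = "bootstrap") should be preferred; the sketch only illustrates why independent resampling of the two vectors yields a null distribution for the centered statistic.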

Value

jaccard.test returns a list mainly consisting of

statistics

centered Jaccard/Tanimoto similarity coefficient

pvalue

p-value

expectation

expected value of the Jaccard/Tanimoto similarity coefficient under independence
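As a brief illustration, the elements listed above can be read from the returned list by name. This sketch assumes the jaccard package is installed; the asymptotic method is chosen only because it is fast, and the result is assumed to carry all three elements for that method.

```r
library(jaccard)

set.seed(1234)
x <- rbinom(100, 1, 0.5)
y <- rbinom(100, 1, 0.5)

# Run the test quietly and keep the returned list
res <- jaccard.test(x, y, method = "asymptotic", verbose = FALSE)

res$statistics   # centered Jaccard/Tanimoto similarity coefficient
res$pvalue       # p-value
res$expectation  # expectation
```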

Optional arguments for method="bootstrap"

fix

whether to fix (i.e., not resample) x and/or y

B

the total number of bootstrap iterations

seed

a seed for a random number generator

Optional arguments for method="mca"

accuracy

an error bound on approximating a multinomial distribution

error.type

an error type on approximating a multinomial distribution ("average", "upper", "lower")

seed

a seed for the random number generator
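The method-specific arguments listed above are passed through the ... argument of jaccard.test. The sketch below assumes the jaccard package is installed; the particular values (B = 500, seed = 42, accuracy = 1e-05) are illustrative choices, not recommendations from the package.

```r
library(jaccard)

set.seed(1234)
x <- rbinom(100, 1, 0.5)
y <- rbinom(100, 1, 0.5)

# Bootstrap: choose the number of iterations and fix the RNG seed
res_boot <- jaccard.test(x, y, method = "bootstrap",
                         B = 500, seed = 42, verbose = FALSE)

# MCA: set the error bound and the error type of the approximation
res_mca <- jaccard.test(x, y, method = "mca",
                        accuracy = 1e-05, error.type = "average",
                        verbose = FALSE)
```

Setting seed makes the resampling-based result reproducible across runs, which is useful when p-values are reported.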

See Also

jaccard.test.bootstrap, jaccard.test.mca, jaccard.test.exact, jaccard.test.asymptotic

Examples

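library(jaccard)

# Simulate two independent binary vectors of length 100
set.seed(1234)
x <- rbinom(100, 1, 0.5)
y <- rbinom(100, 1, 0.5)

# Test the similarity of x and y with each of the four methods
jaccard.test(x, y, method = "bootstrap")
jaccard.test(x, y, method = "mca")
jaccard.test(x, y, method = "exact")
jaccard.test(x, y, method = "asymptotic")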

ncchung/jaccard documentation built on Dec. 31, 2019, 8:20 p.m.