Calculates the Similarity or Dissimilarity Between Two Fingerprints

Description

A number of metrics can be calculated for binary fingerprints. Some are distance (dissimilarity) metrics; others are similarity metrics, for which a larger value means the two fingerprints are more alike.

The following are distance (dissimilarity) metrics:

  • Hamming

  • Mean Hamming

  • Soergel

  • Pattern Difference

  • Variance

  • Size

  • Shape
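As a rough illustration of what a distance metric on binary fingerprints measures, the base R sketch below computes the textbook Hamming distance (the number of bit positions that differ) and its length-scaled mean. The helper `as_bit_vector` is hypothetical and is not part of the package, and the package's own formulas for these metrics may differ in detail.

```r
# Hypothetical helper (not part of the package): expand the positions of the
# "on" bits into a logical vector of length nbit.
as_bit_vector <- function(bits, nbit) seq_len(nbit) %in% bits

v1 <- as_bit_vector(c(1, 2, 5, 6), nbit = 6)
v2 <- as_bit_vector(c(1, 2, 3), nbit = 6)

# Textbook Hamming distance: count of positions where the bits differ.
hamming <- sum(v1 != v2)

# Mean Hamming: the Hamming distance scaled by the fingerprint length.
mean_hamming <- hamming / length(v1)
```

Here the two vectors differ at positions 3, 5 and 6, so identical fingerprints would give 0 and larger values mean greater dissimilarity.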

The following metrics are similarity metrics, so the corresponding distance can be obtained by subtracting the value from 1.0:

  • Tanimoto

  • Dice

  • Modified Tanimoto

  • Simple

  • Jaccard

  • Russell-Rao

  • Rodgers Tanimoto

  • Cosine

  • Achiai

  • Carbo

  • Baroniurbanibuser

  • Kulczynski2
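The conversion from similarity to distance can be sketched in base R using the standard set-based definition of the Tanimoto coefficient (bits set in both fingerprints divided by bits set in either). The function `tanimoto_sim` is a hypothetical helper written for this illustration, not the package's implementation, and returning `NA` when no bits are set is an assumption.

```r
# Hypothetical helper (not part of the package): Tanimoto similarity computed
# from the positions of the "on" bits of two fingerprints.
tanimoto_sim <- function(bits1, bits2) {
  common <- length(intersect(bits1, bits2))  # bits set in both
  total  <- length(union(bits1, bits2))      # bits set in at least one
  if (total == 0) return(NA_real_)           # assumption: undefined for empty fingerprints
  common / total
}

sim  <- tanimoto_sim(c(1, 2, 5, 6), c(1, 2, 3))
dist <- 1 - sim  # similarity converted to a distance
```

With the package itself, the same conversion would presumably be written as `1 - distance(fp1, fp2, method = "tanimoto")`.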

Finally, the method also provides a set of composite and asymmetric distance metrics:

  • Hamann

  • Yule

  • Pearson

  • Dispersion

  • McConnaughey

  • Stiles

  • Simpson

  • Petke

The default metric is the Tanimoto coefficient.

Usage

distance(fp1, fp2, method)

Arguments

fp1

An object of class fingerprint

fp2

An object of class fingerprint

method

The type of distance metric desired. Partial matching is supported and the default is tanimoto. Alternative values are:

  • euclidean

  • hamming

  • meanHamming

  • soergel

  • patternDifference

  • variance

  • size

  • shape

  • jaccard

  • dice

  • mt

  • simple

  • russelrao

  • rodgerstanimoto

  • cosine

  • achiai

  • carbo

  • baroniurbanibuser

  • kulczynski2

  • hamann

  • yule

  • pearson

  • mcconnaughey

  • stiles

  • simpson

  • petke
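Partial matching of the method name behaves like base R's `match.arg`, which resolves an unambiguous prefix to the full value. The sketch below assumes that kind of matching; `pick_method` is a hypothetical wrapper with a shortened list of values, and the package's internal handling is not shown in this page.

```r
# Hypothetical wrapper (not part of the package) showing how match.arg
# resolves a partial method name. Only a few of the values are listed here.
pick_method <- function(method = c("tanimoto", "euclidean", "hamming",
                                   "meanHamming", "soergel")) {
  match.arg(method)
}

default_method <- pick_method()        # no argument: the first value is used
from_prefix    <- pick_method("tani")  # unique prefix of "tanimoto"
from_prefix2   <- pick_method("mean")  # unique prefix of "meanHamming"
```

With `match.arg`, a prefix that matches more than one value, or none, raises an error, so abbreviations should be long enough to be unique.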

Value

Numeric value of the specified metric computed between the two supplied fingerprint objects. For similarity metrics such as Tanimoto, the value returned is the similarity itself, not a distance.

Author(s)

Rajarshi Guha rguha@indiana.edu

References

Fligner, M.A.; Verducci, J.S.; Blower, P.E.; A Modification of the Jaccard-Tanimoto Similarity Index for Diverse Selection of Chemical Compounds Using Binary Strings, Technometrics, 2002, 44(2), 110-119

Monev, V.; Introduction to Similarity Searching in Chemistry, MATCH - Comm. Math. Comp. Chem., 2004, 51, 7-38

Examples

# make two identical fingerprint objects
fp1 <- new("fingerprint", nbit=6, bits=c(1,2,5,6))
fp2 <- new("fingerprint", nbit=6, bits=c(1,2,5,6))

# calculate the Tanimoto coefficient (the default method)
distance(fp1, fp2) # should be 1

# invert the second fingerprint so that it shares no set bits with fp1
fp3 <- !fp2

distance(fp1, fp3) # should be 0