distance-methods    R Documentation

Calculates the Similarity or Dissimilarity Between Two Fingerprints

Description

A number of distance metrics can be calculated for binary fingerprints. Some of these are actually similarity metrics, for which larger values indicate more similar fingerprints, and thus represent the reverse of a distance metric.

The following are distance (dissimilarity) metrics:

  • Hamming

  • Mean Hamming

  • Soergel

  • Pattern Difference

  • Variance

  • Size

  • Shape
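
As a rough illustration of the simplest entries in this list, the base-R sketch below computes a Hamming distance (the number of bit positions in which two fingerprints differ) and a mean Hamming distance (that count divided by the fingerprint length). It uses the textbook definitions and represents each fingerprint by the positions of its set bits. The helper names `hamming_sketch` and `mean_hamming_sketch` are hypothetical, and the package's own implementation may differ in detail.

```r
# Fingerprints are represented by the positions of their "on" bits,
# mirroring the bits= argument used in the Examples section.
# These helpers are illustrative only and are not part of the package.
hamming_sketch <- function(bits1, bits2) {
  # Count bits set in exactly one of the two fingerprints
  only1 <- setdiff(bits1, bits2)
  only2 <- setdiff(bits2, bits1)
  length(only1) + length(only2)
}

mean_hamming_sketch <- function(bits1, bits2, nbit) {
  # Normalise by the fingerprint length so the result lies in [0, 1]
  hamming_sketch(bits1, bits2) / nbit
}

x <- c(1, 2, 5, 6)
y <- c(1, 2, 3)
h  <- hamming_sketch(x, y)                  # bits 3, 5 and 6 differ
mh <- mean_hamming_sketch(x, y, nbit = 6)
```

Identical fingerprints give a value of 0 under both helpers, which is the behaviour expected of a distance.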

The following metrics are similarity metrics, so the distance can be obtained by subtracting the value from 1.0:

  • Tanimoto

  • Dice

  • Modified Tanimoto

  • Simple

  • Jaccard

  • Russell-Rao

  • Rogers-Tanimoto

  • Cosine

  • Ochiai

  • Carbo

  • Baroni-Urbani/Buser

  • Kulczynski2

  • Robust
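
The conversion from similarity to distance can be sketched in base R for the Tanimoto coefficient, taken here as the number of bits set in both fingerprints divided by the number set in either. The helper `tanimoto_sketch` is hypothetical and works on vectors of set-bit positions rather than on fingerprint objects; it is meant only to show the arithmetic, not the package's implementation.

```r
# Illustrative helper, not part of the package: Tanimoto similarity
# computed from the positions of the set bits in each fingerprint.
tanimoto_sketch <- function(bits1, bits2) {
  n_common <- length(intersect(bits1, bits2))  # bits set in both
  n_either <- length(union(bits1, bits2))      # bits set in at least one
  n_common / n_either  # NaN when neither fingerprint has any bit set
}

x <- c(1, 2, 5, 6)
y <- c(1, 2, 3)

sim    <- tanimoto_sketch(x, y)  # 2 shared bits out of 5 set in either
dissim <- 1 - sim                # corresponding distance
```

With these inputs the similarity is 2/5, so the derived distance is 3/5; two identical fingerprints give a similarity of 1 and a distance of 0.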

Finally, the method also provides a set of composite and asymmetric distance metrics:

  • Hamann

  • Yule

  • Pearson

  • Dispersion

  • McConnaughey

  • Stiles

  • Simpson

  • Petke

  • Tversky

The default metric is the Tanimoto coefficient.

Usage

distance(fp1, fp2, method, a, b)

Arguments

fp1

An object of class fingerprint or featvec

fp2

An object of class fingerprint or featvec

a

Parameter for the Tversky index

b

Parameter for the Tversky index

method

The type of distance metric desired. Partial matching is supported and the default is tanimoto. Alternative values are:

  • euclidean

  • hamming

  • meanHamming

  • soergel

  • patternDifference

  • variance

  • size

  • shape

  • jaccard

  • dice

  • mt

  • simple

  • russelrao

  • rodgerstanimoto

  • cosine

  • achiai

  • carbo

  • baroniurbanibuser

  • kulczynski2

  • robust

  • hamann

  • yule

  • pearson

  • mcconnaughey

  • stiles

  • simpson

  • petke

  • tversky

If the two fingerprints are of class featvec then the following methods may be specified: tanimoto, robust and dice.
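
Partial matching of the method name can be pictured with base R's `match.arg`, which accepts any unambiguous prefix of a valid choice. The sketch below assumes the package resolves names in a similar way, which the text above does not confirm; `resolve_method` and the shortened list `methods_sketch` are hypothetical and exist only for this illustration.

```r
# A shortened, illustrative subset of the method names listed above
methods_sketch <- c("tanimoto", "euclidean", "hamming", "meanHamming",
                    "soergel", "dice", "hamann", "tversky")

# Hypothetical helper: resolve a possibly abbreviated method name.
# With no argument, the first choice (tanimoto) is returned.
resolve_method <- function(method = methods_sketch) {
  match.arg(method, methods_sketch)
}

default_name <- resolve_method()        # falls back to the first choice
from_prefix  <- resolve_method("tani")  # unique prefix of "tanimoto"
from_hamm    <- resolve_method("hamm")  # unique prefix of "hamming"

# "ham" is a prefix of both "hamming" and "hamann", so it is rejected
ambiguous <- try(resolve_method("ham"), silent = TRUE)
```

Under this scheme an abbreviation must be long enough to identify a single method, and matching is case-sensitive.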

Value

Numeric value representing the similarity or distance, in the specified metric, between the supplied fingerprint objects.

Methods

signature(fp1 = "featvec", fp2 = "featvec", method = "character", a = "missing", b = "missing")

Similarity method for feature vector type fingerprints, supporting tanimoto, robust and dice metrics.

signature(fp1 = "featvec", fp2 = "featvec", method = "missing", a = "missing", b = "missing")

Evaluate Tanimoto similarity between two feature vector fingerprints.

signature(fp1 = "fingerprint", fp2 = "fingerprint", method = "character", a = "missing", b = "missing")

Evaluate similarity (or dissimilarity) between two binary fingerprints. See the method argument for the list of possible similarity (or dissimilarity) metrics.

signature(fp1 = "fingerprint", fp2 = "fingerprint", method = "character", a = "numeric", b = "numeric")

Evaluate Tversky similarity between two binary fingerprints.

signature(fp1 = "fingerprint", fp2 = "fingerprint", method = "missing", a = "missing", b = "missing")

Evaluate Tanimoto similarity between two binary fingerprints.
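
To show how the two numeric parameters of the Tversky signature act, the base-R sketch below uses the standard form of the Tversky index: the number of shared bits divided by the shared bits plus `a` times the bits unique to the first fingerprint plus `b` times the bits unique to the second. Whether the package assigns `a` and `b` to the fingerprints in this order is an assumption, and `tversky_sketch` is a hypothetical helper operating on set-bit positions.

```r
# Illustrative helper, not part of the package: standard Tversky index
# computed from the positions of the set bits in each fingerprint.
tversky_sketch <- function(bits1, bits2, a, b) {
  n_common <- length(intersect(bits1, bits2))  # set in both
  n_only1  <- length(setdiff(bits1, bits2))    # set only in the first
  n_only2  <- length(setdiff(bits2, bits1))    # set only in the second
  n_common / (a * n_only1 + b * n_only2 + n_common)
}

x <- c(1, 2, 5, 6)
y <- c(1, 2, 3)

tv_tanimoto <- tversky_sketch(x, y, a = 1,   b = 1)    # same as Tanimoto
tv_dice     <- tversky_sketch(x, y, a = 0.5, b = 0.5)  # same as Dice
tv_asym_xy  <- tversky_sketch(x, y, a = 1,   b = 0)    # penalise only x's extra bits
tv_asym_yx  <- tversky_sketch(y, x, a = 1,   b = 0)    # arguments swapped
```

Setting `a` and `b` to unequal values makes the measure asymmetric: swapping the two fingerprints changes the result, which is why the index is grouped with the asymmetric metrics.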

Author(s)

Rajarshi Guha rajarshi.guha@gmail.com

References

Fligner, M.A.; Verducci, J.S.; Blower, P.E.; A Modification of the Jaccard-Tanimoto Similarity Index for Diverse Selection of Chemical Compounds Using Binary Strings, Technometrics, 2002, 44(2), 110-119

Monev, V.; Introduction to Similarity Searching in Chemistry, MATCH - Comm. Math. Comp. Chem., 2004, 51, 7-38

Examples

library(fingerprint)

# make two fingerprint vectors with the same bits set
fp1 <- new("fingerprint", nbit=6, bits=c(1,2,5,6))
fp2 <- new("fingerprint", nbit=6, bits=c(1,2,5,6))

# calculate the Tanimoto coefficient (the default method)
distance(fp1, fp2) # should be 1

# invert the second fingerprint so that it shares no bits with fp1
fp3 <- !fp2

distance(fp1, fp3) # should be 0

CDK-R/fingerprint documentation built on Oct. 23, 2022, 1:34 p.m.