Search function for fingerprints, such as PubChem or atom pair fingerprints. Enables structure similarity comparisons, searching and clustering.
x |
Query molecule of class |
y |
Subject molecule(s) of class |
sorted |
return results sorted or unsorted |
method |
Similarity coefficient to return. Choose one of the predefined similarity
measures: "Tanimoto" (default), "Euclidean", "Tversky" or "Dice".
Alternatively, pass any custom similarity function that takes the
arguments a, b, c and d. For instance, define "myfct <- function(a, b, c, d)
c/(alpha*a + beta*b + c)" and then pass it on with method=myfct. The predefined methods run a C++ version of this function, which is about twice as fast as the R version. When a custom similarity function is given, however, the search falls back to the R version. |
addone |
Value to add to the numerator and denominator of the similarity coefficient to avoid division by zero when fingerprint(s) contain only "off-bits" (zeros). Note: if |
cutoff |
Restricts results to hits above a similarity cutoff value; default |
top |
Restricts the number of subject molecules to return; default |
alpha |
Only used when method="Tversky". Specifies the weighting variable 'alpha' of the Tversky index: c/(alpha*a + beta*b + c) |
beta |
Only used when method="Tversky". Specifies the weighting variable 'beta' of the Tversky index. |
parameters |
Parameters for computing Z-scores, E-values, and p-values. Pass this data if you want these
scores returned. This data can be generated with the |
scoreType |
If using the |
Returns a numeric vector
with similarity coefficients as values and compound identifiers as names.
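The coefficients described under the method, alpha, beta and addone arguments can be illustrated with a small base R sketch that does not call the package itself. It assumes the usual reading of the four counts (a: on-bits only in the query, b: on-bits only in the subject, c: on-bits shared by both, d: off-bits shared by both), which is consistent with the Tversky formula given above. The helper names bit_counts and tversky are hypothetical, and the way addone enters the formula is an assumption based on its description, not a copy of the package code.

```r
# Count the four bit categories for two equal-length binary fingerprints.
# Assumed meaning: a = on-bits only in x, b = on-bits only in y,
# c = on-bits shared by both, d = off-bits shared by both.
bit_counts <- function(x, y) {
  stopifnot(length(x) == length(y))
  list(a = sum(x == 1 & y == 0),
       b = sum(x == 0 & y == 1),
       c = sum(x == 1 & y == 1),
       d = sum(x == 0 & y == 0))
}

# Tversky index as written in the argument description.
# addone is added to numerator and denominator (assumed placement).
tversky <- function(a, b, c, d, alpha = 1, beta = 1, addone = 0) {
  (c + addone) / (alpha * a + beta * b + c + addone)
}

x <- c(1, 1, 0, 1, 0, 0, 1, 0)
y <- c(1, 0, 0, 1, 1, 0, 1, 0)
n <- bit_counts(x, y)

# alpha = beta = 1 reduces the Tversky index to the Tanimoto coefficient
tanimoto <- tversky(n$a, n$b, n$c, n$d)

# alpha = beta = 0.5 gives the standard Dice coefficient, 2c/(a + b + 2c)
dice <- tversky(n$a, n$b, n$c, n$d, alpha = 0.5, beta = 0.5)

# Two all-zero fingerprints: 0/0 without addone, a defined value with it
empty_raw <- tversky(0, 0, 0, 8)
empty_fix <- tversky(0, 0, 0, 8, addone = 1)
```

A function with the same a, b, c, d signature is the kind of object the method argument accepts as a custom similarity measure.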
Thomas Girke, Kevin Horan
Tanimoto similarity coefficient: Tanimoto TT (1957) IBM Internal Report, 17th Nov; see also Jaccard P (1901) Bulletin de la Société Vaudoise des Sciences Naturelles 37, 241-272.
PubChem fingerprint specification: ftp://ftp.ncbi.nih.gov/pubchem/specifications/pubchem_fingerprints.txt
Functions: fp2bit
## Load PubChem SDFset sample
data(sdfsample); sdfset <- sdfsample
cid(sdfset) <- sdfid(sdfset)
## Convert base 64 encoded fingerprints to character vector or binary matrix
fpset <- fp2bit(sdfset)
## Alternatively, one can use atom pair fingerprints
## Not run:
fpset <- desc2fp(sdf2ap(sdfset))
## End(Not run)
## Pairwise compound structure comparisons
fpSim(x=fpset[1], y=fpset[2], method="Tanimoto")
## Structure similarity searching: x is query and y is fingerprint database
fpSim(x=fpset[1], y=fpset)
## Controlling the output
fpSim(x=fpset[1], y=fpset, method="Tversky", cutoff=0.4, top=4, alpha=0.5, beta=1)
## Use custom similarity function
myfct <- function(a, b, c, d) c/(a+b+c+d)
fpSim(x=fpset[1], y=fpset, method=myfct)
## Compute fingerprint-based Tanimoto similarity matrix
simMA <- sapply(cid(fpset), function(x) fpSim(x=fpset[x], fpset, sorted=FALSE))
## Hierarchical clustering with simMA as input
hc <- hclust(as.dist(1-simMA), method="single")
## Plot hierarchical clustering tree
plot(as.dendrogram(hc), edgePar=list(col=4, lwd=2), horiz=TRUE)
650002
0.3947368
650001 650094 650004 650085 650077 650079 650092 650074
1.0000000 0.5375000 0.5000000 0.5000000 0.4861111 0.4583333 0.4555556 0.4444444
650104 650102 650082 650072 650054 650011 650016 650087
0.4444444 0.4342105 0.4324324 0.4285714 0.4235294 0.4193548 0.4133333 0.4117647
650033 650048 650002 650039 650091 650032 650089 650056
0.4000000 0.3977273 0.3947368 0.3846154 0.3815789 0.3775510 0.3766234 0.3750000
650049 650067 650050 650090 650024 650020 650070 650097
0.3733333 0.3658537 0.3648649 0.3647059 0.3636364 0.3552632 0.3500000 0.3456790
650069 650046 650015 650075 650003 650071 650098 650012
0.3406593 0.3404255 0.3333333 0.3333333 0.3250000 0.3235294 0.3235294 0.3205128
650023 650096 650058 650026 650005 650065 650066 650041
0.3194444 0.3194444 0.3157895 0.3132530 0.3076923 0.3058824 0.3058824 0.3000000
650080 650044 650009 650013 650068 650078 650099 650061
0.2987013 0.2962963 0.2941176 0.2916667 0.2891566 0.2857143 0.2857143 0.2835821
650062 650007 650093 650105 650086 650019 650040 650052
0.2835821 0.2763158 0.2763158 0.2745098 0.2739726 0.2727273 0.2727273 0.2727273
650034 650028 650021 650031 650103 650038 650059 650060
0.2656250 0.2637363 0.2631579 0.2575758 0.2571429 0.2535211 0.2535211 0.2535211
650008 650083 650035 650037 650076 650063 650064 650017
0.2525253 0.2500000 0.2465753 0.2465753 0.2391304 0.2361111 0.2361111 0.2352941
650073 650100 650022 650029 650106 650030 650010 650045
0.2325581 0.2314815 0.2266667 0.2222222 0.2195122 0.2093023 0.2054795 0.2051282
650043 650025 650095 650006 650027 650036 650042 650081
0.2000000 0.1971831 0.1940299 0.1911765 0.1875000 0.1857143 0.1830986 0.1794872
650101 650014 650047 650088
0.1704545 0.1470588 0.1408451 0.0468750
650001 650034 650088 650031
1.0000000 1.0000000 1.0000000 0.8947368
650001 650094 650072 650092 650085 650011
0.061523438 0.041015625 0.040039062 0.039062500 0.038085938 0.037109375
650032 650074 650054 650004 650077 650048
0.035156250 0.034179688 0.034179688 0.034179688 0.033203125 0.033203125
650102 650079 650104 650082 650046 650090
0.031250000 0.031250000 0.030273438 0.030273438 0.030273438 0.029296875
650069 650016 650099 650067 650056 650039
0.029296875 0.029296875 0.028320312 0.028320312 0.028320312 0.028320312
650033 650015 650002 650091 650089 650105
0.028320312 0.028320312 0.028320312 0.027343750 0.027343750 0.026367188
650097 650087 650070 650049 650024 650075
0.026367188 0.026367188 0.026367188 0.026367188 0.026367188 0.025390625
650050 650020 650066 650065 650026 650003
0.025390625 0.025390625 0.024414062 0.024414062 0.024414062 0.024414062
650100 650012 650008 650068 650058 650044
0.023437500 0.023437500 0.023437500 0.022460938 0.022460938 0.022460938
650040 650028 650005 650096 650080 650023
0.022460938 0.022460938 0.022460938 0.021484375 0.021484375 0.021484375
650098 650076 650071 650093 650083 650052
0.020507812 0.020507812 0.020507812 0.019531250 0.019531250 0.019531250
650041 650019 650013 650007 650086 650078
0.019531250 0.019531250 0.019531250 0.019531250 0.018554688 0.018554688
650073 650021 650009 650062 650061 650106
0.018554688 0.018554688 0.018554688 0.017578125 0.017578125 0.016601562
650103 650060 650059 650038 650037 650035
0.016601562 0.016601562 0.016601562 0.016601562 0.016601562 0.016601562
650030 650064 650063 650034 650031 650022
0.016601562 0.015625000 0.015625000 0.015625000 0.015625000 0.015625000
650045 650029 650017 650101 650043 650027
0.014648438 0.014648438 0.014648438 0.013671875 0.013671875 0.013671875
650010 650081 650025 650095 650042 650036
0.013671875 0.012695312 0.012695312 0.011718750 0.011718750 0.011718750
650006 650047 650014 650088
0.011718750 0.008789062 0.008789062 0.001953125
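The similarity matrix and clustering steps in the examples can also be followed on toy data without the sample set. The sketch below uses base R only; the fingerprint matrix, the compound names cmpA to cmpD and the tanimoto helper are made up for illustration and stand in for the package's fingerprint objects and search function.

```r
# Toy binary fingerprint matrix: rows are hypothetical compounds, columns are bits
fp <- rbind(cmpA = c(1, 1, 0, 1, 0, 0),
            cmpB = c(1, 1, 0, 1, 1, 0),
            cmpC = c(0, 0, 1, 0, 1, 1),
            cmpD = c(0, 0, 1, 0, 0, 1))

# Plain Tanimoto coefficient: shared on-bits / on-bits present in either vector
tanimoto <- function(x, y) sum(x & y) / sum(x | y)

# All-against-all similarity matrix, built column by column
ids <- rownames(fp)
simMA <- sapply(ids, function(i)
  sapply(ids, function(j) tanimoto(fp[i, ], fp[j, ])))

# Convert similarities to distances and cluster with single linkage
hc <- hclust(as.dist(1 - simMA), method = "single")

# Cut the tree into two groups of compounds
clusters <- cutree(hc, k = 2)
```

Here cmpA and cmpB share most of their on-bits, as do cmpC and cmpD, so cutting the tree at two groups separates the two pairs.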