fp2bit: Convert base 64 fingerprints to binary


View source: R/sim.R

Description

The function converts base 64 encoded PubChem fingerprints to a binary matrix, a character vector, or an FPset object. If applied to an SDFset object, the data block of that object must contain the PubChem fingerprint information under the tag given by fptag.
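The decoding step can be illustrated with a minimal base R sketch. It is not the ChemmineR implementation: `b64_to_bits` and `pubchem_bits` are hypothetical helper names, and the bit layout (a 4-byte length prefix followed by 881 fingerprint bits and trailing padding) is an assumption taken from the PubChem fingerprint specification cited under References.

```r
## Hypothetical helper (not part of ChemmineR): expand a base 64 string
## into a vector of 0/1 integers, 6 bits per character, high bit first.
b64_to_bits <- function(s) {
  alphabet <- c(LETTERS, letters, 0:9, "+", "/")
  chars <- strsplit(gsub("=", "", s, fixed = TRUE), "")[[1]]
  idx <- match(chars, alphabet) - 1L          # values 0..63
  if (anyNA(idx)) stop("input contains non base 64 characters")
  weights <- c(32L, 16L, 8L, 4L, 2L, 1L)
  bits <- vapply(idx, function(v) as.integer((v %/% weights) %% 2L), integer(6))
  as.vector(bits)                              # column-major: character by character
}

## Hypothetical helper: keep only the fingerprint bits.
## Assumption: bits 1-32 hold the length prefix and bits 33-913 the 881 keys.
pubchem_bits <- function(s) {
  bits <- b64_to_bits(s)
  if (length(bits) < 913) stop("string too short for a PubChem fingerprint")
  bits[33:913]
}

## Toy check: "TWFu" is the base 64 encoding of the ASCII string "Man"
b64_to_bits("TWFu")
```

For real data, `fp2bit` should be preferred, since it also handles SDFset input and returns the container types described under Arguments.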

Usage

fp2bit(x, type = 3, fptag = "PUBCHEM_CACTVS_SUBSKEYS")

Arguments

x

Object of class SDFset, matrix or character

type

If set to 1, the results are returned as a binary matrix. If set to 2, the results are returned as character strings in a named vector. If set to 3 (default), the results are returned as an FPset object.

fptag

Name tag in SDF data block where the PubChem fingerprints are stored. Default is set to "PUBCHEM_CACTVS_SUBSKEYS".
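The three return types selected by `type` can be compared side by side. The following is a small sketch that assumes ChemmineR is installed and uses the `sdfsample` data set shipped with the package; the subset of four compounds and the variable names `fpma`, `fpchar` and `fpset` are chosen here only for illustration.

```r
library(ChemmineR)

## Use a small subset of the sample data to keep the output short
data(sdfsample)
sdfset <- sdfsample[1:4]
cid(sdfset) <- sdfid(sdfset)

## type = 1: binary matrix with one row per compound
fpma <- fp2bit(sdfset, type = 1)
dim(fpma)

## type = 2: named character vector, one string of 0s and 1s per compound
fpchar <- fp2bit(sdfset, type = 2)
names(fpchar)

## type = 3 (default): FPset object, the input expected by fpSim
fpset <- fp2bit(sdfset, type = 3)
class(fpset)
```

The matrix form is convenient for general-purpose R functions, while the FPset form is the one used in the similarity examples on this page.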

Details

...

Value

matrix, character or FPset

Author(s)

Thomas Girke

References

See PubChem fingerprint specification at: ftp://ftp.ncbi.nih.gov/pubchem/specifications/pubchem_fingerprints.txt

See Also

Functions: fpSim

Examples

## Load the package and the PubChem SDFset sample
library(ChemmineR)
data(sdfsample); sdfset <- sdfsample
cid(sdfset) <- sdfid(sdfset)

## Convert base 64 encoded fingerprints to FPset object
fpset <- fp2bit(sdfset)

## Pairwise compound structure comparisons
fpSim(fpset[1], fpset[2]) 

## Structure similarity searching: x is query and y is fingerprint database
fpSim(x=fpset[1], y=fpset, method="Tanimoto", cutoff=0, top="all") 

## Compute fingerprint based Tanimoto similarity matrix 
simMA <- sapply(cid(fpset), function(x) fpSim(x=fpset[x], fpset, sorted=FALSE)) 

## Hierarchical clustering with simMA as input
hc <- hclust(as.dist(1-simMA), method="single")

## Plot hierarchical clustering tree
plot(as.dendrogram(hc), edgePar=list(col=4, lwd=2), horiz=TRUE)
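
The Tanimoto coefficient requested from fpSim in the examples above is the number of bits set in both fingerprints divided by the number of bits set in at least one of them. A minimal base R sketch of that definition follows; `tanimoto` is a hypothetical helper, not a ChemmineR function, and returning NA when both fingerprints are empty is an assumption made here.

```r
## Hypothetical helper: Tanimoto coefficient of two binary vectors
tanimoto <- function(x, y) {
  stopifnot(length(x) == length(y))
  x <- as.logical(x)
  y <- as.logical(y)
  n_union <- sum(x | y)                 # bits set in x, in y, or in both
  if (n_union == 0) return(NA_real_)    # assumption: undefined for two empty fingerprints
  sum(x & y) / n_union                  # shared bits divided by the union
}

## Toy fingerprints: 2 shared bits, 4 bits in the union
a <- c(1, 1, 0, 1, 0)
b <- c(1, 0, 0, 1, 1)
tanimoto(a, b)
```

Applied to two rows of the binary matrix returned by `fp2bit(x, type = 1)`, this sketch should follow the same definition as the default method of fpSim, although fpSim remains the better choice for searching and sorting whole fingerprint databases.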

Example output

   650002 
0.5364807 
   650001    650034    650104    650092    650004    650075    650011    650046 
1.0000000 0.6950000 0.6763485 0.6666667 0.6448980 0.6422018 0.6401674 0.6288210 
   650099    650091    650073    650096    650019    650054    650012    650071 
0.6250000 0.6244541 0.6239316 0.6093750 0.6064257 0.6060606 0.5966387 0.5951220 
   650066    650065    650100    650080    650082    650041    650021    650068 
0.5914894 0.5889831 0.5875486 0.5875000 0.5852713 0.5727700 0.5701357 0.5696203 
   650013    650044    650102    650089    650094    650024    650097    650070 
0.5622490 0.5622318 0.5614035 0.5605381 0.5584416 0.5576037 0.5564516 0.5555556 
   650032    650077    650033    650039    650040    650016    650002    650017 
0.5502183 0.5482625 0.5454545 0.5436508 0.5416667 0.5403226 0.5364807 0.5277778 
   650022    650023    650009    650048    650015    650079    650026    650056 
0.5253456 0.5240175 0.5167464 0.5150215 0.5146444 0.5125000 0.5110294 0.5109170 
   650093    650007    650003    650006    650062    650037    650061    650076 
0.5066667 0.5043103 0.5041322 0.5000000 0.5000000 0.4977376 0.4976526 0.4961240 
   650105    650060    650078    650059    650072    650030    650069    650014 
0.4957627 0.4954955 0.4954955 0.4932735 0.4932127 0.4924242 0.4860558 0.4854369 
   650098    650029    650049    650090    650050    650086    650074    650081 
0.4793388 0.4766355 0.4736842 0.4730769 0.4728033 0.4721030 0.4703390 0.4698276 
   650045    650035    650064    650063    650027    650028    650010    650020 
0.4672131 0.4647303 0.4615385 0.4595745 0.4574899 0.4541485 0.4489796 0.4403670 
   650067    650083    650101    650058    650103    650087    650008    650038 
0.4401544 0.4372093 0.4251012 0.4192140 0.4190871 0.4142259 0.4090909 0.4043478 
   650025    650042    650085    650047    650106    650095    650036    650005 
0.4040816 0.4027149 0.4016393 0.4008621 0.3876652 0.3852814 0.3686441 0.3271028 
   650031    650052    650088    650043 
0.2723005 0.2638889 0.2477876 0.2056075 

ChemmineR documentation built on Feb. 28, 2021, 2:02 a.m.