Convert base 64 fingerprints to binary

Share:

Description

The function converts the base 64 encoded PubChem fingerprints to a binary matrix or a character vector. If applied to a SDFset object, then its data block needs to contain the PubChem fingerprint information.

Usage

1
fp2bit(x, type = 3, fptag = "PUBCHEM_CACTVS_SUBSKEYS")

Arguments

x

Object of class SDFset, matrix or character

type

If set to 1, the results are returned as binary matrix. If set to 2, the results are returned as character strings in a named vector. If set to 3 (default), the results are returned as FPset object.

fptag

Name tag in SDF data block where the PubChem fingerprints are stored. Default is set to "PUBCHEM_CACTVS_SUBSKEYS".

Details

...

Value

matrix, character or FPset

Author(s)

Thomas Girke

References

See PubChem fingerprint specification at: ftp://ftp.ncbi.nih.gov/pubchem/specifications/pubchem_fingerprints.txt

See Also

Functions: fpSim

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
## Load PubChem SDFset sample
data(sdfsample); sdfset <- sdfsample
cid(sdfset) <- sdfid(sdfset)

## Convert base 64 encoded fingerprints to FPset object
fpset <- fp2bit(sdfset)

## Pairwise compound structure comparisons
fpSim(fpset[1], fpset[2]) 

## Structure similarity searching: x is query and y is fingerprint database
fpSim(x=fpset[1], y=fpset, method="Tanimoto", cutoff=0, top="all") 

## Compute fingerprint based Tanimoto similarity matrix 
simMA <- sapply(cid(fpset), function(x) fpSim(x=fpset[x], fpset, sorted=FALSE)) 

## Hierarchical clustering with simMA as input
hc <- hclust(as.dist(1-simMA), method="single")

## Plot hierarchical clustering tree
plot(as.dendrogram(hc), edgePar=list(col=4, lwd=2), horiz=TRUE)

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.