fp2bit: Convert base 64 fingerprints to binary

View source: R/sim.R

fp2bitR Documentation

Convert base 64 fingerprints to binary

Description

The function converts the base 64 encoded PubChem fingerprints to a binary matrix or a character vector. If applied to a SDFset object, then its data block needs to contain the PubChem fingerprint information.

Usage

fp2bit(x, type = 3, fptag = "PUBCHEM_CACTVS_SUBSKEYS")

Arguments

x

Object of class SDFset, matrix or character

type

If set to 1, the results are returned as binary matrix. If set to 2, the results are returned as character strings in a named vector. If set to 3 (default), the results are returned as FPset object.

fptag

Name tag in SDF data block where the PubChem fingerprints are stored. Default is set to "PUBCHEM_CACTVS_SUBSKEYS".

Details

...

Value

matrix, character or FPset

Author(s)

Thomas Girke

References

See PubChem fingerprint specification at: ftp://ftp.ncbi.nih.gov/pubchem/specifications/pubchem_fingerprints.txt

See Also

Functions: fpSim

Examples

## Load PubChem SDFset sample
data(sdfsample); sdfset <- sdfsample
cid(sdfset) <- sdfid(sdfset)

## Convert base 64 encoded fingerprints to FPset object
fpset <- fp2bit(sdfset)

## Pairwise compound structure comparisons
fpSim(fpset[1], fpset[2]) 

## Structure similarity searching: x is query and y is fingerprint database
fpSim(x=fpset[1], y=fpset, method="Tanimoto", cutoff=0, top="all") 

## Compute fingerprint based Tanimoto similarity matrix 
simMA <- sapply(cid(fpset), function(x) fpSim(x=fpset[x], fpset, sorted=FALSE)) 

## Hierarchical clustering with simMA as input
hc <- hclust(as.dist(1-simMA), method="single")

## Plot hierarchical clustering tree
plot(as.dendrogram(hc), edgePar=list(col=4, lwd=2), horiz=TRUE)

girke-lab/ChemmineR documentation built on July 28, 2023, 10:36 a.m.