read.SDFindex: Extract Molecules from SD File by Line Index

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/AllClasses.R

Description

Extracts specific molecules from SD File based on a line position index computed by the sdfStream function.

Usage

1
read.SDFindex(file, index, type = "SDFset", outfile)

Arguments

file

file name of source SD file used to generate index

index

data frame containing in the first two columns the start and end positions (index) of molecules in an SD File, respectively. Typically, this index would be imported with read.table/read.delim from a tabular descriptor file generated by the sdfStream function.

type

if type="file", the SDF output will be written to a file named as specified under outfile; if type="SDFset", the SDF data is collected will be a SDFset container.

outfile

name of output file when type="file"

Details

...

Value

Writes molecules in SDF format to file or collects them in SDFset container.

Author(s)

Thomas Girke

References

SDF format definition: http://www.symyx.com/downloads/public/ctfile/ctfile.jsp

See Also

Import/export functions: read.SDFset, read.SDFstr, read.SDFstr, read.SDFset, write.SDFsplit

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
## Load sample data
library(ChemmineR)
data(sdfsample); sdfset <- sdfsample
## Not run: write.SDF(sdfset, "test.sdf")

## Define descriptor set in a simple function
desc <- function(sdfset) {
        cbind(SDFID=sdfid(sdfset), 
              # datablock2ma(datablocklist=datablock(sdfset)), 
              MW=MW(sdfset), 
              groups(sdfset), 
              # AP=sdf2ap(sdfset, type="character"),
              rings(sdfset, type="count", upper=6, arom=TRUE)
        )
}

## Run sdfStream with desc function and write results to a file called 'matrix.xls'
sdfStream(input="test.sdf", output="matrix.xls", fct=desc, Nlines=1000)

## Select molecules from SD File using line index from sdfStream
indexDF <- read.delim("matrix.xls", row.names=1)[,1:4]
indexDFsub <- indexDF[indexDF$MW < 400, ] # Selects molecules with MW < 400
sdfset <- read.SDFindex(file="test.sdf", index=indexDFsub, type="SDFset")

## Write result directly to SD file without storing larger numbers of molecules in memory
read.SDFindex(file="test.sdf", index=indexDFsub, type="file", outfile="sub.sdf")

## End(Not run)

ChemmineR documentation built on Feb. 28, 2021, 2:02 a.m.