rs.makeDB | R Documentation |
Reads and parses an input text file containing reaction SMILES into a reaction database object. The reaction database is used to query the reaction similarity of candidate reactions.
rs.makeDB (txtFile, header = FALSE, sep = '\t', standardize = TRUE, explicitH = FALSE,
fp.type = 'extended', fp.mode = 'bit', fp.depth = 6, fp.size = 1024,
useMask = FALSE, maskStructure, mask, recursive = FALSE)
txtFile |
input file containing EC numbers, reaction name and RSMI. See description for format of input file. |
header |
boolean to indicate if the input file contains a header. It is set to |
sep |
the field separator character to be used while reading the input file. |
standardize |
suppresses all explicit hydrogen if set as |
explicitH |
converts all implicit hydrogen to explicit if set as |
fp.type |
Fingerprint type to use. Allowed types include: |
fp.mode |
fingerprint mode to be used. It can either be set to |
fp.depth |
search depth for fingerprint construction. This argument is ignored for |
fp.size |
length of the fingerprint bit string. This argument is ignored for |
useMask |
boolean to indicate use of masking. If |
maskStructure |
SMILES or SMARTS of the structure to be searched and masked. |
mask |
SMILES of structure to be used as mask. |
recursive |
if |
The parameters used to generate fingerprints are stored in the database object and returned with the parsed data. The same parameter values are used when parsing the input reaction in rs.compute.DB.
The input text file should contain the following three fields, separated by TAB
(or any appropriate field separator). A field can be left blank.
[EC Number] | [Reaction Name] | [Reaction SMILES (RSMI)] |
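As a minimal sketch of the three-field layout described above, the following base R code writes a small TAB-separated file and reads it back to check the structure. The reactions, EC number, names, and the variable names `reactions`, `db_file`, and `check` are illustrative assumptions, not data shipped with the package. The final call uses only the arguments shown in the usage section and runs only if RxnSim is installed.

```r
# Hypothetical example reactions (illustrative only; cofactors omitted).
# The second reaction leaves the EC number field blank, which the format allows.
reactions <- data.frame(
  ec   = c("1.1.1.1", ""),
  name = c("ethanol oxidation (simplified)", "methyl acetate hydrolysis"),
  rsmi = c("CCO>>CC=O", "CC(=O)OC.O>>CC(=O)O.CO"),
  stringsAsFactors = FALSE
)

# Write the fields TAB-separated, without a header row or quoting,
# matching header = FALSE and sep = '\t' in the usage above.
db_file <- tempfile(fileext = ".txt")
write.table(reactions, file = db_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the file back with base R to confirm it has three fields per line.
check <- read.delim(db_file, header = FALSE, sep = "\t",
                    colClasses = "character")
print(dim(check))

# Build the reaction database only when the package is available.
if (requireNamespace("RxnSim", quietly = TRUE)) {
  db <- RxnSim::rs.makeDB(db_file, header = FALSE, sep = "\t")
}
```

Checking the file with `read.delim` first can help catch a wrong separator or a stray header line before any fingerprints are computed.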
The package comes with a sample reaction database file extracted from the Rhea database (Morgat et al., 2017). If no text file
is provided, the default sample database file is used:
rs.makeDB()
A larger dataset containing all reactions from the Rhea database (v.83) is also provided with the package.
Returns a list containing the parsed input data and the reaction fingerprints.
Data |
data frame containing EC Numbers, Reaction Names and RSMI as read from the input file. MaskedRSMI are also included if masking is used. |
FP |
list of molecular fingerprints for each reaction in the input file. These fingerprints are further processed based on the reaction similarity algorithm. |
It also contains the parameter values used for generating fingerprints, viz., standardize, explicitH, fp.type, fp.mode, fp.depth and fp.size.
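The returned list can be inspected with ordinary list operations. The sketch below defines a hypothetical helper, `describe_db` (not part of the package), that counts the rows in `Data`, counts the entries in `FP`, and collects whichever of the listed parameter values are present. It assumes the parameters are stored as top-level elements under the names given above; if they are stored differently, the `params` element will simply be empty.

```r
# Hypothetical helper (not part of RxnSim): summarize a reaction database list.
describe_db <- function(db) {
  # Parameter names as listed in the Value section; their presence as
  # top-level list elements is an assumption, so only matching names are kept.
  param_names <- c("standardize", "explicitH", "fp.type",
                   "fp.mode", "fp.depth", "fp.size")
  list(
    n_reactions    = nrow(db$Data),    # rows read from the input file
    n_fingerprints = length(db$FP),    # one fingerprint entry per reaction
    params         = db[intersect(param_names, names(db))]
  )
}

# Summarize the default sample database only when the package is available.
if (requireNamespace("RxnSim", quietly = TRUE)) {
  db <- RxnSim::rs.makeDB()
  str(describe_db(db), max.level = 2)
}
```

Because the helper only reads `Data`, `FP`, and named elements, it also works on any list with the same shape, which makes it easy to try out without building a database.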
Varun Giri varungiri@gmail.com
Morgat, A., Lombardot, T., Axelsen, K., Aimo, L., Niknejad, A., Hyka-Nouspikel, N., Coudert, E., Pozzato, M., Pagni, M., Moretti, S., Rosanoff, S., Onwubiko, J., Bougueleret, L., Xenarios, I., Redaschi, N., Bridge, A. (2017) Updates in Rhea - an expert curated resource of biochemical reactions. Nucleic Acids Research, 45:D415-D418; doi: 10.1093/nar/gkw990
rs.compute.DB, rs.mask