| rs.makeDB | R Documentation |
Reads and parses an input text file containing reaction SMILES into a reaction database object. The reaction database is used for querying the reaction similarity of candidate reactions.
rs.makeDB (txtFile, header = FALSE, sep = '\t', standardize = TRUE, explicitH = FALSE,
fp.type = 'extended', fp.mode = 'bit', fp.depth = 6, fp.size = 1024,
useMask = FALSE, maskStructure, mask, recursive = FALSE)
txtFile |
input file containing EC numbers, reaction name and RSMI. See description for format of input file. |
header |
boolean to indicate if the input file contains a header. It is set to FALSE by default. |
sep |
the field separator character to be used while reading the input file. |
standardize |
suppresses all explicit hydrogens if set as TRUE. |
explicitH |
converts all implicit hydrogens to explicit if set as TRUE. |
fp.type |
Fingerprint type to use. Allowed types include: |
fp.mode |
fingerprint mode to be used. It can either be set to |
fp.depth |
search depth for fingerprint construction. This argument is ignored for |
fp.size |
length of the fingerprint bit string. This argument is ignored for |
useMask |
boolean to indicate use of masking. If |
maskStructure |
SMILES or SMARTS of the structure to be searched and masked. |
mask |
SMILES of structure to be used as mask. |
recursive |
if |
The parameters used to generate fingerprints are stored in the database object and returned with the parsed data. The same parameter values are used when parsing the input reaction in rs.compute.DB.
The input text file should contain the following three fields, separated by TAB (or any other appropriate field separator). A field can be left blank.
| [EC Number] | [Reaction Name] | [Reaction SMILES (RSMI)] |
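As a minimal sketch of the layout above, the following writes a small tab-separated file with one reaction per line and reads it back with base R to confirm that it has three fields. The rows are illustrative only (simplified reactions written for this example, not entries from the packaged database), the names `reactions`, `db_file` and `check` are hypothetical, and the guarded call assumes that rs.makeDB is provided by the RxnSim package.

```r
# Illustrative rows only; they are not taken from the packaged sample database.
# The third row leaves the reaction name blank, which the format allows.
reactions <- data.frame(
  EC   = c("1.1.1.1", "4.2.1.2", "3.1.1.1"),
  Name = c("alcohol dehydrogenase (simplified)", "fumarate hydratase", ""),
  RSMI = c("CCO>>CC=O",
           "OC(=O)C=CC(O)=O.O>>OC(CC(O)=O)C(O)=O",
           "COC(C)=O.O>>CC(O)=O.CO"),
  stringsAsFactors = FALSE
)

# Write EC number, reaction name and RSMI as TAB-separated fields, no header.
db_file <- tempfile(fileext = ".txt")
write.table(reactions, file = db_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read the file back with base R to confirm the three-field layout.
check <- read.table(db_file, sep = "\t", header = FALSE, quote = "",
                    comment.char = "", colClasses = "character",
                    stringsAsFactors = FALSE)

# Build the database only if the package is available (assumed to be RxnSim).
if (requireNamespace("RxnSim", quietly = TRUE)) {
  db <- RxnSim::rs.makeDB(db_file, header = FALSE, sep = "\t")
}
```

Setting `quote = ""` and `comment.char = ""` in the read-back step keeps characters such as `#`, which is a valid SMILES bond symbol, from being treated as a comment marker.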
The package comes with a sample reaction database file extracted from the Rhea database (Morgat et al., 2015). If no text file is provided, the default sample database file is used:
rs.makeDB()
A larger dataset containing all reactions from the Rhea database (v.83) is also provided with the package.
Returns a list containing the parsed input data and the reaction fingerprints.
Data |
data frame containing EC Numbers, Reaction Names and RSMI as read from the input file. MaskedRSMI are also included if masking is used. |
FP |
list of molecular fingerprints for each reaction in the input file. These fingerprints are further processed based on the reaction similarity algorithm. |
It also contains the parameter values used for generating fingerprints, viz., standardize, explicitH, fp.type, fp.mode, fp.depth and fp.size.
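The returned list can be inspected as sketched below. The helper `describe_db` is hypothetical and written only for this example; it relies solely on the `Data` and `FP` elements described above and treats every other element of the list as a stored parameter. The guarded call assumes that rs.makeDB is provided by the RxnSim package.

```r
# Hypothetical helper: summarise a database list with 'Data' and 'FP' elements.
describe_db <- function(db) {
  list(
    n_reactions    = nrow(db$Data),     # rows read from the input file
    n_fingerprints = length(db$FP),     # one fingerprint entry per reaction
    # everything other than Data and FP is treated as a stored parameter
    parameters     = db[setdiff(names(db), c("Data", "FP"))]
  )
}

# Use the default sample database if the package is available (assumed RxnSim).
if (requireNamespace("RxnSim", quietly = TRUE)) {
  db <- RxnSim::rs.makeDB()
  print(head(db$Data))
  str(describe_db(db))
}
```

Checking the stored parameters before querying is a quick way to see which fingerprint settings a database was built with.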
Varun Giri varungiri@gmail.com
Morgat, A., Lombardot, T., Axelsen, K., Aimo, L., Niknejad, A., Hyka-Nouspikel, N., Coudert, E., Pozzato, M., Pagni, M., Moretti, S., Rosanoff, S., Onwubiko, J., Bougueleret, L., Xenarios, I., Redaschi, N., Bridge, A. (2017) Updates in Rhea - an expert curated resource of biochemical reactions. Nucleic Acids Research, 45:D415-D418; doi: 10.1093/nar/gkw990
rs.compute.DB, rs.mask