Preprocessing makes the data suitable for the model, and the steps applied depend on the type of data. Preprocessing is more time-consuming for text data. From the input sequences (SMILES strings or amino acid sequences), the function produces the adjacency matrix and node features, the fingerprint, or the encoded string data.
seq_preprocessing(smiles = NULL,
                  AAseq = NULL,
                  type,
                  convert_canonical_smiles,
                  max_atoms,
                  length_seq,
                  lenc = NULL,
                  ngram_max = 1,
                  ngram_min = 1)
Argument | Description
smiles | SMILES strings (default: NULL)
AAseq | amino acid sequences (default: NULL)
type | "graph", "fingerprint" or "sequence"
convert_canonical_smiles | if TRUE, SMILES strings are converted to canonical SMILES strings
max_atoms | maximum number of atoms for compounds
length_seq | length of compound or protein sequence
lenc | encoded labels for characters of SMILES strings or amino acid sequences (default: NULL)
ngram_max | maximum size of an n-gram for protein sequences (default: 1)
ngram_min | minimum size of an n-gram for protein sequences (default: 1)
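To illustrate how length_seq, lenc, ngram_min and ngram_max could interact, here is a minimal base R sketch of n-gram tokenization followed by label encoding and padding or truncation. The helpers char_ngrams and encode_pad are hypothetical names introduced for illustration and are not the package's internal code; using 0 as the padding label and padding on the right are assumptions as well.

```r
# Hypothetical helper (not part of the package): split a sequence into
# overlapping character n-grams for every size in ngram_min..ngram_max.
char_ngrams <- function(x, ngram_min = 1, ngram_max = 1) {
  chars <- strsplit(x, "")[[1]]
  tokens <- character(0)
  for (n in ngram_min:ngram_max) {
    if (n > length(chars)) break
    starts <- seq_len(length(chars) - n + 1)
    grams <- vapply(starts, function(i) {
      paste(chars[i:(i + n - 1)], collapse = "")
    }, character(1))
    tokens <- c(tokens, grams)
  }
  tokens
}

# Hypothetical helper: map tokens to integer labels, then truncate or
# right-pad with 0 so the result always has exactly length_seq entries.
# Assumption: tokens missing from the vocabulary share the padding label 0.
encode_pad <- function(tokens, vocab, length_seq) {
  ids <- match(tokens, vocab)
  ids[is.na(ids)] <- 0L
  ids <- ids[seq_len(min(length(ids), length_seq))]
  c(ids, rep(0L, length_seq - length(ids)))
}

aa <- "MKTAYIAK"                        # toy amino acid sequence
tokens <- char_ngrams(aa, ngram_min = 1, ngram_max = 2)
vocab <- sort(unique(tokens))           # stands in for the label encoding (lenc)
encoded <- encode_pad(tokens, vocab, length_seq = 20)
```

Fixing the output length this way lets sequences of different lengths be stacked into one matrix, which is why a single length_seq is applied to every sequence.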
Value | Description
canonical_smiles | canonical representation of SMILES
convert_canonical_smiles | whether the canonical representation is used
A_pad | padded or truncated adjacency matrix of compounds if type is "graph"
X_pad | padded or truncated node features of compounds if type is "graph"
fp | fingerprint of compounds if type is "fingerprint"
sequences_encode_pad | encoded sequences which are padded or truncated
lenc | encoded labels for characters of SMILES strings or amino acid sequences
length_seq | length of compound or protein sequence
num_tokens | total number of characters of compounds or proteins
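As a rough illustration of what "padded or truncated" means for A_pad and X_pad, the following base R sketch brings an adjacency matrix and a node feature matrix to a fixed max_atoms size. The helpers pad_square and pad_rows are hypothetical and written only for this example; zero-filling and keeping the first max_atoms atoms are assumptions, not a statement of how the package implements it.

```r
# Hypothetical helper: zero-pad or truncate a square adjacency matrix
# so that it is exactly max_atoms x max_atoms.
pad_square <- function(A, max_atoms) {
  n <- min(nrow(A), max_atoms)
  out <- matrix(0, nrow = max_atoms, ncol = max_atoms)
  out[seq_len(n), seq_len(n)] <- A[seq_len(n), seq_len(n)]
  out
}

# Hypothetical helper: zero-pad or truncate the rows (atoms) of a node
# feature matrix to max_atoms, keeping every feature column.
pad_rows <- function(X, max_atoms) {
  n <- min(nrow(X), max_atoms)
  out <- matrix(0, nrow = max_atoms, ncol = ncol(X))
  out[seq_len(n), ] <- X[seq_len(n), , drop = FALSE]
  out
}

# Toy graph for the heavy atoms of ethanol (C-C-O): three atoms, two bonds.
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
# Toy one-hot node features with columns (is carbon, is oxygen).
X <- matrix(c(1, 0,
              1, 0,
              0, 1), nrow = 3, byrow = TRUE)

A_pad <- pad_square(A, max_atoms = 5)    # padded to 5 x 5
X_pad <- pad_rows(X, max_atoms = 5)      # padded to 5 x 2
A_trunc <- pad_square(A, max_atoms = 2)  # truncated to the first 2 atoms
```

Truncation discards the bonds and features of atoms beyond max_atoms, so a value at least as large as the biggest compound of interest avoids losing structure.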
Dongmin Jung
Dey, N., Wagh, S., Mahalle, P. N., & Pathan, M. S. (Eds.). (2019). Applied machine learning for smart data analysis. CRC Press.
seq_preprocessing(smiles = cbind(example_cpi[1, 1]),
                  type = "fingerprint",
                  convert_canonical_smiles = TRUE)