seq_check: Check SMILES strings and amino acid sequences


View source: R/util.R

Description

Real-world data are often incomplete and contain incorrect or missing values, so a data set may include invalid sequences. This function finds such sequences and removes them from the data. SMILES strings are checked with "webchem::is.smiles". An amino acid sequence is considered valid if it contains only capital letters of the alphabet.
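The amino acid rule described above can be illustrated with a short regular-expression check. This is a minimal sketch, not the package's actual implementation: the helper name is_valid_aaseq is hypothetical, and it assumes that "capital letters" means the ASCII letters A to Z and that empty or missing strings count as invalid.

```r
# Hypothetical helper (not part of DeepPINCS): returns TRUE for strings made up
# only of the uppercase ASCII letters A-Z, and FALSE for everything else.
is_valid_aaseq <- function(x) {
  # perl = TRUE keeps the [A-Z] range independent of the session locale.
  # Assumption: NA and empty strings are treated as invalid.
  !is.na(x) & grepl("^[A-Z]+$", x, perl = TRUE)
}

aa <- c("MKTAYIAKQR", "mktayiakqr", "MKT-AYI", "", NA)

# Logical mask marking which entries pass the check
valid <- is_valid_aaseq(aa)

# Keep only the entries that pass
aa[valid]
```

The same logical mask could be used to subset a matching outcome vector so that sequences and outcomes stay aligned, although whether seq_check does exactly this is not stated here.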

Usage

seq_check(smiles = NULL, AAseq = NULL, outcome = NULL)

Arguments

smiles

SMILES strings (default: NULL)

AAseq

amino acid sequences (default: NULL)

outcome

a variable that indicates how strongly two molecules interact with each other, or whether they interact at all (default: NULL)

Value

valid sequences

Author(s)

Dongmin Jung

References

Dey, N., Wagh, S., Mahalle, P. N., & Pathan, M. S. (Eds.). (2019). Applied machine learning for smart data analysis. CRC Press.

See Also

webchem::is.smiles

Examples

seq_check(smiles = example_cpi[1, 1], outcome = example_cpi[1, 3])

dongminjung/DeepPINCS documentation built on Dec. 20, 2021, 12:13 a.m.