get_graph_structure_node_feature: Graph structure and node features from SMILES strings

View source: R/util.R

Description

In the molecular graph representation, nodes represent atoms and edges represent bonds. Molecular features are computed with the Chemistry Development Kit (CDK), a cheminformatics toolkit. The node features of each atom are its degree in the graph, its atomic symbol, and its implicit hydrogen count.
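
As a rough illustration of these three features, the base-R sketch below builds a node feature matrix for ethanol (SMILES `CCO`) by hand. The adjacency matrix and hydrogen counts are typed in rather than computed with CDK. The helper name `toy_node_features` and the column layout (one-hot symbol, then raw degree, then raw hydrogen count) are assumptions made for illustration and may differ from the encoding the package actually uses.

```r
# Toy molecule: ethanol, heavy atoms C-C-O (values entered by hand, not from CDK)
symbols <- c("C", "C", "O")
implicit_h <- c(3, 2, 1)
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)

# Hypothetical helper: one-hot atomic symbol + degree + implicit hydrogen count
toy_node_features <- function(A, symbols, implicit_h, element_list) {
    # Rows are atoms, columns are elements; 1 where the symbol matches
    one_hot <- outer(symbols, element_list, "==") * 1
    # Degree of an atom = number of bonds = row sum of the adjacency matrix
    degree <- rowSums(A)
    X <- cbind(one_hot, degree, implicit_h)
    colnames(X) <- c(element_list, "degree", "implicit_h")
    X
}

X <- toy_node_features(A, symbols, implicit_h, element_list = c("C", "N", "O"))
print(X)
```

Each row of `X` describes one atom, so the number of rows equals the number of atoms and the number of columns plays the role of the feature dimension.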

Usage

get_graph_structure_node_feature(smiles, max_atoms,
    element_list = c(
        "C", "N", "O", "S", "F", "Si", "P", "Cl",
        "Br", "Mg", "Na", "Ca", "Fe",  "Al", "I",
        "B", "K", "Se", "Zn", "H", "Cu", "Mn"))

Arguments

smiles

SMILES strings

max_atoms

maximum number of atoms per molecule; the adjacency matrix and node features are padded or truncated to this size

element_list

list of atom symbols

Value

A_pad

a padded or truncated adjacency matrix for each SMILES string

X_pad

a padded or truncated node feature matrix for each SMILES string

feature_dim

dimension of node features

element_list

list of atom symbols
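
The padding and truncation mentioned for `A_pad` and `X_pad` can be sketched in base R as follows. This is a minimal illustration, not the package's code: the helper `pad_or_truncate` is hypothetical, and padding with zeros at the end of each dimension is an assumption.

```r
# Hypothetical helper: force a matrix to n_row x n_col by zero-padding or truncating
pad_or_truncate <- function(m, n_row, n_col) {
    out <- matrix(0, nrow = n_row, ncol = n_col)
    r <- min(nrow(m), n_row)  # rows that can be copied
    k <- min(ncol(m), n_col)  # columns that can be copied
    out[seq_len(r), seq_len(k)] <- m[seq_len(r), seq_len(k), drop = FALSE]
    out
}

# Adjacency matrix of a 3-atom chain (e.g. the heavy atoms of ethanol)
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)

A_padded <- pad_or_truncate(A, 5, 5)     # 3 atoms, padded to 5
A_truncated <- pad_or_truncate(A, 2, 2)  # 3 atoms, truncated to 2
```

Fixing every molecule to the same size in this way lets the matrices for many SMILES strings be stacked into a single array.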

Author(s)

Dongmin Jung

References

Balakin, K. V. (2009). Pharmaceutical data mining: approaches and applications for drug discovery. Wiley.

See Also

matlab::padarray, purrr::chuck, rcdk::get.adjacency.matrix, rcdk::get.atoms, rcdk::get.hydrogen.count, rcdk::get.symbol, rcdk::parse.smiles

Examples
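
A minimal usage sketch, assuming the DeepPINCS package is installed along with rcdk and a working Java setup. The call follows the signature shown under Usage. The two SMILES strings and the choice of `max_atoms = 10` are illustrative, and the sketch assumes the result is a list with the components named under Value.

```r
# Two small molecules as SMILES strings: ethanol and benzene
smiles <- c("CCO", "c1ccccc1")

# Run only when the package is available
if (requireNamespace("DeepPINCS", quietly = TRUE)) {
    res <- DeepPINCS::get_graph_structure_node_feature(smiles, max_atoms = 10)
    # Inspect the components listed in the Value section
    str(res$A_pad)
    str(res$X_pad)
    res$feature_dim
}
```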

dongminjung/DeepPINCS documentation built on Dec. 20, 2021, 12:13 a.m.