sd2gramSubtree: sd2gramSubtree - Similarity of molecules by several graph...


View source: R/sd2gramSubtree.R

Description

This tool computes several graph kernels based on the detection of common subtrees: the so-called tree-pattern graph kernels, originally introduced in (Ramon, 2003) and revisited in (Mahe, 2006).

Usage

  sd2gramSubtree(sdf, sdf2,
    kernelType = c("sizebased", "branchingbased"),
    branchKernelUntilN = FALSE, lambda = 1,
    depthMax = as.integer(3), flagRemoveH = FALSE,
    filterTottering = FALSE, morganOrder = as.integer(0),
    silentMode = FALSE, returnNormalized = TRUE,
    detectArom = FALSE)

Arguments

sdf

File containing the molecules. Must be in MDL file format (MOL and SDF files). For more information on the file format see http://en.wikipedia.org/wiki/Chemical_table_file.

sdf2

A second file containing molecules. Must also be in SDF format. If specified, the molecules of the first file are compared with the molecules of this second file. Default = "missing".

kernelType

Determines whether subtrees of the molecule are penalized by size ("sizebased") or by branching ("branchingbased"). Default = "sizebased".

branchKernelUntilN

A logical that indicates whether tree patterns up to N (the "until-N" variant of the kernel) should be considered. Default = FALSE.

lambda

Weights the contribution of tree patterns depending on their sizes. Default = 1.

depthMax

Maximum depth of the tree patterns. Default = 3.

flagRemoveH

A logical that indicates whether H-atoms should be removed or not. Default = FALSE.

filterTottering

A logical that indicates whether tottering walks should be removed. Default = FALSE.

morganOrder

The order of the Morgan indices to be used. If set to zero, no Morgan indices are used. The higher the order, the more distinct atom types exist and, consequently, the more dissimilar the molecules will be. Default = 0.

silentMode

Whether the program should print progress reports to the standard output. Default = FALSE.

returnNormalized

A logical specifying whether a normalized kernel matrix should be returned. Default = TRUE.

detectArom

Whether aromatic rings should be detected and aromatic bonds should be treated as a special bond type. If the data set contains large molecules, the detection of aromatic rings can be very time-consuming. Default = FALSE.
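
To make the effect of morganOrder more concrete, here is a minimal sketch of the usual Morgan iteration in base R. It assumes that every atom starts with index 1 and that each iteration replaces an atom's index with the sum of its neighbours' indices; whether Rchemcpp computes its indices in exactly this way is an assumption. The helper morgan_indices and the toy adjacency matrix are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of Rchemcpp): Morgan indices of a given order,
# computed from the adjacency matrix of a molecular graph.
morgan_indices <- function(adj, order) {
  idx <- rep(1, nrow(adj))           # order 0: every atom has index 1
  for (i in seq_len(order)) {
    idx <- as.vector(adj %*% idx)    # new index = sum of the neighbours' indices
  }
  idx
}

# Toy graph: carbon skeleton of 2-methylbutane (bonds 1-2, 2-3, 3-4, 2-5)
adj <- matrix(0, 5, 5)
bonds <- rbind(c(1, 2), c(2, 3), c(3, 4), c(2, 5))
adj[bonds] <- 1
adj[bonds[, 2:1]] <- 1               # make the adjacency matrix symmetric

for (k in 0:2) {
  cat("order", k, ":", morgan_indices(adj, k), "\n")
}
```

At order 0 all atoms share the same index, while at order 1 the indices equal the atom degrees, so atoms of the same element can already fall into different classes. This is why higher orders tend to make molecules look more dissimilar.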

Value

A numeric matrix containing the similarity values between the molecules.
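
As an illustration of what a normalized kernel matrix looks like, the sketch below applies the common cosine normalization K[i, j] / sqrt(K[i, i] * K[j, j]) to a small made-up matrix. That returnNormalized = TRUE uses exactly this formula is an assumption; the helper normalize_kernel and the values in K_raw are hypothetical and only serve as an example.

```r
# Hypothetical helper (not part of Rchemcpp): cosine-normalize a kernel matrix
normalize_kernel <- function(K) {
  d <- sqrt(diag(K))      # square roots of the self-similarities
  K / outer(d, d)         # divide each entry by sqrt(K[i, i] * K[j, j])
}

# Made-up, symmetric kernel matrix for three molecules
K_raw <- matrix(c(4, 2,  1,
                  2, 9,  3,
                  1, 3, 16), nrow = 3, byrow = TRUE)

K_norm <- normalize_kernel(K_raw)
print(round(K_norm, 3))
```

After this normalization every molecule has similarity 1 with itself, which makes values comparable across molecules of different sizes.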

Author(s)

Michael Mahr <rchemcpp@bioinf.jku.at>; C++ function written by Jean-Luc Perret and Pierre Mahe

References

(Mahe, 2006): P. Mahe and J.-P. Vert. Graph kernels based on tree patterns for molecules. Technical Report, HAL:ccsd-00095488, Ecoles des Mines de Paris, September 2006.

(Ramon, 2003): J. Ramon and T. Gaertner. Expressivity versus efficiency of graph kernels. In T. Washio and L. De Raedt, editors, Proceedings of the First International Workshop on Mining Graphs, Trees and Sequences, pages 65-74, 2003.

Examples

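library(Rchemcpp)
# Locate the example SD files shipped with the package
sdfolder <- system.file("extdata", package = "Rchemcpp")
sdf <- list.files(sdfolder, full.names = TRUE, pattern = "small")
# Compute the tree-pattern kernel matrix with the default settings
K <- sd2gramSubtree(sdf)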
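
The non-default arguments can be combined in the same way. The following sketch uses only argument names from the Usage section; the particular values are arbitrary choices for illustration, and the call is skipped when Rchemcpp is not installed.

```r
# Sketch: branching-based penalty, deeper tree patterns, hydrogens removed.
# The argument values below are illustrative, not recommended settings.
K_branch <- NULL
if (requireNamespace("Rchemcpp", quietly = TRUE)) {
  sdfolder <- system.file("extdata", package = "Rchemcpp")
  sdf <- list.files(sdfolder, full.names = TRUE, pattern = "small")
  K_branch <- Rchemcpp::sd2gramSubtree(sdf,
    kernelType = "branchingbased",   # penalize by branching instead of size
    depthMax = as.integer(4),        # depth setting for the tree patterns
    flagRemoveH = TRUE,              # drop H-atoms before comparing
    silentMode = TRUE)
  print(dim(K_branch))               # dimensions of the similarity matrix
}
```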

Rchemcpp documentation built on May 6, 2019, 4:58 a.m.