Rchemcpp-package: Rchemcpp provides tools for comparing chemical compounds


Description

Compares sets of chemical compounds given as SD/SDF/MOL or KCF files and returns their pairwise similarities as a matrix (Gram matrix). It uses the compiled-in C++ library "chemcpp" to emulate the five chemcpp tools "sd2gram", "sd2gram3Dspectrum", "sd2gramSubtree", "sd2gram3Dpharma" and "sd2gramSpectrum". These tools are made accessible as R functions of the same names.

Details

Package: Rchemcpp
Type: Package
Version: 1.1.1
Date: 2013-07-03
License: GPL2.1

Author(s)

Michael Mahr and Guenter Klambauer

References

(Kashima, 2004) – H. Kashima, K. Tsuda, and A. Inokuchi. Kernels for graphs. In B. Schoelkopf, K. Tsuda, and J.P. Vert, editors, Kernel Methods in Computational Biology, pages 155-170. MIT Press, 2004.

(Mahe, 2005) – P. Mahe, N. Ueda, T. Akutsu, J.-L. Perret, and J.-P. Vert. Graph kernels for molecular structure-activity relationship analysis with support vector machines. J Chem Inf Model, 45(4):939-51, 2005.

(Ralaivola, 2005) – L. Ralaivola, S. J. Swamidass, H. Saigo, and P. Baldi. Graph kernels for chemical informatics. Neural Netw., 18(8):1093-1110, Sep 2005.

(Gaertner, 2003) – T. Gaertner, P. Flach, and S. Wrobel. On graph kernels: hardness results and efficient alternatives. In B. Schoelkopf and M. Warmuth, editors, Proceedings of the Sixteenth Annual Conference on Computational Learning Theory and the Seventh Annual Workshop on Kernel Machines, volume 2777 of Lecture Notes in Computer Science, pages 129-143, Heidelberg, 2003.

(Mahe, 2006a) – P. Mahe and J.-P. Vert. Graph kernels based on tree patterns for molecules. Technical Report, HAL:ccsd-00095488, Ecole des Mines de Paris, September 2006.

(Mahe, 2006b) – P. Mahe, L. Ralaivola, V. Stoven, and J.-P. Vert. The pharmacophore kernel for virtual screening with support vector machines. Technical Report, HAL:ccsd-00020066, Ecole des Mines de Paris, March 2006.

(Leslie, 2002) – C. Leslie, E. Eskin, and W.S. Noble. The spectrum kernel: a string kernel for SVM protein classification. In Russ B. Altman, A. Keith Dunker, Lawrence Hunter, Kevin Lauerdale, and Teri E. Klein, editors, Proceedings of the Pacific Symposium on Biocomputing 2002, pages 564-575. World Scientific, 2002.

(Ramon, 2003) – J. Ramon and T. Gaertner. Expressivity versus efficiency of graph kernels. In T. Washio and L. De Raedt, editors, Proceedings of the First International Workshop on Mining Graphs, Trees and Sequences, pages 65-74, 2003.

See Also

sd2gram sd2gram3Dpharma sd2gramSpectrum sd2gram3Dspectrum sd2gramSubtree

Examples

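library(Rchemcpp)

# Locate the example SD files shipped with the package
sdfolder <- system.file("extdata", package="Rchemcpp")

# Kernels on the "small" example data (30 molecules in the output below)
sdf <- list.files(sdfolder, full.names=TRUE, pattern="small")
K1 <- sd2gram(sdf)
K2 <- sd2gramSpectrum(sdf)
K3 <- sd2gramSubtree(sdf)

# 3D kernels on the "tiny" example data (5 molecules in the output below)
sdf_tiny <- list.files(sdfolder, full.names=TRUE, pattern="tiny")
K4 <- sd2gram3Dspectrum(sdf_tiny)
K5 <- sd2gram3Dpharma(sdf_tiny)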

Example output

[1] "reading file"
[1] "reading file done"
[1] "setting morgan labels"
[1] "using moleculeKernel Kashima"
calculating the Kashima gram matrix for 30 x 30 molecules
Using a single thread
calculating 30 x 30
0 / 30
1 / 30
2 / 30
3 / 30
4 / 30
5 / 30
6 / 30
7 / 30
8 / 30
9 / 30
10 / 30
11 / 30
12 / 30
13 / 30
14 / 30
15 / 30
16 / 30
17 / 30
18 / 30
19 / 30
20 / 30
21 / 30
22 / 30
23 / 30
24 / 30
25 / 30
26 / 30
27 / 30
28 / 30
29 / 30

Report:
0 molecules pairs could not be distinguished using the graph kernel
0 of them had a different biological activity
0 of them had unknown biological activity

0 molecules pairs were orthogonal
0 of them had a different biological activity
0 of them had unknown biological activity
[1] "end"
[1] "setting morgan labels  0"
[1] "#### initialization Gram OK"
atom type no 1 ; atomic number =  C 
atom type no 2 ; atomic number =  N 
atom type no 3 ; atomic number =  O 
atom type no 4 ; atomic number =  F 
atom type no 5 ; atomic number =  I 
atom type no 6 ; atomic number =  Cl
bond type no 1 ; bond type = 4
bond type no 2 ; bond type = 1
bond type no 3 ; bond type = 2
bond type no 4 ; bond type = 3
 	 finding paths starting from atoms labeled =  C 
 	 finding paths starting from atoms labeled =  N 
 	 finding paths starting from atoms labeled =  O 
 	 finding paths starting from atoms labeled =  F 
 	 finding paths starting from atoms labeled =  I 
 	 finding paths starting from atoms labeled =  Cl
gramComputeSpectrum (self) OK
[1] "gramComputeSpectrum (self) OK"
[1] "normalize gram (self) OK"
[1] "setting morgan labels  0"
[1] "initialization Gram OK"
Subtree-kernel computation:
	- depthMax = 3
	- lambda = 1
	- with-totters

		-molecule no 1/30
		-molecule no 2/30
		-molecule no 3/30
		-molecule no 4/30
		-molecule no 5/30
		-molecule no 6/30
		-molecule no 7/30
		-molecule no 8/30
		-molecule no 9/30
		-molecule no 10/30
		-molecule no 11/30
		-molecule no 12/30
		-molecule no 13/30
		-molecule no 14/30
		-molecule no 15/30
		-molecule no 16/30
		-molecule no 17/30
		-molecule no 18/30
		-molecule no 19/30
		-molecule no 20/30
		-molecule no 21/30
		-molecule no 22/30
		-molecule no 23/30
		-molecule no 24/30
		-molecule no 25/30
		-molecule no 26/30
		-molecule no 27/30
		-molecule no 28/30
		-molecule no 29/30
		-molecule no 30/30
[1] "compute gram OK"
[1] "normalize gram (self) OK"
[1] "setting morgan labels  0"
[1] "#### initialization Gram OK"
atom type no 1 ; atomic number =  C 
atom type no 2 ; atomic number =  N 
atom type no 3 ; atomic number =  O 
atom type no 4 ; atomic number =  F 
 - distMin = 0
 - distMax = 20
 - nBins = 20
   --> binSize = 1.0001
 	 finding paths starting from atoms labeled =  C 
 	 finding paths starting from atoms labeled =  N 
 	 finding paths starting from atoms labeled =  O 
 	 finding paths starting from atoms labeled =  F 
gramComputeSpectrum (self) OK
[1] "gramComputeSpectrum (self) OK"
[1] "normalize gram (self) OK"
[1] "setting morgan labels  0"
[1] "setting morgan labels OK"
calculating 5 x 5
 i = 1 / 5
 i = 2 / 5
 i = 3 / 5
 i = 4 / 5
 i = 5 / 5
[1] "gram matrix computation OK"
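
Once a Gram matrix has been returned, it can be handled like any other R matrix. The following is a minimal sketch in base R of one possible downstream use: converting similarities to kernel-induced distances and clustering the compounds. The matrix K below is a made-up 3 x 3 stand-in with hypothetical row names, not actual output of sd2gram; it is assumed to be symmetric and normalized to a unit diagonal, which may not hold for every kernel or parameter setting.

```r
# Hypothetical normalized Gram matrix standing in for the result of sd2gram().
# The values and the names mol1..mol3 are invented for illustration only.
K <- matrix(c(1.0, 0.8, 0.2,
              0.8, 1.0, 0.3,
              0.2, 0.3, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("mol1", "mol2", "mol3"),
                            c("mol1", "mol2", "mol3")))

# Sanity checks: symmetric, and unit diagonal (assumed normalization).
stopifnot(isSymmetric(K), all(abs(diag(K) - 1) < 1e-12))

# Kernel-induced squared distance: d(x, y)^2 = k(x, x) + k(y, y) - 2 k(x, y).
D2 <- outer(diag(K), diag(K), "+") - 2 * K
D <- sqrt(pmax(D2, 0))  # clamp small negative values caused by rounding

# Average-linkage hierarchical clustering on the distances, cut into 2 groups.
hc <- hclust(as.dist(D), method = "average")
clusters <- cutree(hc, k = 2)
print(clusters)
```

With a real result, K would be replaced by the matrix returned from one of the sd2gram functions; the choice of average linkage and of two clusters is arbitrary here.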

Rchemcpp documentation built on May 6, 2019, 4:58 a.m.