Introduction

This software package is the customization and query interface for the annotation SQLite database provided by the corresponding compoundCollectionData package. It provides utilities to query compound annotations from the DrugAge, DrugBank, CMAP02, and LINCS resources using the ChEMBL IDs of the query compounds. It also supports adding custom compound annotations to the annotation SQLite database.

Installation and Loading

As a Bioconductor package, customCompoundDB can be installed with the BiocManager::install() function.

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("customCompoundDB")  # Installs from Bioconductor
# Alternatively, install from GitHub
BiocManager::install("yduan004/customCompoundDB", build_vignettes=TRUE)

Next, the package needs to be loaded into the user's R session.

library(customCompoundDB)
library(help = "customCompoundDB")  # Lists package info
vignette("customCompoundDB")  # Opens vignette

Annotation Database

The helper package compoundCollectionData provides access to the pre-built SQLite database that is stored on Bioconductor's AnnotationHub. Users can download this database and get its path as follows.

library(AnnotationHub)
ah <- AnnotationHub()
annotdb <- ah[["AH79563"]]

Custom Annotations

Add Custom Annotation

The following shows how to add a user's custom compound annotation table to the annotation SQLite database from the compoundCollectionData package. To do so, users need to know the ChEMBL IDs of the compounds being added, and the annotation table must contain a chembl_id column.

chembl_id <- c("CHEMBL1000309", "CHEMBL100014", "CHEMBL10",
               "CHEMBL100", "CHEMBL1000", NA)
annot_tb <- data.frame(compound_name=paste0("name", 1:6),
        chembl_id=chembl_id,
        feature1=paste0("f", 1:6),
        feature2=rnorm(6))
addCustomAnnot(annot_tb, annot_name="mycustom")

annot_tb is an R data.frame object representing the custom annotation table. Note that it should contain a column named chembl_id holding the ChEMBL IDs of the added compounds. annot_name is a user-defined name for the annotation table.
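
Because the table must carry a chembl_id column, it can help to check it before calling addCustomAnnot(). The following is a minimal sketch in base R, not part of customCompoundDB: check_annot_tb is a hypothetical helper name, and dropping rows with a missing ID is an assumption made here for illustration. How the package itself treats missing IDs is not stated in this vignette.

```r
# Hypothetical helper (NOT part of customCompoundDB): sanity-checks a custom
# annotation table before it is passed to addCustomAnnot().
check_annot_tb <- function(annot_tb) {
    if (!is.data.frame(annot_tb))
        stop("annot_tb must be a data.frame")
    if (!"chembl_id" %in% colnames(annot_tb))
        stop("annot_tb must contain a 'chembl_id' column")
    # Coerce to character so the check also works if the column is a factor
    ids <- as.character(annot_tb$chembl_id)
    # Rows whose ChEMBL ID is NA or an empty string
    missing_id <- is.na(ids) | !nzchar(ids)
    if (any(missing_id))
        message(sum(missing_id), " row(s) without a ChEMBL ID were dropped")
    # Assumption for this sketch: rows without an ID are simply removed
    annot_tb[!missing_id, , drop = FALSE]
}

chembl_id <- c("CHEMBL1000309", "CHEMBL100014", "CHEMBL10",
               "CHEMBL100", "CHEMBL1000", NA)
annot_tb <- data.frame(compound_name = paste0("name", 1:6),
                       chembl_id = chembl_id,
                       feature1 = paste0("f", 1:6),
                       feature2 = rnorm(6))
annot_clean <- check_annot_tb(annot_tb)
nrow(annot_clean)
```

The cleaned table could then be passed on in place of the original, for example addCustomAnnot(annot_clean, annot_name="mycustom").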

Delete

The following shows the R code used to delete a custom annotation resource by providing its name.

deleteAnnot("mycustom")

List Existing Annotations

The following function lists the available annotation resources in the SQLite annotation database.

listAnnot()

Set to Default

The following function sets the annotation SQLite database to the default one by deleting the existing one and re-downloading from AnnotationHub.

defaultAnnot()

Query Annotation DB

The following function can be used to query compound annotations from the default resources as well as the custom resources stored in the SQLite annotation database. The default annotation resources are DrugAge, DrugBank, CMAP02 and LINCS. A detailed description of this SQLite database is available in the vignette of the compoundCollectionData package. Users' custom compound annotations can be added or deleted as described above.

The input of the query function is a set of ChEMBL IDs. It returns a data.frame storing the annotations of the input compounds from the annotation resources selected with the annot argument.

query_id <- c("CHEMBL1064", "CHEMBL10", "CHEMBL113", "CHEMBL1004", "CHEMBL31574")
annot_res <- queryAnnotDB(query_id, annot=c("DrugAge", "LINCS"))
annot_res
# query added custom annotation
annot_res2 <- queryAnnotDB(query_id, annot=c("LINCS", "mycustom"))
annot_res2
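
To illustrate the idea of combining several annotation resources for one set of ChEMBL IDs, the sketch below performs a left join on chembl_id in base R. It is only a conceptual illustration and not the implementation of queryAnnotDB: the tables lincs_tb and custom_tb, their columns, and all values are made up, and whether the real function keeps a row for IDs without a match is an assumption here.

```r
# Toy stand-ins for two annotation resources. All names and values are
# invented for illustration and are not taken from the real database.
lincs_tb <- data.frame(chembl_id = c("CHEMBL10", "CHEMBL113"),
                       lincs_name = c("cmpA", "cmpB"),
                       stringsAsFactors = FALSE)
custom_tb <- data.frame(chembl_id = c("CHEMBL10", "CHEMBL1004"),
                        feature1 = c("f3", "f9"),
                        stringsAsFactors = FALSE)
query_id <- c("CHEMBL1064", "CHEMBL10", "CHEMBL113")

# Start from the query IDs so that every queried compound keeps a row,
# then left-join each resource on the shared chembl_id column.
res <- data.frame(chembl_id = query_id, stringsAsFactors = FALSE)
for (tb in list(lincs_tb, custom_tb)) {
    res <- merge(res, tb, by = "chembl_id", all.x = TRUE)
}
res
```

In this sketch, compounds that are absent from a resource get NA in that resource's columns, and compounds that were not queried (CHEMBL1004 here) do not appear in the result.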


yduan004/compoundCollection documentation built on Sept. 20, 2020, 5:59 a.m.