idconv is an R package of wrapper functions to convert gene identifiers available from the org.Hs.eg.db or org.Mm.eg.db Bioconductor packages.
# install Bioconductor dependencies
# (biocLite() is retired; BiocManager is the current Bioconductor installer)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
for (pkg in c("AnnotationDbi", "org.Hs.eg.db", "org.Mm.eg.db")) {
  if (!requireNamespace(pkg, quietly = TRUE)) BiocManager::install(pkg)
}
# install idconv
devtools::install_github("uhlitz/idconv")
You can choose from the following predefined functions: SYMBOL_to_ENTREZID(), ENTREZID_to_SYMBOL(), SYMBOL_to_ENSEMBL(), ENSEMBL_to_SYMBOL(), SYMBOL_to_REFSEQ() and REFSEQ_to_SYMBOL().
library(idconv)
SYMBOL_to_ENTREZID(c("EGR1", "FOS"))
## 'select()' returned 1:1 mapping between keys and columns
## EGR1 FOS
## "1958" "2353"
ENTREZID_to_SYMBOL(c("1958", "2353"))
## 'select()' returned 1:1 mapping between keys and columns
## 1958 2353
## "EGR1" "FOS"
SYMBOL_to_ENSEMBL(c("EGR1", "FOS"))
## 'select()' returned 1:1 mapping between keys and columns
## EGR1 FOS
## "ENSG00000120738" "ENSG00000170345"
ENSEMBL_to_SYMBOL(c("ENSG00000120738", "ENSG00000170345"))
## 'select()' returned 1:1 mapping between keys and columns
## ENSG00000120738 ENSG00000170345
## "EGR1" "FOS"
You can use the generic function to map other identifiers:
IDX_to_IDY(ids = "NM_005252", from = "REFSEQ", to = "ENSEMBL")
## 'select()' returned 1:1 mapping between keys and columns
## NM_005252
## "ENSG00000170345"
Or simply define your own custom conversion function:
REFSEQ_to_ENSEMBL <- function(ids) IDX_to_IDY(ids = ids, from = "REFSEQ", to = "ENSEMBL")
REFSEQ_to_ENSEMBL("NM_005252")
## 'select()' returned 1:1 mapping between keys and columns
## NM_005252
## "ENSG00000170345"
If more than one mapping is available, AnnotationDbi reports this (1:many mapping) and the idconv wrapper functions return the target IDs concatenated with semicolons by default (force_unique = FALSE). If desired, the wrapper functions can be forced to return unique mappings: force_unique = NA returns NA for ambiguous IDs, and force_unique = TRUE returns a single target ID per input. Forcing unique mappings is, however, not recommended.
SYMBOL_to_ENSEMBL("IER3", force_unique = FALSE) # default
## 'select()' returned 1:many mapping between keys and columns
## IER3
## "ENSG00000137331;ENSG00000237155;ENSG00000235030;ENSG00000227231;ENSG00000230128;ENSG00000206478"
SYMBOL_to_ENSEMBL("IER3", force_unique = NA)
## 'select()' returned 1:many mapping between keys and columns
## IER3
## NA
SYMBOL_to_ENSEMBL("IER3", force_unique = TRUE)
## 'select()' returned 1:many mapping between keys and columns
## IER3
## "ENSG00000137331"
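The three force_unique settings shown above can be mimicked in base R on a toy table. This is only a sketch of the observable behaviour, not idconv's actual implementation: collapse_targets() is a hypothetical helper, the mapping table is hand-written from IDs in the examples above, and returning the first ID for force_unique = TRUE is an assumption based on the IER3 output.

```r
# Toy 1:many mapping table; IDs are copied from the examples above.
mapping <- data.frame(
  SYMBOL  = c("EGR1", "IER3", "IER3", "IER3"),
  ENSEMBL = c("ENSG00000120738", "ENSG00000137331",
              "ENSG00000237155", "ENSG00000235030"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of idconv): one result per source ID.
collapse_targets <- function(from, to, force_unique = FALSE) {
  # Group targets by source ID, keeping the input order of the source IDs.
  groups <- split(to, factor(from, levels = unique(from)))
  vapply(groups, function(x) {
    x <- unique(x)
    if (length(x) == 1L) return(x)                  # unambiguous: keep as-is
    if (is.na(force_unique)) return(NA_character_)  # ambiguous: drop
    if (force_unique) return(x[1])                  # ambiguous: keep first (assumed)
    paste(x, collapse = ";")                        # default: concatenate
  }, character(1))
}

res_default <- collapse_targets(mapping$SYMBOL, mapping$ENSEMBL)
res_na      <- collapse_targets(mapping$SYMBOL, mapping$ENSEMBL, force_unique = NA)
res_first   <- collapse_targets(mapping$SYMBOL, mapping$ENSEMBL, force_unique = TRUE)
```

The sketch also shows why forcing uniqueness loses information: res_na and res_first both discard target IDs that res_default keeps.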
When converting to RefSeq, a subtype for target IDs can be specified, e.g. NM or NP:
SYMBOL_to_REFSEQ("IER3", to_sub = "NM")
## 'select()' returned 1:many mapping between keys and columns
## IER3
## "NM_003897;NM_052815"
SYMBOL_to_REFSEQ("IER3", to_sub = "NP")
## 'select()' returned 1:many mapping between keys and columns
## IER3
## "NP_003888"
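One plausible way to think about to_sub is as a prefix filter on the returned RefSeq accessions. The base R sketch below assumes exactly that; filter_refseq() is a hypothetical helper and may not match how idconv implements the argument. The IDs are taken from the IER3 output above.

```r
# Hypothetical helper (not part of idconv): keep RefSeq IDs of one subtype.
filter_refseq <- function(ids, to_sub = NULL) {
  if (is.null(to_sub)) return(ids)           # no subtype requested: keep all
  ids[startsWith(ids, paste0(to_sub, "_"))]  # e.g. "NM" keeps "NM_..." IDs
}

refseq_ids <- c("NM_003897", "NM_052815", "NP_003888")

nm_ids <- filter_refseq(refseq_ids, to_sub = "NM")
np_ids <- filter_refseq(refseq_ids, to_sub = "NP")

# Concatenate as in the default 1:many output.
nm_joined <- paste(nm_ids, collapse = ";")
```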