knitr::opts_chunk$set(echo = TRUE) library(knitr) library(badger) library(magrittr) devtools::load_all()
r badge_cran_release('homologene',color = '#32BD36')
r badge_devel("oganm/homologene", "blue")
An r package that works as a wrapper to homologene
Available species are
homologene::taxData
install.packages('homologene')
or
devtools::install_github('oganm/homologene')
Basic homologene function requires a list of gene symbols or NCBI ids, and an inTax
and an outTax
. In this example, inTax
is the taxon id of mus musculus while outTax
is for humans.
homologene(c('Eno2','Mog'), inTax = 10090, outTax = 9606) homologene(c('Eno2','17441'), inTax = 10090, outTax = 9606)
For mouse and humans two convenience functions exist that removes the need to provide taxonomic identifiers. Note that the column names are not the same as the homologene
output.
mouse2human(c('Eno2','Mog')) human2mouse(c('ENO2','MOG','GZMH'))
Original homologene database has not been updated since 2014. This package also includes an updated version of the homologene database that replaces gene symbols and identifiers with the their latest version. For the procedure followed for updating, see this blog post and/or see the processing code.
Using the updated version can help you match genes that cannot matched due to out of date annotations.
mouse2human(c('Mesd', 'Trp53rka', 'Cstdc4', 'Ifit3b')) mouse2human(c('Mesd', 'Trp53rka', 'Cstdc4', 'Ifit3b'), db = homologeneData2)
The homologeneData2
object that comes with the GitHub version of this package
is updated weekly but if you are using the CRAN version and want the latest
annotations, or if you want to keep
a frozen version homologene, you can use the updateHomologene
function.
homologeneDataVeryNew = updateHomologene() # update the homologene database with the latest identifiers mouse2human(c('Mesd', 'Trp53rka', 'Cstdc4', 'Ifit3b'), db = homologeneDataVeryNew)
The package also includes functions that were used to create the homologeneData2
, for updating outdated gene symbols and identifiers.
library(dplyr) gene_history = getGeneHistory() oldIds = c(4340964, 4349034, 4332470, 4334151, 4323831) newIds = updateIDs(oldIds,gene_history) print(newIds) # get the latest gene symbols for the ids gene_info = getGeneInfo() gene_info %>% dplyr::filter(GeneID %in% as.integer(newIds)) # faster to match integers
Instead of using just homologene, one can also make queries into the DIOPT database. Diopt uses multiple databases
to find gene homolog/orthologues. Note that this function has a delay
parameter
that is set to 10 seconds by default. This was done to obey the robots.txt
of their website.
diopt(c('GZMH'),inTax = 9606, outTax = 10090) %>% knitr::kable() diopt(c('Eno2','Mog'),inTax = 10090, outTax =9606) %>% knitr::kable()
As of version version 1.1.68, the output now includes NCBI ids. Since it doesn't change any of the existing column names or their order, this shouldn't cause problems in most use cases.
If a you can't find a gene you are looking for it may have synonyms. See geneSynonym package to find them. If you have other problems open an issue or send a mail.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.