pkg <- read.dcf("DESCRIPTION", fields = "Package")[1]
title <- read.dcf("DESCRIPTION", fields = "Title")[1]
description <- read.dcf("DESCRIPTION", fields = "Description")[1]
URL <- read.dcf('DESCRIPTION', fields = 'URL')[1]
owner <- tolower(strsplit(URL,"/")[[1]][4])

Intro

r description

In brief, orthogene lets you easily:

Citation

If you use orthogene, please cite:

r citation(pkg)$textVersion

Documentation website

PDF manual

Installation

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# orthogene is only available on Bioconductor>=3.14
if(BiocManager::version()<"3.14") BiocManager::install(update = TRUE, ask = FALSE)

BiocManager::install("orthogene")

Docker

orthogene can also be installed via a Docker or Singularity container with RStudio pre-installed. Further instructions are provided here.

Methods

library(orthogene)

data("exp_mouse")
# Setting to "homologene" for the purposes of quick demonstration.
# We generally recommend using method="gprofiler" (default).
method <- "homologene"  

For most functions, orthogene lets users choose between three methods, each with complementary strengths and weaknesses: "gprofiler", "homologene", and "babelgene".

In general, we recommend you use "gprofiler" when possible, as it tends to be more comprehensive.

While "babelgene" covers fewer species, it queries a wide variety of orthology databases and can return a "support_n" column that tells you how many databases support each ortholog gene mapping. This can be helpful when you need a semi-quantitative measure of mapping quality.
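As a rough illustration of how a "support_n" column could be used, the sketch below filters a toy mapping table in base R. The table, its gene names, the other column names, and the `min_support` threshold are all invented for the example; the actual columns returned by orthogene with method="babelgene" may differ.

```r
# Toy mapping table. Every value here is a placeholder, not real output.
orthologs <- data.frame(
    input_gene    = c("geneA", "geneB", "geneC", "geneD"),
    ortholog_gene = c("GENEA", "GENEB", "GENEC", "GENED"),
    support_n     = c(12L, 11L, 9L, 2L),
    stringsAsFactors = FALSE
)

# Hypothetical threshold: keep mappings backed by at least 3 databases.
min_support <- 3L

# Subset to the rows that meet the threshold.
well_supported <- orthologs[orthologs$support_n >= min_support, ]
print(well_supported)
```

A stricter threshold trades coverage for confidence, so the right value depends on how many genes you can afford to lose.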

It's also worth noting that for smaller gene sets, the speed difference between these methods becomes negligible.

pros_cons <- data.frame(
    gprofiler=c("Reference organisms"="700+",
                "Gene mappings"="More comprehensive",
                "Updates"="Frequent", 
                "Orthology databases"=paste("Ensembl",
                                            "HomoloGene",
                                            "WormBase",sep = ", "),
                "Data location"="Remote",
                "Internet connection"="Required",
                "Speed"="Slower"),

   homologene=c("Reference organisms"="20+", 
                "Gene mappings"="Less comprehensive",
                "Updates"="Less frequent",
                "Orthology databases"="HomoloGene",
                "Data location"="Local",
                "Internet connection"="Not required",
                "Speed"="Faster"),

    babelgene=c("Reference organisms"="19 (but cannot convert between pairs of non-human species)", 
                "Gene mappings"="More comprehensive",
                "Updates"="Less frequent",
                "Orthology databases"="HGNC Comparison of Orthology Predictions (HCOP), which includes predictions from eggNOG, Ensembl Compara, HGNC, HomoloGene, Inparanoid, NCBI Gene Orthology, OMA, OrthoDB, OrthoMCL, Panther, PhylomeDB, TreeFam and ZFIN",
                "Data location"="Local",
                "Internet connection"="Not required",
                "Speed"="Medium")
           )
knitr::kable(pros_cons)

Quick example

Convert orthologs

convert_orthologs is very flexible with what users can supply as gene_df, and can take a data.frame/data.table/tibble, (sparse) matrix, or list/vector containing genes.

Genes, transcripts, proteins, SNPs, or genomic ranges will be recognised in most formats (HGNC, Ensembl, RefSeq, UniProt, etc.) and can even be a mixture of different formats.

All genes will be mapped to gene symbols, unless specified otherwise with the ... arguments (see ?orthogene::convert_orthologs or here for details).
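To make the accepted input shapes more concrete, here is a minimal sketch that stores the same three mouse gene symbols in a data.frame and in a matrix, then passes each to convert_orthologs. The `score` column and the sample names `s1`/`s2` are made up, and using gene_input = "gene" to point at a column by name is an assumption based on the argument descriptions; check ?orthogene::convert_orthologs before relying on it. The orthogene calls only run if the package is installed.

```r
genes <- c("Snap25", "Gfap", "Aif1")  # example mouse gene symbols

# Same genes, two containers.
as_df <- data.frame(gene = genes,
                    score = c(1.2, 0.4, 2.1),   # made-up values
                    stringsAsFactors = FALSE)
as_matrix <- matrix(1:6, nrow = 3,
                    dimnames = list(genes, c("s1", "s2")))  # made-up samples

# Guarded so the sketch still runs when orthogene is not installed.
if (requireNamespace("orthogene", quietly = TRUE)) {
    # Genes stored in a named column (assumed usage of gene_input).
    from_df <- orthogene::convert_orthologs(gene_df = as_df,
                                            gene_input = "gene",
                                            input_species = "mouse",
                                            output_species = "human",
                                            method = "homologene")
    # Genes stored as row names, mirroring the example in this README.
    from_matrix <- orthogene::convert_orthologs(gene_df = as_matrix,
                                                gene_input = "rownames",
                                                gene_output = "rownames",
                                                input_species = "mouse",
                                                output_species = "human",
                                                method = "homologene")
}
```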

Note on non-1:1 orthologs

A key feature of convert_orthologs is that it handles genes that do not map 1:1 across species (one-to-many, many-to-one, or many-to-many). Such mappings can arise from evolutionary divergence, and the function of these genes tends to be less conserved and less translatable. Users can choose how to handle them via the non121_strategy= argument.

gene_df <- orthogene::convert_orthologs(gene_df = exp_mouse,
                                        gene_input = "rownames", 
                                        gene_output = "rownames", 
                                        input_species = "mouse",
                                        output_species = "human",
                                        non121_strategy = "drop_both_species",
                                        method = method) 

knitr::kable(as.matrix(head(gene_df)))
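To show conceptually what dropping non-1:1 orthologs in both species means, the base R sketch below applies the idea to a toy mapping table. It illustrates the concept only and is not orthogene's internal implementation; the gene names and column names are placeholders.

```r
# Toy input-to-output mapping with one row per unique gene pair.
map <- data.frame(
    input_gene    = c("A1", "B1", "B2", "C1", "C1", "D1"),
    ortholog_gene = c("A",  "B",  "B",  "C",  "C2", "D"),
    stringsAsFactors = FALSE
)

# Input genes that map to more than one output gene (here: C1).
dup_input <- unique(map$input_gene[duplicated(map$input_gene)])

# Output genes that receive more than one input gene (here: B).
dup_output <- unique(map$ortholog_gene[duplicated(map$ortholog_gene)])

# Keep only rows where neither side is ambiguous.
one2one <- map[!(map$input_gene %in% dup_input) &
               !(map$ortholog_gene %in% dup_output), ]
print(one2one)
```

Only the A1/A and D1/D pairs survive, because every other gene takes part in a one-to-many or many-to-one mapping.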

convert_orthologs is just one of the many useful functions in orthogene. Please see the documentation website for the full vignette.

Additional resources

Hex sticker creation

Benchmarking methods

Session Info

utils::sessionInfo()

Related projects

Tools

Databases

Contact

Neurogenomics Lab

UK Dementia Research Institute
Department of Brain Sciences
Faculty of Medicine
Imperial College London
GitHub
DockerHub




neurogenomics/orthogene documentation built on Jan. 30, 2024, 4:44 a.m.