tr2g_TxDb: Get transcript and gene info from TxDb objects

View source: R/tr2g.R

Description

The genome and gene annotations of some species can be obtained from Bioconductor packages, which is more convenient than downloading GTF files from Ensembl and reading them into R. In these packages, the gene annotation is stored in a TxDb object, which uses standardized names for gene IDs, transcript IDs, exon IDs, and so on. In GTF and GFF3 files, the same information is stored in metadata fields whose names are not standardized. This function extracts transcripts and their corresponding genes from the gene annotation stored in a TxDb object.
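To illustrate what the standardized names in a TxDb object look like, the sketch below queries transcript and gene IDs directly through the generic AnnotationDbi interface. It is only an illustration of the kind of lookup involved, not necessarily how tr2g_TxDb is implemented; the variable names txdb, tx_keys, and tr2g_sketch are made up for this example.

```r
library(AnnotationDbi)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# Standardized field names available in any TxDb object,
# such as "TXID", "TXNAME", and "GENEID"
columns(txdb)

# Use the internal transcript IDs as keys, then look up the
# transcript name and gene ID that belong to each of them
tx_keys <- keys(txdb, keytype = "TXID")
tr2g_sketch <- AnnotationDbi::select(
  txdb,
  keys = tx_keys,
  columns = c("TXNAME", "GENEID"),
  keytype = "TXID"
)
head(tr2g_sketch)
```

Transcripts that are not assigned to a gene in the annotation appear with NA in the GENEID column.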

Usage

tr2g_TxDb(txdb)

Arguments

txdb

A TxDb object with gene annotation.

Value

A data frame with 3 columns: gene for gene ID, transcript for transcript ID, and tx_id for internal transcript IDs used to avoid duplicate transcript names. For TxDb packages from Bioconductor, the gene ID is the Entrez ID; for TxDb.Hsapiens.UCSC.hg38.knownGene, the transcript IDs are Ensembl IDs with version numbers. In some cases, transcript IDs are duplicated, and this is resolved by adding numbers to make the IDs unique.
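As a rough illustration of making duplicated IDs unique by adding numbers, the base R function make.unique can be used as below. This is a sketch of the general idea only; the exact suffix scheme used by tr2g_TxDb may differ, and the IDs and the "_" separator here are arbitrary choices for the example.

```r
# Example transcript IDs in which the first ID appears twice
tx_names <- c("ENST00000456328.2", "ENST00000450305.2", "ENST00000456328.2")

# make.unique() leaves the first occurrence as is and appends a
# numeric suffix to each later duplicate
unique_names <- make.unique(tx_names, sep = "_")
unique_names

# No duplicates remain after the suffixes are added
anyDuplicated(unique_names) == 0
```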

A data frame with 3 columns: gene for gene ID, transcript for transcript ID, and gene_name for gene names. If other_attrs has been specified, then those will also be columns in the data frame returned.

See Also

Other functions to retrieve transcript and gene info: sort_tr2g, tr2g_EnsDb, tr2g_ensembl, tr2g_fasta, tr2g_gff3, tr2g_gtf, transcript2gene

Examples

library(TxDb.Hsapiens.UCSC.hg38.knownGene)
tr2g_TxDb(TxDb.Hsapiens.UCSC.hg38.knownGene)

sarangian/deaRscripts documentation built on Dec. 12, 2019, 12:48 a.m.