tr2g_EnsDb | R Documentation |
Bioconductor provides Ensembl genome annotation in AnnotationHub; older versions of Ensembl annotation can be obtained from packages like EnsDb.Hsapiens.v86. This is an alternative to querying Ensembl with biomart; Ensembl's server seems to be less stable than that of Bioconductor. However, more information and species are available on Ensembl biomart than on AnnotationHub.
tr2g_EnsDb(
ensdb,
Genome = NULL,
get_transcriptome = TRUE,
out_path = ".",
write_tr2g = TRUE,
other_attrs = NULL,
use_gene_name = TRUE,
use_transcript_version = TRUE,
use_gene_version = TRUE,
transcript_biotype_col = "TXBIOTYPE",
gene_biotype_col = "GENEBIOTYPE",
transcript_biotype_use = "all",
gene_biotype_use = "all",
chrs_only = TRUE,
compress_fa = FALSE,
overwrite = FALSE
)
ensdb |
Ann |
Genome |
Either a |
get_transcriptome |
Logical, whether to extract transcriptome from genome with the GTF file. If filtering biotypes or chromosomes, the filtered |
out_path |
Directory to save the outputs written to disk. If this directory does not exist, then it will be created. Defaults to the current working directory. |
write_tr2g |
Logical, whether to write tr2g to disk. If |
other_attrs |
Character vector. Other attributes to get from the |
use_gene_name |
Logical, whether to get gene names. |
use_transcript_version |
Logical, whether to include version number in the Ensembl transcript ID. To decide whether to include transcript version number, check whether version numbers are included in the |
use_gene_version |
Logical, whether to include version number in the Ensembl gene ID. Unlike transcript version number, it's up to you whether to include gene version number. |
transcript_biotype_col |
Character vector of length 1. Tag in |
gene_biotype_col |
Character vector of length 1. Tag in |
transcript_biotype_use |
Character, can be "all" or a vector of transcript biotypes to be used. Transcript biotypes aren't entirely the same as gene biotypes. For instance, in Ensembl annotation, |
gene_biotype_use |
Character, can be "all", "cellranger", or a vector of gene biotypes to be used. If "cellranger", then the biotypes used by Cell Ranger's reference are used. See |
chrs_only |
Logical, whether to include chromosomes only, since GTF and GFF files can contain annotations for scaffolds, which are not incorporated into chromosomes. This will also exclude haplotypes. Defaults to |
compress_fa |
Logical, whether to compress the output fasta file. If |
overwrite |
Logical, whether to overwrite if files with names of outputs written to disk already exist. |
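The use_transcript_version and use_gene_version arguments control whether the returned Ensembl IDs carry a version suffix. As a rough illustration of the difference, the base R sketch below removes a trailing ".<digits>" from a few example IDs. The helper strip_version() is hypothetical and is not part of this package; it only shows what versioned and unversioned IDs look like, not how tr2g_EnsDb() handles them internally.

```r
# Example Ensembl-style transcript IDs; the part after the dot is the version.
tx_ids <- c("ENST00000456328.2", "ENST00000450305.2", "ENST00000488147.1")

# Hypothetical helper (an assumption for illustration, not a package function):
# drop a trailing ".<digits>" version suffix if one is present.
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

tx_ids_no_version <- strip_version(tx_ids)
print(tx_ids_no_version)

# IDs that have no version suffix are returned unchanged.
print(strip_version("ENST00000456328"))
```

Whichever form is chosen, the transcript IDs should match the form used in the transcriptome they will be compared against, otherwise lookups by ID will fail.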
A data frame with at least 2 columns: gene for gene ID, transcript for transcript ID, and optionally gene_name for gene names. If other_attrs has been specified, then those will also be columns in the data frame returned.
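To make the shape of the return value concrete, here is a small base R sketch that builds a toy data frame with the documented columns (gene, transcript, gene_name) and performs two common follow-up steps. The rows are made up for illustration and are not output from tr2g_EnsDb(); the object names n_tx and tx_to_gene are likewise assumptions of this sketch.

```r
# Toy data frame with the documented columns; values are illustrative only.
tr2g <- data.frame(
  gene = c("ENSG00000223972", "ENSG00000223972", "ENSG00000227232"),
  transcript = c("ENST00000456328", "ENST00000450305", "ENST00000488147"),
  gene_name = c("DDX11L1", "DDX11L1", "WASH7P"),
  stringsAsFactors = FALSE
)

# Count how many transcripts are listed for each gene.
n_tx <- table(tr2g$gene)
print(n_tx)

# Named character vector for looking up the gene of a given transcript.
tx_to_gene <- setNames(tr2g$gene, tr2g$transcript)
print(tx_to_gene[["ENST00000488147"]])
```

A data frame returned by the function could presumably be handled the same way, with any columns requested through other_attrs appearing alongside these.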
ensembl_gene_biotypes ensembl_tx_biotypes cellranger_biotypes
Other functions to retrieve transcript and gene info: sort_tr2g(), tr2g_TxDb(), tr2g_ensembl(), tr2g_fasta(), tr2g_gff3(), tr2g_gtf(), transcript2gene()
library(EnsDb.Hsapiens.v86)
tr2g_EnsDb(EnsDb.Hsapiens.v86, get_transcriptome = FALSE, write_tr2g = FALSE,
use_transcript_version = FALSE,
use_gene_version = FALSE)