transcript2gene | R Documentation |
This function is a shortcut to get the correctly sorted data frame with
transcript IDs and the corresponding gene IDs from Ensembl biomart or Ensembl
transcriptome FASTA files. For a biomart query, it calls tr2g_ensembl and then
sort_tr2g. For FASTA files, it calls tr2g_fasta and then sort_tr2g. Unlike in
tr2g_ensembl and tr2g_fasta, multiple species can be supplied if cells from
different species were sequenced together. This function should only be used
if the kallisto index was built with transcriptomes from Ensembl. Also, if
querying biomart, please make sure to set ensembl_version to match the Ensembl
version from which the transcriptomes were downloaded.
transcript2gene(
species,
fasta_file,
kallisto_out_path,
type = "vertebrate",
...
)
species |
A character vector of Latin names of species present in this scRNA-seq dataset. This is used to retrieve Ensembl information from biomart. |
fasta_file |
Character vector of paths to the transcriptome FASTA files
used to build the kallisto index. Exactly one of |
kallisto_out_path |
Path to the |
type |
A character vector indicating the type of each species. Each
element must be one of "vertebrate", "metazoa", "plant", "fungus", and
"protist". If length is 1, then this type will be used for all species specified
here. Can be missing if |
... |
Other arguments passed to |
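The recycling rule described for type can be pictured with a small base R sketch. This is only an illustration of the documented behavior, not the package's internal code; the variable names allowed and type_per_species are hypothetical.

```r
# Species present in the dataset, as in the example for this function.
species <- c("Homo sapiens", "Mus musculus")
# A single type, to be used for every species.
type <- "vertebrate"

# Values the documentation lists as valid for each element of `type`.
allowed <- c("vertebrate", "metazoa", "plant", "fungus", "protist")

# `type` must contain only allowed values and be of length 1
# or the same length as `species`.
stopifnot(all(type %in% allowed),
          length(type) %in% c(1L, length(species)))

# Recycle a length-1 `type` so that each species has its own entry.
type_per_species <- rep_len(type, length(species))
names(type_per_species) <- species
```

With a mixed-species dataset, passing a type vector of the same length as species would leave the values unchanged.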
A data frame with two columns: gene and transcript, with Ensembl gene and
transcript IDs (with version number), in the same order as in the
transcriptome index used in kallisto.
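What "in the same order as in the transcriptome index" means can be shown with a minimal base R sketch. It assumes the transcript order of the index is already available as a character vector; the IDs and the names index_order and tr2g_sorted are made up for illustration, and this is not the implementation used by sort_tr2g.

```r
# Hypothetical transcript-to-gene table in arbitrary order (IDs are made up).
tr2g <- data.frame(
  gene = c("ENSG0000000002.1", "ENSG0000000001.1", "ENSG0000000001.1"),
  transcript = c("ENST0000000003.1", "ENST0000000001.1", "ENST0000000002.1"),
  stringsAsFactors = FALSE
)

# Hypothetical order of transcripts in the kallisto index.
index_order <- c("ENST0000000001.1", "ENST0000000002.1", "ENST0000000003.1")

# Reorder the rows so that row i describes transcript i of the index.
tr2g_sorted <- tr2g[match(index_order, tr2g$transcript), ]
rownames(tr2g_sorted) <- NULL

# Every transcript in the index must be present in the table.
stopifnot(!anyNA(tr2g_sorted$transcript))
```

Matching on versioned IDs only works if the table and the index come from the same Ensembl release, which is why ensembl_version matters for biomart queries.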
This function has been superseded by the new versions of the tr2g_* functions, which can extract the transcriptome for only the specified biotypes and only the standard chromosomes. The new tr2g_* functions also sort the transcriptome so that the tr2g data frame and the transcriptome have transcripts in the same order.
Other functions to retrieve transcript and gene info:
sort_tr2g(), tr2g_EnsDb(), tr2g_TxDb(), tr2g_ensembl(), tr2g_fasta(),
tr2g_gff3(), tr2g_gtf()
# Download dataset already in BUS format
library(TENxBUSData)
TENxBUSData(".", dataset = "hgmm100")
tr2g <- transcript2gene(c("Homo sapiens", "Mus musculus"),
  type = "vertebrate", save_filtered = FALSE,
  ensembl_version = 99, kallisto_out_path = "./out_hgmm100")
# Clean up files from the example; out_hgmm100 is a directory,
# so unlink() needs recursive = TRUE to remove it
unlink("out_hgmm100", recursive = TRUE)