tr2g_GRanges: Get transcript and gene info from GRanges

View source: R/tr2g.R

Description

For internal use. Extracts transcript and gene information from a GRanges object obtained from a GTF file.

Usage

tr2g_GRanges(gr, type_use = "exon", transcript_id = "transcript_id",
  gene_id = "gene_id", gene_name = "gene_name",
  transcript_version = "transcript_version",
  gene_version = "gene_version", version_sep = ".")

Arguments

gr

A GRanges object. The metadata columns should be atomic vectors, not lists.

type_use

Character vector of the values in the type field of the GTF file that denote the desired transcripts, such as "exon", "transcript", or "mRNA".

transcript_id

Character vector of length 1. Tag in the attribute field corresponding to transcript IDs. This argument must be supplied and cannot be NA or NULL. An error is thrown if the tag given in this argument does not exist.

gene_id

Character vector of length 1. Tag in the attribute field corresponding to gene IDs. This argument must be supplied and cannot be NA or NULL. Note that gene IDs are different from gene symbols, which do not have to be unique. The IDs can be Ensembl or Entrez IDs. However, if the gene symbols are in fact unique for each gene, you may supply the tag for human-readable gene symbols to this argument. An error is thrown if the tag given in this argument does not exist.

gene_name

Character vector of length 1. Tag in the attribute field corresponding to gene symbols. This argument can be NA or NULL if you are fine with gene IDs that are not human readable and do not wish to extract human-readable gene symbols.

transcript_version

Character vector of length 1. Tag in the attribute field corresponding to the transcript version number. If your GTF file does not include transcript version numbers, or if you do not wish to include the version number, then use NULL for this argument. To decide whether to include the transcript version number, check whether version numbers are included in the transcripts.txt file in the kallisto output directory. If that file includes version numbers, then transcript version numbers must be included here as well. If that file does not include version numbers, then transcript version numbers must not be included here.
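
The check against transcripts.txt described above can be sketched in base R. This is only an illustration: the temporary file and its two IDs are made up to stand in for the real transcripts.txt in your kallisto output directory, and the pattern assumes Ensembl-style IDs in which the version follows a ".".

```r
# Stand-in for transcripts.txt from the kallisto output directory.
# The IDs below are hypothetical; replace `tx_file` with the path to your own file.
tx_file <- tempfile(fileext = ".txt")
writeLines(c("ENST00000456328.2", "ENST00000450305.2"), tx_file)

ids <- readLines(tx_file)

# TRUE for each ID that ends in "." followed by one or more digits
ends_in_version <- grepl("\\.[0-9]+$", ids)

# TRUE only if every ID carries a version number
all_versioned <- all(ends_in_version)
```

If all_versioned is TRUE, keep the transcript_version tag; if no ID carries a version, NULL would be the matching choice for transcript_version.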

gene_version

Character vector of length 1. Tag in attribute field corresponding to gene version number. If your GTF file does not include gene version numbers, or if you do not wish to include the version number, then use NULL for this argument. Unlike transcript version number, it's up to you whether to include gene version number.

version_sep

Character that separates the main ID from the version number. Defaults to ".", as in Ensembl.
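
As a rough illustration of what version_sep controls, the snippet below joins an ID and a version number in base R. The IDs and versions are hypothetical, and the use of paste() is an assumption about how such a join could be done, not a statement of how this function is implemented.

```r
# Hypothetical values, as they might appear under the transcript_id and
# transcript_version tags of a GTF file
transcript_id <- c("ENST00000456328", "ENST00000450305")
transcript_version <- c("2", "2")
version_sep <- "."

# Join main ID and version number with the separator
versioned_id <- paste(transcript_id, transcript_version, sep = version_sep)
```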

Value

A data frame with at least 2 columns: gene for gene ID, transcript for transcript ID, and optionally, gene_name for gene names.
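
To make the shape of the result concrete, here is a minimal base R sketch that builds a table with these columns from a plain data frame standing in for the metadata columns of a GRanges object. All values are invented, the filtering and de-duplication steps are assumptions about what such a function might do, and the column order is illustrative only.

```r
# Stand-in for GTF metadata columns (all values are made up)
gtf_meta <- data.frame(
  type          = c("gene", "transcript", "exon", "exon", "exon"),
  gene_id       = c("ENSG01", "ENSG01", "ENSG01", "ENSG01", "ENSG02"),
  transcript_id = c(NA, "ENST01", "ENST01", "ENST01", "ENST02"),
  gene_name     = c("GeneA", "GeneA", "GeneA", "GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

# Keep only rows whose type matches type_use
type_use <- "exon"
kept <- gtf_meta[gtf_meta$type %in% type_use, ]

# One row per transcript, with the column names described under Value
tr2g <- unique(data.frame(
  gene       = kept$gene_id,
  transcript = kept$transcript_id,
  gene_name  = kept$gene_name,
  stringsAsFactors = FALSE
))
```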


sarangian/RNASeqDEA documentation built on Dec. 8, 2019, 5:24 p.m.