makeTx2geneFromGtf: Make tx2gene data.frame from a GTF file
In jmw86069/jambio: Analysis and Visualization of Gene Splice Variants and Transcriptome Data

makeTx2geneFromGtf

R Documentation

Make tx2gene data.frame from a GTF file

Description

Make tx2gene data.frame from a GTF file

Usage

makeTx2geneFromGtf(
  GTF,
  geneAttrNames = c("gene_id", "gene_name", "gene_type", "range"),
  txAttrNames = c("transcript_id", "transcript_type", "range"),
  geneFeatureType = "gene",
  txFeatureType = c("transcript", "mRNA"),
  nrows = -1L,
  zcat_command = "zcat",
  verbose = FALSE,
  ...
)

Arguments

`GTF`	`character` file name sent to `data.table::fread()`. When the file name ends with ".gz", the `R.utils` package is recommended; otherwise the fallback is a system call to the command given by `zcat_command` (for example `zcat` or `gzcat`) to gunzip the file during the import step. Note this fallback fails when that command is not available in the path of the user environment. In general, the `R.utils` package is the best solution.
`geneAttrNames`	`character` vector of recognized attribute names as they appear in column 9 of the GTF file, for gene rows. The defaults include typical entries in Gencode, plus "range" which creates one field with format "chromosome:start-end:strand".
`txAttrNames`	`character` vector of recognized attribute names as they appear in column 9 of the GTF file, for transcript rows. The defaults include typical entries in Gencode, plus "range" which creates one field with format "chromosome:start-end:strand".
`geneFeatureType`	`character` value to match column 3 of the GTF file, used to define gene rows, by default "gene".
`txFeatureType`	`character` value to match column 3 of the GTF file, used to define transcript rows, by default "transcript". In some GTF files, "mRNA" is used, so either is accepted by default.
`nrows`	`integer` number of rows to read from the GTF file, by default -1 means all rows are imported. This parameter is useful to check the results of a large GTF file using only a subset portion of the file.
`zcat_command`	`character` name or path to zcat or gzcat executable, only used when input `GTF` is a file with `".gz"` extension, and when R package `R.utils` is not available.
`verbose`	`logical` whether to print verbose output during processing.
`...`	additional arguments are ignored.
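
The feature-type matching and the "range" field described above can be illustrated with a small base R sketch. This is not the jambio implementation: the toy table, its column names, and the helper `make_range()` are assumptions made up for illustration. Only the default feature types and the "chromosome:start-end:strand" format come from the argument descriptions.

```r
# Toy stand-in for GTF columns 1, 3, 4, 5 and 7 (all values are made up).
gtf <- data.frame(
  seqname = c("chr1", "chr1", "chr1", "chr2"),
  feature = c("gene", "transcript", "exon", "mRNA"),
  start   = c(100L, 100L, 100L, 5000L),
  end     = c(900L, 900L, 250L, 7500L),
  strand  = c("+", "+", "+", "-"),
  stringsAsFactors = FALSE
)

# Defaults taken from the Usage section.
geneFeatureType <- "gene"
txFeatureType <- c("transcript", "mRNA")

# Keep rows whose feature type (GTF column 3) matches.
gene_rows <- gtf[gtf$feature %in% geneFeatureType, ]
tx_rows <- gtf[gtf$feature %in% txFeatureType, ]

# Hypothetical helper: build the "chromosome:start-end:strand" string.
make_range <- function(x) {
  paste0(x$seqname, ":", x$start, "-", x$end, ":", x$strand)
}

gene_range <- make_range(gene_rows)
tx_range <- make_range(tx_rows)
print(tx_range)
```

Because `txFeatureType` holds two values, both the "transcript" row and the "mRNA" row are kept, while the "exon" row is dropped.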

Details

Creates a transcript-to-gene data.frame from a GTF file. This table is required by a number of transcriptome analysis methods, such as those in the DEXSeq package and limma package functions such as diffSplice().

This function uses only data.table::fread() and does not import the full GTF file with a package such as Bioconductor GenomicFeatures, because the data.table method is markedly faster when only the transcript-to-gene relationship is needed. This method also allows the import of more annotations than are supported by the typical Bioconductor rtracklayer::import() for GTF data.
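
The general idea of reading a GTF file as plain tab-delimited text and pulling the transcript-to-gene relationship from column 9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the function's actual code: base R `read.delim()` stands in for `data.table::fread()` so the sketch has no package dependencies, the GTF content is made up, and `extract_gtf_attr()` is a hypothetical helper.

```r
# Hypothetical helper (not part of jambio): extract one attribute value
# from GTF column 9 strings such as 'gene_id "G1"; transcript_id "T1";'.
extract_gtf_attr <- function(attr_string, attr_name) {
  pattern <- paste0('(^|;)[[:space:]]*', attr_name, ' "([^"]*)"')
  hits <- regmatches(attr_string, regexec(pattern, attr_string))
  # Each hit holds: full match, leading delimiter, attribute value.
  vapply(hits,
    function(h) if (length(h) == 3L) h[3L] else NA_character_,
    character(1), USE.NAMES = FALSE)
}

# Write a tiny made-up GTF file so the sketch is self-contained.
gtf_row <- function(chrom, type, start, end, strand, attrs) {
  paste(chrom, "toy", type, start, end, ".", strand, ".", attrs, sep = "\t")
}
gtf_lines <- c(
  gtf_row("chr1", "gene", 100, 900, "+", 'gene_id "G1"; gene_name "ABC1";'),
  gtf_row("chr1", "transcript", 100, 900, "+", 'gene_id "G1"; transcript_id "T1";'),
  gtf_row("chr1", "transcript", 200, 900, "+", 'gene_id "G1"; transcript_id "T2";'),
  gtf_row("chr1", "exon", 100, 250, "+", 'gene_id "G1"; transcript_id "T1";'),
  gtf_row("chr2", "gene", 5000, 7500, "-", 'gene_id "G2"; gene_name "XYZ2";'),
  gtf_row("chr2", "mRNA", 5000, 7500, "-", 'gene_id "G2"; transcript_id "T3";')
)
gtf_file <- tempfile(fileext = ".gtf")
writeLines(gtf_lines, gtf_file)

# Read the nine GTF columns as text; quote = "" keeps the double quotes
# inside column 9 intact.
gtf <- read.delim(gtf_file, header = FALSE, sep = "\t", quote = "",
  comment.char = "#", stringsAsFactors = FALSE,
  col.names = c("seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attributes"))

# Keep transcript rows and extract the two identifiers.
tx_rows <- gtf[gtf$feature %in% c("transcript", "mRNA"), ]
tx2gene <- data.frame(
  transcript_id = extract_gtf_attr(tx_rows$attributes, "transcript_id"),
  gene_id = extract_gtf_attr(tx_rows$attributes, "gene_id"),
  stringsAsFactors = FALSE
)
print(tx2gene)
```

Only the rows and attributes of interest are parsed, which is the reason a plain-text reader can be faster here than building a full annotation object.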

This function is intended to help keep all transcript data consistent by using the same GTF file that is also used by other analysis tools, whether those tools are based in R or, more likely, run outside R in a terminal environment.

For example, the GTF file could be used to:

run STAR sequence alignment, then Rsubread::featureCounts() to generate a matrix of read counts per gene, transcript, or exon
generate a transcript FASTA sequence file, run a k-mer quantification tool such as Salmon or kallisto, then use tximport::tximport() to import the results into R for downstream processing.
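
To show why a transcript-to-gene table matters downstream, the sketch below collapses transcript-level counts to gene level with base R. It is a simplified illustration, not what tximport::tximport() does internally: tximport also handles abundance and transcript length, and it expects its `tx2gene` argument to have the transcript ID in the first column and the gene ID in the second. The counts, sample names, and IDs here are made up.

```r
# Made-up tx2gene table: transcript ID first, gene ID second.
tx2gene <- data.frame(
  transcript_id = c("T1", "T2", "T3"),
  gene_id = c("G1", "G1", "G2"),
  stringsAsFactors = FALSE
)

# Made-up transcript-level counts (rows = transcripts, columns = samples).
# matrix() fills column by column: sampleA = 10, 5, 7; sampleB = 20, 0, 3.
tx_counts <- matrix(
  c(10, 5, 7,
    20, 0, 3),
  nrow = 3,
  dimnames = list(c("T1", "T2", "T3"), c("sampleA", "sampleB"))
)

# Look up the gene for each transcript row, then sum rows per gene.
gene_of_tx <- tx2gene$gene_id[match(rownames(tx_counts), tx2gene$transcript_id)]
gene_counts <- rowsum(tx_counts, group = gene_of_tx)
print(gene_counts)
```

The column order returned by makeTx2geneFromGtf() depends on `geneAttrNames` and `txAttrNames`, so the columns may need to be reordered before the table is passed to a tool that expects transcript ID first.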

Value

data.frame with colnames defined by geneAttrNames and txAttrNames.
