gtf_stats: Get Gene Length and GC: GTF/Fasta

View source: R/functions.R

gtf_stats    R Documentation

Get Gene Length and GC: GTF/Fasta

Description

Get the length and GC content of 'gene' from a data frame ('df') that contains gene IDs in the column named by the value of 'column_id'. 'df' must have a column labeled sequence that holds the sequence of each exon or transcript, and feature start and stop coordinates in the columns named by the values of 'start' and 'stop'. Returns a vector of 'gene', length, and GC content as a percentage.

Usage

gtf_stats(i, data, genome)

Arguments

i

A character string giving a gene ID.

data

A data frame of exons, transcripts, or genes imported from a GTF file, with one line per feature. data$V1 corresponds to the feature chromosome, data$V4 to the start coordinate, data$V5 to the stop coordinate, and data$V7 to the sequence strand.

genome

A list of DNAStringSet objects named by their contig (chromosome) name.
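
Examples

The following is a minimal sketch of the kind of calculation described above, not the package's implementation. It uses base R only, so plain character strings stand in for the DNAStringSet objects in 'genome'. The function name gene_length_gc and the gene_id column are hypothetical; the documentation does not say where the gene ID is stored in 'data'. The sketch simply concatenates every feature of the gene, so overlapping features would be counted more than once, and it ignores data$V7 because length and GC content are the same on either strand.

```r
# Hypothetical stand-in for gtf_stats(); base R only.
# Assumptions (not from the documentation): `data` has a `gene_id` column,
# and each element of `genome` is a plain character string.
gene_length_gc <- function(i, data, genome) {
  # Keep only the features that belong to gene `i`.
  feats <- data[data$gene_id == i, , drop = FALSE]

  # Cut each feature out of its contig. GTF coordinates are 1-based and
  # inclusive, which matches substr().
  seqs <- mapply(
    function(chr, start, stop) substr(genome[[chr]], start, stop),
    as.character(feats$V1), feats$V4, feats$V5
  )

  # Split the concatenated sequence into single bases and count G and C.
  bases <- strsplit(paste(seqs, collapse = ""), "")[[1]]
  len <- length(bases)
  gc <- 100 * sum(toupper(bases) %in% c("G", "C")) / len

  # Returned as a list so that length and gc stay numeric.
  list(gene = i, length = len, gc = gc)
}

# Toy genome with one 18-base contig and a GTF-like table of three features.
genome <- list(chr1 = "ATGCGCGCATATATGCGC")
data <- data.frame(
  V1 = c("chr1", "chr1", "chr1"),
  V4 = c(3, 13, 9),
  V5 = c(8, 16, 12),
  V7 = c("+", "+", "-"),
  gene_id = c("geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

res <- gene_length_gc("geneA", data, genome)
res$length  # 10
res$gc      # 80
```

Returning a list rather than an atomic vector is a choice made for this sketch: combining the gene ID with the two numbers using c() would coerce them all to character.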


Sage-Bionetworks/sageseqr documentation built on June 13, 2024, 2:11 p.m.