ParseMetaFromGtfFile: Helper function to parse length and GC content information...


View source: R/Helper.R

Description

When applying the UPC_RNASeq function, it is possible to correct for the length and GC content of genomic features. To accomplish this, an annotation file indicating these values for each feature must be provided. This helper function enables users to generate an annotation file, using a GTF file and genome FASTA file as references.
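To make the two quantities concrete, the following is a minimal base-R sketch of computing the length and GC fraction of a feature sequence. It is not the package's implementation: `seq_length_and_gc`, the sequences, and the identifiers `geneA` and `geneB` are all hypothetical, and whether the generated annotation file stores GC content as a fraction or in some other form is an assumption here.

```r
# Hypothetical helper, not part of SCAN.UPC:
# returns the length and GC fraction of a single nucleotide sequence.
seq_length_and_gc <- function(sequence) {
  bases <- strsplit(toupper(sequence), "")[[1]]  # split into single characters
  n_gc <- sum(bases %in% c("G", "C"))            # count G and C bases
  c(length = length(bases), gc = n_gc / length(bases))
}

# Made-up feature sequences keyed by made-up identifiers.
features <- c(geneA = "ATGCGC", geneB = "ATATATGC")

# One row per feature, with columns "length" and "gc".
meta <- t(sapply(features, seq_length_and_gc))
print(meta)
```

In practice the sequences would come from the reference FASTA file(s) at the coordinates listed in the GTF file, rather than being typed in by hand.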

Usage

ParseMetaFromGtfFile(gtfFilePath, fastaFilePattern, outFilePath,
                     featureTypes=c("protein_coding"), attributeType="gene_id")

Arguments

gtfFilePath

Path to the GTF file that will be parsed.

fastaFilePattern

File pattern that indicates where FASTA file(s) for the associated reference genome can be found.

outFilePath

Path where the output file will be stored.

featureTypes

One or more feature types (for example, "protein_coding" or "unprocessed_pseudogene") that should be extracted from the GTF file. The default is "protein_coding".

attributeType

The type of attribute (for example, "gene_id" or "transcript_id") to be parsed. Values will be grouped according to this attribute. The default is "gene_id".
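As a rough illustration of what grouping by an attribute means, the sketch below sums exon lengths per "gene_id" and per "transcript_id" for two made-up GTF records, using base R only. The helper `get_attribute`, the records, and the identifiers `G1`, `T1`, and `T2` are hypothetical; how ParseMetaFromGtfFile actually parses and aggregates records internally is not shown on this page, so this should be read as an assumption-laden sketch rather than a description of its behavior.

```r
# Two made-up GTF records (9 tab-separated columns; the last holds attributes).
gtf_lines <- c(
  paste("1", "protein_coding", "exon", "100", "199", ".", "+", ".",
        "gene_id \"G1\"; transcript_id \"T1\";", sep = "\t"),
  paste("1", "protein_coding", "exon", "300", "349", ".", "+", ".",
        "gene_id \"G1\"; transcript_id \"T2\";", sep = "\t")
)

# Hypothetical helper: extract the value of one attribute from each record.
get_attribute <- function(lines, attribute_type) {
  attributes <- sapply(strsplit(lines, "\t"), `[`, 9)
  pattern <- paste0(".*", attribute_type, " \"([^\"]+)\".*")
  sub(pattern, "\\1", attributes)
}

# Exon lengths from the start and end columns.
fields <- strsplit(gtf_lines, "\t")
start <- as.integer(sapply(fields, `[`, 4))
end <- as.integer(sapply(fields, `[`, 5))
exon_length <- end - start + 1  # GTF coordinates are 1-based and inclusive

# Grouping by gene_id yields one value; grouping by transcript_id yields two.
by_gene <- tapply(exon_length, get_attribute(gtf_lines, "gene_id"), sum)
by_transcript <- tapply(exon_length, get_attribute(gtf_lines, "transcript_id"), sum)
print(by_gene)
print(by_transcript)
```

The same records therefore produce one row per gene or one row per transcript, depending on which attribute is chosen for grouping.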

Author(s)

Stephen R. Piccolo

Examples

## Not run: 
ParseMetaFromGtfFile("GRCh37_XY.gtf", "GRCh37.fa", "GRCh37_Annotation.txt")

## End(Not run)

SCAN.UPC documentation built on Nov. 1, 2018, 2:22 a.m.