prePonder: PONDER workflow: Prepare PONDER analysis


View source: R/workflow.R

Description

Imports query and reference transcript annotation files, matches chromosome names and gene IDs across the inputs, and prepares the data for NMD (nonsense-mediated decay) prediction analysis.

Usage

prePonder(
  query,
  reference,
  fasta,
  query_format = NULL,
  reference_format = NULL,
  match_chrom = FALSE,
  match_geneIDs = FALSE,
  primary_gene_id = NULL,
  secondary_gene_id = NULL
)

Arguments

query

Mandatory. Path to query GTF/GFF3 transcript annotation file.

reference

Mandatory. Path to reference GTF/GFF3 transcript annotation file.

fasta

Mandatory. BSgenome object (preferred) or path to a FASTA file.

query_format

Optional. Format of the query annotation ('gtf' or 'gff3'). Mandatory if the query filename has a '.txt' extension.

reference_format

Optional. Format of the reference annotation ('gtf' or 'gff3'). Mandatory if the reference filename has a '.txt' extension.
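The format rule described for query_format and reference_format can be pictured with a small base R sketch. The helper name guess_format is hypothetical and is not part of the package; it only illustrates why an explicit format is needed when the extension (such as '.txt') carries no information.

```r
# Hypothetical helper (not part of PONDER): choose an annotation format
# from an explicit argument or, failing that, from the file extension.
guess_format <- function(path, format = NULL) {
  allowed <- c("gtf", "gff3")
  if (!is.null(format)) {
    # An explicit format always wins
    return(match.arg(tolower(format), allowed))
  }
  ext <- tolower(tools::file_ext(path))
  if (ext %in% allowed) {
    return(ext)
  }
  # e.g. 'annotation.txt': the extension says nothing about the format
  stop("Cannot infer format from '", path, "'; supply 'gtf' or 'gff3'")
}

guess_format("query.gtf")
guess_format("reference.txt", format = "gff3")
```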

match_chrom

Supplementary feature. If TRUE, the function attempts to match the chromosome names of the query and reference to those of the fasta genome, so that naming is consistent across all input files.
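As an illustration of what chromosome-name matching involves, the base R sketch below reconciles names that differ only by a 'chr' prefix. The function harmonise_chrom is hypothetical and is an assumption about the kind of mismatch being fixed, not the package's actual method; in Bioconductor workflows, seqlevelsStyle from GenomeInfoDb is a common tool for this task.

```r
# Hypothetical sketch (not PONDER's implementation): map annotation
# chromosome names onto the names used by the genome, handling only
# a missing or extra 'chr' prefix.
harmonise_chrom <- function(chroms, genome_chroms) {
  stripped <- sub("^chr", "", chroms)   # "chr1" -> "1"
  prefixed <- paste0("chr", stripped)   # "1"    -> "chr1"
  ifelse(chroms %in% genome_chroms, chroms,
    ifelse(prefixed %in% genome_chroms, prefixed,
      ifelse(stripped %in% genome_chroms, stripped, NA_character_)))
}

# Names that cannot be reconciled come back as NA
harmonise_chrom(c("1", "chrX", "MT"), c("chr1", "chrX", "chrM"))
```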

match_geneIDs

Supplementary feature. Attempts to match gene IDs in the query file to those in the reference file. This is key to grouping query transcripts with their reference gene families for comparison.

Matching is done at three levels with increasing accuracy:

1. Crudely intersect query coordinates with the reference. Invoked by setting match_geneIDs to TRUE.

2. Trim Ensembl-style gene IDs and attempt matching. Invoked by passing the name of the gene ID attribute in the GTF file (typically 'gene_id') to the primary_gene_id argument.

3. Replace the query gene ID with a secondary gene ID and attempt matching. Invoked by passing the name of the secondary gene ID attribute in the GTF file (for example 'ref_gene_id') to the secondary_gene_id argument.
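Levels 2 and 3 can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's code: it assumes that "trimming" means removing a trailing version suffix such as '.13', that the levels are tried in the order listed, and that an ID is only replaced when the candidate exists in the reference. The toy data frame and the MSTRG-style IDs are made up for the example.

```r
# Toy query annotation (illustrative values only)
query <- data.frame(
  gene_id     = c("ENSMUSG00000025902.13", "MSTRG.5", "MSTRG.9"),
  ref_gene_id = c(NA, "ENSMUSG00000033845", NA),
  stringsAsFactors = FALSE
)
ref_ids <- c("ENSMUSG00000025902", "ENSMUSG00000033845")

# Level 2 (assumed): strip a trailing ".<digits>" version suffix and
# keep the trimmed ID only if it is present in the reference
trimmed <- sub("\\.[0-9]+$", "", query$gene_id)
matched <- ifelse(trimmed %in% ref_ids, trimmed, query$gene_id)

# Level 3 (assumed): for IDs still unmatched, fall back to the
# secondary ID column when it points at a reference gene
use_secondary <- !(matched %in% ref_ids) &
  !is.na(query$ref_gene_id) &
  query$ref_gene_id %in% ref_ids
matched[use_secondary] <- query$ref_gene_id[use_secondary]

matched
```

Transcripts left unmatched after these steps (here the third row) would keep their original query gene ID.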

primary_gene_id

See match_geneIDs argument

secondary_gene_id

See match_geneIDs argument

Value

S4 object containing data frames and objects for downstream NMD prediction analysis.

Examples

library("BSgenome.Mmusculus.UCSC.mm10")
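# Match gene IDs by coordinate overlap only
preppedObject = prePonder(testQuery, testRef, Mmusculus, match_geneIDs = TRUE)
# Additionally match on trimmed primary IDs and on a secondary ID attribute
preppedObject = prePonder(testQuery, testRef, Mmusculus, match_geneIDs = TRUE, primary_gene_id = 'gene_id', secondary_gene_id = 'ref_gene_id')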

fursham-h/ponder documentation built on Dec. 27, 2019, 12:15 a.m.