prePonder: PONDER workflow: Prepare PONDER analysis
In fursham-h/ponder:


Imports the query and reference transcript annotation files, optionally matches chromosome levels and gene IDs across the inputs, and prepares the data for NMD prediction analysis.

prePonder(
  query,
  reference,
  fasta,
  query_format = NULL,
  reference_format = NULL,
  match_chrom = FALSE,
  match_geneIDs = FALSE,
  primary_gene_id = NULL,
  secondary_gene_id = NULL
)

`query`	Mandatory. Path to query GTF/GFF3 transcript annotation file.
`reference`	Mandatory. Path to reference GTF/GFF3 transcript annotation file.
`fasta`	Mandatory. BSgenome object (preferred) or path to a FASTA file.
`query_format`	Optional. Specifies the query annotation format ('gtf' or 'gff3'). Required if the query filename has a '.txt' extension.
`reference_format`	Optional. Specifies the reference annotation format ('gtf' or 'gff3'). Required if the reference filename has a '.txt' extension.
`match_chrom`	Supplementary feature. If TRUE, the program attempts to match the chromosome names of the query and reference to those of the FASTA genome, so that naming is consistent across input files.
`match_geneIDs`	Supplementary feature. Attempts to match gene IDs in the query file to those in the reference file, which is key to grouping query transcripts into reference gene families for comparison. Matching is done at three levels of increasing accuracy:
1. Crudely intersect query coordinates with the reference. Invoked by setting match_geneIDs to TRUE.
2. Trim Ensembl-style gene IDs and attempt matching. Invoked by providing the name of the gene ID header (typically 'gene_id') from the GTF file to the primary_gene_id argument.
3. Replace the query gene ID with a secondary gene ID and attempt matching. Invoked by providing the name of the secondary gene ID header (for example 'ref_gene_id') from the GTF file to the secondary_gene_id argument.
`primary_gene_id`	See match_geneIDs argument
`secondary_gene_id`	See match_geneIDs argument
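
The ID-based matching levels described under match_geneIDs can be illustrated with a small base R sketch. This is not the package's implementation: the helper `trim_ensembl_id`, the toy data, and the assumption that "trimming" means dropping a trailing version suffix are all hypothetical, chosen only to show the idea of trying the primary ID first and falling back to the secondary ID.

```r
# Hypothetical helper (not part of the package): strip a trailing version
# suffix from Ensembl-style IDs, e.g. "ENSMUSG00000000001.4".
trim_ensembl_id <- function(ids) {
  sub("\\.[0-9]+$", "", ids)
}

# Toy query annotation; column names mirror the GTF attributes named above.
query <- data.frame(
  gene_id     = c("ENSMUSG00000000001.4", "MSTRG.12", "MSTRG.7"),
  ref_gene_id = c(NA, "ENSMUSG00000000028", NA),
  stringsAsFactors = FALSE
)
reference_ids <- c("ENSMUSG00000000001", "ENSMUSG00000000028")

# Primary ID: trim it and keep it only if the reference contains it.
trimmed <- trim_ensembl_id(query$gene_id)
matched <- ifelse(trimmed %in% reference_ids, trimmed, NA_character_)

# Secondary ID: use it where the primary ID found no match.
fallback <- is.na(matched) & query$ref_gene_id %in% reference_ids
matched[fallback] <- query$ref_gene_id[fallback]

query$matched_gene_id <- matched
print(query)
```

In this toy data the third transcript stays unmatched by ID, which is the kind of case where only the coordinate-based intersection could assign a reference gene.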

An S4 object containing the data frames and objects needed for downstream NMD prediction analysis.


library("BSgenome.Mmusculus.UCSC.mm10")
# Match gene IDs by intersecting coordinates only
preppedObject <- prePonder(testQuery, testRef, Mmusculus, match_geneIDs = TRUE)
# Additionally match on the primary and secondary gene ID attributes
preppedObject <- prePonder(testQuery, testRef, Mmusculus, match_geneIDs = TRUE, primary_gene_id = 'gene_id', secondary_gene_id = 'ref_gene_id')