In easyRNASeq: Count summarization and normalization for RNA-Seq data

RNA-Seq parameters - RnaSeqParam

The final set of parameters we need to define encapsulates the AnnotParam and BamParam objects and details how the read summarization should be performed. simpleRNASeq supports A) two modes of counting:

by read
by bp

The latter was the default counting method of the easyRNASeq function. The more complex implementation it requires, the lack of evidence that it increases counting accuracy, and the extensive support for the read-based approach in mainstream, standardised Bioconductor packages have led the read method to become the default in simpleRNASeq. Because of a lack of time for maintenance and improvement, the bp-based method is also not recommended.

over B) four feature types: exon, transcript, gene, or any feature provided by the user. The latter may be used, for example, to count reads in promoter regions.

Given a flattened transcript structure - as created in a previous section - summarizing by transcripts or genes is equivalent. Note that using a non-flattened annotation with any feature type will result in multiple counting, i.e. the product of a single mRNA fragment will be counted once for every feature it overlaps, hence introducing a significant bias in downstream analyses.
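The multiple-counting problem can be illustrated with a toy example in base R. This is only a sketch and not how easyRNASeq counts reads internally: the coordinates, the identifiers (tx1, tx2, gene1) and the helper count_overlaps are all hypothetical, and the "flattening" shown here is simply the union of two overlapping transcripts.

```r
# Toy annotation: two overlapping transcripts of one hypothetical gene.
features <- data.frame(
  id    = c("tx1", "tx2"),
  start = c(100, 150),
  end   = c(300, 400)
)

# Three toy reads, given as start/end coordinates on the same sequence.
reads <- data.frame(
  start = c(120, 200, 350),
  end   = c(170, 250, 390)
)

# Hypothetical helper: for each feature, count the reads that overlap it
# by at least one base.
count_overlaps <- function(feat, rd) {
  vapply(seq_len(nrow(feat)), function(i) {
    sum(rd$start <= feat$end[i] & rd$end >= feat$start[i])
  }, integer(1))
}

# Non-flattened annotation: reads falling in the shared region are
# counted once per transcript.
non_flat_counts <- count_overlaps(features, reads)

# Simplistic flattening: merge the overlapping transcripts into one range.
flattened <- data.frame(
  id    = "gene1",
  start = min(features$start),
  end   = max(features$end)
)
flat_counts <- count_overlaps(flattened, reads)

sum(non_flat_counts)  # 5 counts reported for only 3 reads
sum(flat_counts)      # 3 counts, one per read
```

In this toy case, the two reads lying in the region shared by both transcripts are each counted twice when the annotation is not flattened, which is the bias described above.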

Given a flattened transcript structure, summarizing by exon enables the use of the resulting count table for processes such as differential exon usage analyses, as implemented in the DEXSeq package.

For the Robinson, Delhomme et al. dataset, we are interested in gene expression; hence we create our RnaSeqParam object as follows:

rnaSeqParam <- RnaSeqParam(annotParam = annotParam,
                           bamParam = bamParam,
                           countBy = "genes",
                           precision = "read")