easyRNASeq-package: Count summarization and normalization pipeline for Next...

Description Methods Author(s) See Also Examples

Description

Offers functionalities to summarize read counts per feature of interest, e.g. exons, transcripts, genes, etc. Offers functionalities to normalize the summarized counts using 3rd party packages like DESeq or edgeR.

Methods

The main function easyRNASeq will summarize the counts per feature of interest, for as many samples as provided and will return a count matrix (N*M) where N are the features and M the samples. This data can be corrected to RPKM in which case a matrix of corrected value is returned instead, with the same dimensions. Alternatively a RangedSummarizedExperiment can be returned and this is expected to be the default in the upcoming version of easyRNASeq (as of 1.5.x). If the necessary sample information are provided, the data can be normalized using either DESeq or edgeR and the corresponding package object returned. For more insider details, and step by step functions, see:

ShortRead methods for pre-processing the data. easyRNASeq annotation methods for getting the annotation. easyRNASeq coverage methods for computing the coverage from a Short Read Alignment file. easyRNASeq summarization methods for summarizing the data. easyRNASeq correction methods for correcting the data (i.e. generating RPKM). edgeR methods for post-processing the data. DESeq methods for post-processing the data.

Author(s)

Nicolas Delhomme, Bastian Schiffthaler, Ismael Padioleau

See Also

The class RNAseq specification: RNAseq

The default output class specification: RangedSummarizedExperiment

The imported packages: biomaRt BiocParallel edgeR genomeIntervals Biostrings BSgenome DESeq GenomicRanges IRanges Rsamtools ShortRead

The suggested packages: parallel GenomicFeatures

The following classes and functions that are made available from other packages:

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 # the data
 tdir <- tutorialData()

  # get the example annotation file - we retrieve a gtf file from GitHub
  annot <- fetchData("Drosophila_melanogaster.BDGP5.77.with-chr.gtf.gz")

  # create the AnnotParam
  annotParam <- AnnotParam(
    datasource=annot,
    type="gtf")

  # create the synthetic transcripts
  annotParam <- createSyntheticTranscripts(annotParam,verbose=FALSE)

 # create the RnaSeqParam
 rnaSeqParam <- RnaSeqParam(annotParam=annotParam,countBy="gene")

  # get the bamfiles (from the Bioc cache in this example)
 filenames <- dir(tdir,pattern="[A,T].*\\.bam$",full.names=TRUE)
 indexnames <- sapply(paste0(sub(".*_","",basename(filenames)),".bai"),fetchData)
 bamFiles <- getBamFileList(filenames,indexnames)

 # get a RangedSummarizedExperiment containing the counts table
  sexp <- simpleRNASeq(
      bamFiles=bamFiles,
      param=rnaSeqParam,
      verbose=TRUE
  )

  # get the counts
  assays(sexp)$genes

easyRNASeq documentation built on April 30, 2020, 2 a.m.