Annotation parameters - AnnotParam

The AnnotParam class is meant to store the minimal set of information necessary to retrieve the annotation.

The minimal information to provide is:

  1. a datasource: a path to a file, provided as a character string, or, if the type is biomaRt, the datasource you want to connect to, as retrieved using the biomaRt datasource function.
  2. a type: one of "gff3", "gtf", "rda", or "biomaRt"; "gff3" is the default. If "rda" is used, the corresponding file is expected to contain a GRanges object named gAnnot.
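As a minimal sketch of the "rda" case described above, the following saves a toy GRanges object under the required name gAnnot and points an AnnotParam at it. The coordinates, the metadata column names (gene, exon) and the file name are assumptions made up for illustration; the package may expect specific metadata columns that are not shown here.

```r
library(GenomicRanges)
library(easyRNASeq)

# Toy annotation: two exons of one hypothetical gene.
# Coordinates and metadata column names are illustrative assumptions.
gAnnot <- GRanges(
  seqnames = "Chr01",
  ranges   = IRanges(start = c(100, 300), end = c(200, 400)),
  strand   = "+",
  gene     = c("geneA", "geneA"),
  exon     = c("geneA.exon1", "geneA.exon2")
)

# The object has to be stored in the file under the name gAnnot.
rdaFile <- file.path(tempdir(), "annotation.rda")
save(gAnnot, file = rdaFile)

# The type is given explicitly because the default is "gff3".
annotFromRda <- AnnotParam(datasource = rdaFile, type = "rda")
```

For a gff3 file the type argument can be left out, since "gff3" is the default.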

In this tutorial, we will reproduce the analysis performed in Robinson, Delhomme et al. [@Robinson:2014p6362]. To do so, we start by downloading the original gff3 annotation file for P. trichocarpa, a species closely related to the trees used in the study, into the current directory.

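    # Download the annotation file into the current directory
    download.file(url=paste0("ftp://ftp.plantgenie.org/Data/PopGenIE/",
                             "Populus_trichocarpa/v3.0/v10.1/GFF3/",
                             "Ptrichocarpa_210_v3.0_gene_exons.gff3.gz"),
                  destfile="./Ptrichocarpa_210_v3.0_gene_exons.gff3.gz")

    # Alternatively, copy an already available local copy; vDir is assumed
    # to be defined earlier as the directory holding the tutorial data
    file.copy(dir(vDir,pattern="Ptrichocarpa_210_v3.0_gene_exons.gff3.gz",full.names = TRUE),
              "./Ptrichocarpa_210_v3.0_gene_exons.gff3.gz")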

Once the file is in place, we instantiate an AnnotParam object:

    annotParam <- AnnotParam(
        datasource="./Ptrichocarpa_210_v3.0_gene_exons.gff3.gz")

However, this annotation file contains multiple copies of the same exons, i.e. when exons are shared by several isoforms of a gene. This might result in so-called "multiple-counting"; as described in these guidelines[^1], we will circumvent that issue by creating a set of synthetic transcripts.
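To make the multiple-counting issue concrete, here is a small base R sketch on a made-up annotation table. All values are hypothetical, and the de-duplication step is only a simplified stand-in used to illustrate the problem, not the procedure the package uses to build synthetic transcripts.

```r
# Toy annotation (hypothetical values): gene "g1" has two isoforms that
# share the exon at 100-200, so that exon is listed twice.
exons <- data.frame(
  gene       = c("g1", "g1", "g1"),
  transcript = c("g1.1", "g1.2", "g1.2"),
  start      = c(100, 100, 300),
  end        = c(200, 200, 400)
)

# A single read aligned at 150-180 (hypothetical).
read <- c(start = 150, end = 180)

# Count how many annotation records the read overlaps.
countOverlaps <- function(annot, read) {
  sum(annot$start <= read[["end"]] & annot$end >= read[["start"]])
}

# Counting against the raw annotation hits the shared exon once per isoform.
naiveCount <- countOverlaps(exons, read)

# Keeping each exon once per gene removes the duplicated record.
uniqueExons <- unique(exons[, c("gene", "start", "end")])
dedupCount <- countOverlaps(uniqueExons, read)
```

Here the single read is counted twice against the raw table and once after the duplicated exon record is removed, which is the inflation a non-redundant gene model is meant to avoid.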





easyRNASeq documentation built on April 30, 2020, 2 a.m.