exampleZv9_annot: Example zebrafish annotation data
In charles-plessy/CAGEr: Analysis of CAGE (Cap Analysis of Gene Expression) sequencing data for precise mapping of transcription start sites and promoterome mining

exampleZv9_annot

R Documentation

Example zebrafish annotation data

Description

Annotation data for the interval 26000000-54000000 of zebrafish chromosome 17 (Zv9/danRer7 genome), to be used in documentation examples.

Usage

exampleZv9_annot

Format

An object of class GRanges of length 7467.
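
A minimal sketch of how the object might be loaded and inspected. It assumes that the CAGEr and GenomicRanges packages are installed and that the object carries the `type` metadata column created in the Details section; the code skips itself when the packages are missing.

```r
# Assumption: CAGEr and GenomicRanges are installed; otherwise nothing is loaded.
has_pkgs <- requireNamespace("CAGEr", quietly = TRUE) &&
  requireNamespace("GenomicRanges", quietly = TRUE)

if (has_pkgs) {
  suppressPackageStartupMessages(library(GenomicRanges))
  data("exampleZv9_annot", package = "CAGEr")

  print(length(exampleZv9_annot))                    # number of ranges
  print(seqlevels(exampleZv9_annot))                 # sequence names in use
  print(table(as.character(exampleZv9_annot$type)))  # ranges per feature type
} else {
  message("CAGEr or GenomicRanges not available; skipping the example.")
}
```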

Details

Data was retrieved from ENSEMBL's Biomart server using a query to extract gene, transcript and exon coordinates. For the record, here is the query as a URL (long, possibly overflowing).

http://mar2015.archive.ensembl.org/biomart/martview/78d86c1d6b4ef51568ba6d46f7d8b254?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.structure.ensembl_gene_id|drerio_gene_ensembl.default.structure.ensembl_transcript_id|drerio_gene_ensembl.default.structure.start_position|drerio_gene_ensembl.default.structure.end_position|drerio_gene_ensembl.default.structure.transcript_start|drerio_gene_ensembl.default.structure.transcript_end|drerio_gene_ensembl.default.structure.strand|drerio_gene_ensembl.default.structure.chromosome_name|drerio_gene_ensembl.default.structure.external_gene_name|drerio_gene_ensembl.default.structure.gene_biotype|drerio_gene_ensembl.default.structure.exon_chrom_start|drerio_gene_ensembl.default.structure.exon_chrom_end|drerio_gene_ensembl.default.structure.is_constitutive|drerio_gene_ensembl.default.structure.rank&FILTERS=&VISIBLEPANEL=resultspanel

And here it is as XML.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
  <Dataset name = "drerio_gene_ensembl" interface = "default" >
    <Attribute name = "ensembl_gene_id" />
    <Attribute name = "ensembl_transcript_id" />
    <Attribute name = "start_position" />
    <Attribute name = "end_position" />
    <Attribute name = "transcript_start" />
    <Attribute name = "transcript_end" />
    <Attribute name = "strand" />
    <Attribute name = "chromosome_name" />
    <Attribute name = "external_gene_name" />
    <Attribute name = "gene_biotype" />
    <Attribute name = "exon_chrom_start" />
    <Attribute name = "exon_chrom_end" />
    <Attribute name = "is_constitutive" />
    <Attribute name = "rank" />
  </Dataset>
</Query>

The downloaded file was then transformed as follows.

library(GenomicRanges)  # provides GRanges(), IRanges(), Rle() and seqlevels()

x <- read.delim("~/Downloads/mart_export.txt", stringsAsFactors = FALSE)

# Exons
e <- GRanges(paste0("chr", x$Chromosome.Name), IRanges(x$Exon.Chr.Start..bp., x$Exon.Chr.End..bp.), ifelse(x$Strand + 1, "+", "-"))
e$gene_name <- Rle(x$Associated.Gene.Name)
e$transcript_type <- Rle(x$Gene.type)
e$type <- "exon"
e$type <- Rle(e$type)
e <- sort(unique(e))

g <- GRanges( paste0("chr", x$Chromosome.Name)
            , IRanges(x$Gene.Start..bp., x$Gene.End..bp.)
            , ifelse( x$Strand + 1, "+", "-"))
            
g$gene_name <- Rle(x$Associated.Gene.Name)
g$transcript_type <- Rle(x$Gene.type)
g$type <- "gene"
g$type <- Rle(g$type)
g <- sort(unique(g))

t <- GRanges( paste0("chr", x$Chromosome.Name)
            , IRanges(x$Transcript.Start..bp., x$Transcript.End..bp.)
            , ifelse( x$Strand + 1, "+", "-"))
            
t$gene_name <- Rle(x$Associated.Gene.Name)
t$transcript_type <- Rle(x$Gene.type)
t$type <- "transcript"
t$type <- Rle(t$type)
t <- sort(unique(t))

gff <- sort(c(g, t, e))
gff <- gff[seqnames(gff) == "chr17"]
gff <- gff[start(gff) > 26000000 & end(gff) < 54000000]
seqlevels(gff) <- seqlevelsInUse(gff)

save(gff, file = "data/exampleZv9_annot.RData", compress = "xz")
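
Two steps in the script above are easy to misread: the strand conversion with `ifelse(x$Strand + 1, "+", "-")` and the interval filter on `start` and `end`. The sketch below illustrates both in base R on a small made-up table. The `toy` data frame and its values are hypothetical and stand in for the BioMart export; the sketch also assumes that the strand column contains only 1 and -1.

```r
# Toy table standing in for the BioMart export; values are made up for illustration.
toy <- data.frame(
  start  = c(25999990, 26000500, 53999900),
  end    = c(26000100, 26001000, 54000050),
  strand = c(1, -1, 1)
)

# Strand: adding 1 turns 1 and -1 into 2 and 0, which ifelse() coerces to
# TRUE and FALSE, so 1 maps to "+" and -1 maps to "-".
toy$strand_symbol <- ifelse(toy$strand + 1, "+", "-")

# The same mapping written with an explicit comparison.
stopifnot(identical(toy$strand_symbol, ifelse(toy$strand == 1, "+", "-")))

# Interval filter: both conditions must hold, so only features lying entirely
# inside 26000000-54000000 are kept; features straddling a boundary are dropped.
inside <- toy$start > 26000000 & toy$end < 54000000
kept <- toy[inside, ]
print(kept)
```

In this toy table only the second row is kept, because the first row starts before the interval and the third row ends after it.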

Author(s)

Prepared by Charles Plessy <plessy@riken.jp> using archived ENSEMBL data.

References

http://mar2015.archive.ensembl.org/biomart/

charles-plessy/CAGEr documentation built on Oct. 27, 2024, 10:11 p.m.
