Description Usage Format Details Author(s) References
Annotation data for zebrafish's chromosome 17's interval 26000000-54000000 (Zv9/danRer7 genome), to be used in documentation examples.
1 |
An object of class GRanges
of length 7467.
Data was retreived from ENSEMBL's Biomart server using a query to extract gene, transcripts and exon coordinates. For the record, here it is as URL (long, possibly overflowing).
http://mar2015.archive.ensembl.org/biomart/martview/78d86c1d6b4ef51568ba6d46f7d8b254?VIRTUALSCHEMANAME=default&ATTRIBUTES=drerio_gene_ensembl.default.structure.ensembl_gene_id|drerio_gene_ensembl.default.structure.ensembl_transcript_id|drerio_gene_ensembl.default.structure.start_position|drerio_gene_ensembl.default.structure.end_position|drerio_gene_ensembl.default.structure.transcript_start|drerio_gene_ensembl.default.structure.transcript_end|drerio_gene_ensembl.default.structure.strand|drerio_gene_ensembl.default.structure.chromosome_name|drerio_gene_ensembl.default.structure.external_gene_name|drerio_gene_ensembl.default.structure.gene_biotype|drerio_gene_ensembl.default.structure.exon_chrom_start|drerio_gene_ensembl.default.structure.exon_chrom_end|drerio_gene_ensembl.default.structure.is_constitutive|drerio_gene_ensembl.default.structure.rank&FILTERS=&VISIBLEPANEL=resultspanel
And here it is as XML.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
<Dataset name = "drerio_gene_ensembl" interface = "default" >
<Attribute name = "ensembl_gene_id" />
<Attribute name = "ensembl_transcript_id" />
<Attribute name = "start_position" />
<Attribute name = "end_position" />
<Attribute name = "transcript_start" />
<Attribute name = "transcript_end" />
<Attribute name = "strand" />
<Attribute name = "chromosome_name" />
<Attribute name = "external_gene_name" />
<Attribute name = "gene_biotype" />
<Attribute name = "exon_chrom_start" />
<Attribute name = "exon_chrom_end" />
<Attribute name = "is_constitutive" />
<Attribute name = "rank" />
</Dataset>
</Query>
|
The downloaded file was then transformed as follows.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | x <- read.delim("~/Downloads/mart_export.txt", stringsAsFactors = FALSE)
e <- GRanges(paste0("chr", x$Chromosome.Name), IRanges(x$Exon.Chr.Start..bp., x$Exon.Chr.End..bp.), ifelse(x$Strand + 1, "+", "-"))
e$gene_name <- Rle(x$Associated.Gene.Name)
e$transcript_type <- Rle(x$Gene.type)
e$type <- "exon"
e$type <- Rle(e$type)
e <- GRanges(paste0("chr", x$Chromosome.Name), IRanges(x$Exon.Chr.Start..bp., x$Exon.Chr.End..bp.), ifelse(x$Strand + 1, "+", "-"))
e$gene_name <- Rle(x$Associated.Gene.Name)
e$transcript_type <- Rle(x$Gene.type)
e$type <- "exon"
e$type <- Rle(e$type)
e <- sort(unique(e))
g <- GRanges( paste0("chr", x$Chromosome.Name)
, IRanges(x$Gene.Start..bp., x$Gene.End..bp.)
, ifelse( x$Strand + 1, "+", "-"))
g$gene_name <- Rle(x$Associated.Gene.Name)
g$transcript_type <- Rle(x$Gene.type)
g$type <- "gene"
g$type <- Rle(g$type)
g <- sort(unique(g))
t <- GRanges( paste0("chr", x$Chromosome.Name)
, IRanges(x$Transcript.Start..bp., x$Transcript.End..bp.)
, ifelse( x$Strand + 1, "+", "-"))
t$gene_name <- Rle(x$Associated.Gene.Name)
t$transcript_type <- Rle(x$Gene.type)
t$type <- "transcript"
t$type <- Rle(t$type)
t <- sort(unique(t))
gff <- sort(c(g, t, e))
gff <- gff[seqnames(gff) == "chr17"]
gff <- gff[start(gff) > 26000000 & end(gff) < 54000000]
seqlevels(gff) <- seqlevelsInUse(gff)
save(gff, "data/exampleZv9_annot.RData", compress = "xz")
|
Prepared by Charles Plessy plessy@riken.jp using archive ENSEMBL data.
http://mar2015.archive.ensembl.org/biomart/
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.