NxtIRFdata-package | R Documentation |
This package contains files that provide a workable example for the
SpliceWiz package.
chrZ_genome()
chrZ_gtf()
example_bams(path = tempdir(), overwrite = FALSE, offline = FALSE)
get_mappability_exclusion(
    genome_type = c("hg38", "hg19", "mm10", "mm9"),
    as_type = c("GRanges", "bed", "bed.gz"),
    path = tempdir(),
    overwrite = FALSE,
    offline = FALSE
)
path |
(Default = tempdir()) The desired destination path in which to place a copy of the files. The directory does not need to exist but its parent directory does. |
overwrite |
(Default = |
offline |
(Default = |
genome_type |
Either one of |
as_type |
(Default "GRanges") Whether to return the Mappability
exclusion data as a GRanges object |
(Update) Please note that NxtIRFcore is replaced by the SpliceWiz package which will be available from Bioconductor 3.16 onwards!
A synthetic reference, comprising genome sequence (FASTA) and gene annotation
(GTF) files, is provided, based on the genes SRSF1, SRSF2, SRSF3, TRA2A, TRA2B,
TP53 and NSUN5. These genes, with an additional 100 flanking nucleotides,
were used to construct an artificial "chromosome Z" (chrZ).
Gene annotations, based on release-94 of Ensembl GRCh38 (hg38), were modified
so that their genome coordinates correspond to this artificial chromosome.
Accompanying this, an example dataset was created based on 6 samples from the
Leucegene dataset (GSE67039). Raw sequencing reads were downloaded from
GSE67039 and aligned to GRCh38 (Ensembl release-94) using STAR v2.7.3a. Then,
alignments belonging to the 7 genes of the chrZ genome were selected, and the
nucleotide sequences of these alignments were realigned to the chrZ reference
using STAR.
Additionally, NxtIRFdata contains Mappability exclusion regions generated
using NxtIRF/SpliceWiz. These are suitable for generating references based on
the hg38, hg19, mm10 and mm9 genomes. The regions were derived empirically:
synthetic 70-nt reads, with start positions 10 nt apart, were systematically
generated from the genome and aligned back to the same genome using the STAR
aligner. Read coverage of the resulting BAM file was then assessed.
Whereas mappable regions are expected to be covered by 7 reads (70-nt reads
tiled every 10 nt), low mappability regions are defined as regions covered by
4 or fewer reads.
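The coverage arithmetic above can be sketched with a toy example in base R. This is only an illustration of the tiling and thresholding idea, not the NxtIRF/SpliceWiz implementation: the 500-nt "genome", the block of reads assumed to fail alignment, and all variable names are hypothetical.

```r
# Toy illustration in base R; NOT the NxtIRF/SpliceWiz implementation.
genome_len <- 500L   # hypothetical toy genome length
read_len   <- 70L    # synthetic read length, as described above
step       <- 10L    # distance between consecutive read starts

# Start positions of all synthetic reads that fit inside the toy genome
starts <- seq.int(1L, genome_len - read_len + 1L, by = step)

# Assumption for illustration: reads starting at 201..260 fail to align
mapped <- !(starts >= 201L & starts <= 260L)

# Per-base coverage contributed by the reads that "aligned"
coverage <- integer(genome_len)
for (s in starts[mapped]) {
    idx <- s:(s + read_len - 1L)
    coverage[idx] <- coverage[idx] + 1L
}

# Low mappability: positions covered by 4 or fewer reads
low <- coverage <= 4L

# Collapse consecutive low positions into start/end regions
runs   <- rle(low)
ends   <- cumsum(runs$lengths)
begins <- ends - runs$lengths + 1L
low_regions <- data.frame(start = begins[runs$values],
                          end   = ends[runs$values])
```

In this toy setup the two ends of the sequence are also flagged, simply because fewer tiled reads can overlap the first and last few positions; how real chromosome ends are treated is not described here.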
For chrZ_genome and chrZ_gtf: returns the path to the example genome
FASTA and gene annotation GTF files.
For example_bams: returns a vector specifying the location of the 6
example BAM files, copied to the given path directory. Returns NULL if
a connection to ExperimentHub could not be established, or if some BAM
files could not be downloaded.
For get_mappability_exclusion: returns the mappability exclusion regions
resource, with type as specified by the parameter as_type. Returns NULL
if a connection to ExperimentHub could not be established, or if the
resource could not be downloaded.
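Because example_bams and get_mappability_exclusion return NULL rather than raising an error when the resource cannot be retrieved, calling code may want to check the result before using it. A minimal sketch in base R follows; require_resource is a hypothetical helper and not part of NxtIRFdata.

```r
# Hypothetical helper (not part of NxtIRFdata): stop with a clear message
# when a fetch function returned NULL, otherwise pass the value through.
require_resource <- function(x, what = "resource") {
    if (is.null(x)) {
        stop("Could not retrieve ", what,
             "; check the connection to ExperimentHub.",
             call. = FALSE)
    }
    x
}

# Intended usage (needs NxtIRFdata installed, so left commented out here):
# bam_paths <- require_resource(example_bams(path = tempdir()),
#                               "example BAM files")
```

Failing early in this way avoids passing NULL on to downstream steps, where the resulting error would be harder to trace back to a failed download.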
chrZ_genome: Returns the location of the genome.fa file of
the chrZ reference
chrZ_gtf: Returns the location of the transcripts.gtf
file of the chrZ reference
example_bams: Fetches data from ExperimentHub and places
them in the given path; returns the locations of the 6 example BAM files
get_mappability_exclusion: Fetches data from ExperimentHub and
places a copy in the given path; depending on as_type, returns the
Mappability exclusion regions as a GRanges object or the location of the
Mappability exclusion BED file
Generation of the mappability files was performed with NxtIRF/SpliceWiz, using a method analogous to that described in:
Middleton R, Gao D, Thomas A, Singh B, Au A, Wong JJ, Bomane A, Cosson B, Eyras E, Rasko JE, Ritchie W. IRFinder: assessing the impact of intron retention on mammalian gene expression. Genome Biol. 2017 Mar 15;18(1):51. https://doi.org/10.1186/s13059-017-1184-4
# returns the location of the genome.fa file of the chrZ reference
genome_path <- chrZ_genome()
# returns the location of the transcripts.gtf file of the chrZ reference
gtf_path <- chrZ_gtf()
# Fetches data from ExperimentHub and places them in the given path
# returns the locations of the 6 example bam files
bam_paths <- example_bams(path = tempdir())
# Fetches data from ExperimentHub and places them in the given path
# returns the Mappability exclusion for hg38 directly as GRanges object
hg38.MapExcl.gr <- get_mappability_exclusion(
    genome_type = "hg38",
    as_type = "GRanges"
)
# returns the location of the Mappability exclusion gzipped BED for hg38
gzippedBEDpath <- get_mappability_exclusion(
    genome_type = "hg38",
    as_type = "bed.gz",
    path = tempdir()
)
# Getting NxtIRFdata directly from ExperimentHub
require(ExperimentHub)
eh <- ExperimentHub()
NxtIRF_hub <- query(eh, "NxtIRF")
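When the exclusion regions are requested as a gzipped BED file, the returned path can be read with base R. The sketch below assumes a standard BED layout (chromosome, 0-based start, exclusive end in the first three columns), which has not been confirmed against the actual resource. It writes a small stand-in file with made-up coordinates so that it runs without a download, and read_bed3 is a hypothetical helper.

```r
# Write a small stand-in gzipped BED file (toy coordinates, not real data)
toy_bed <- file.path(tempdir(), "toy_exclusion.bed.gz")
con <- gzfile(toy_bed, "w")
writeLines(c("chr1\t100\t250", "chr1\t900\t1000", "chr2\t0\t70"), con)
close(con)

# Hypothetical helper: read the first three BED columns into a data frame.
# Assumes a tab-delimited BED file with no header line.
read_bed3 <- function(path) {
    bed <- read.table(gzfile(path), sep = "\t", header = FALSE,
                      stringsAsFactors = FALSE)[, 1:3]
    colnames(bed) <- c("chrom", "start0", "end")
    # BED starts are 0-based and ends are exclusive, so width = end - start
    bed$width <- bed$end - bed$start0
    bed
}

excl <- read_bed3(toy_bed)
total_excluded_nt <- sum(excl$width)

# With the real resource, the path returned above would be used instead:
# excl <- read_bed3(gzippedBEDpath)
```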