Mappability-methods | R Documentation |
These functions empirically calculate low-mappability (Mappability Exclusion) regions using the given genome FASTA file. A splice-aware alignment software capable of aligning reads to the genome is required. See details and examples below.
Mappability_GenReads( reference_path, read_len = 70, read_stride = 10, error_pos = 35, verbose = TRUE, alt_fasta_file ) Mappability_CalculateExclusions( reference_path, aligned_bam = file.path(reference_path, "Mappability", "Aligned.out.bam"), threshold = 4, n_threads = 1 )
reference_path |
The directory of the reference prepared by
|
read_len |
The nucleotide length of the synthetic reads |
read_stride |
The nucleotide distance between adjacent synthetic reads |
error_pos |
The position of the procedurally-generated nucleotide error from the start of each synthetic reads |
verbose |
Whether additional status messages are shown |
alt_fasta_file |
(Optional) The path to the user-supplied genome fasta
file, if different to that found inside the |
aligned_bam |
The BAM file of alignment of the synthetic reads generated
by |
threshold |
Genomic regions with this alignment read depth (or below) in the aligned synthetic read BAM are defined as low mappability regions. |
n_threads |
The number of threads used to calculate mappability exclusion regions from aligned bam file of synthetic reads. |
Creating a Mappability Exclusion BED file is a three-step process.
First, using Mappability_GenReads()
,
synthetic reads are systematically generated using the given genome contained
within reference_path
.
Second, an aligner such as STAR (preferably the same aligner used for the subsequent RNA-seq experiment) is required to align these reads to the source genome. Poorly mapped regions of the genome will be reflected by regions of low coverage depth.
Finally, the BAM file containing the aligned reads is analysed
using Mappability_CalculateExclusions()
, to identify
low-mappability regions to compile the Mappability Exclusion BED file.
It is recommended to leave all parameters to their default settings. Regular
users should only specify reference_path
, aligned_bam
and n_threads
,
as required.
NB: STAR_Mappability runs all 3 steps required, using the STAR
aligner.
This only works in systems where STAR
is installed.
NB2: In systems where STAR
is not available, consider using HISAT2 or
Rsubread. A working example using Rsubread is shown below.
For Mappability_GenReads
: writes Reads.fa
to the Mappability
subdirectory inside the given reference_path
.
For Mappability_CalculateExclusions
: writes a gzipped BED file
named MappabilityExclusion.bed.gz
to the Mappability
subdirectory
inside reference_path
.
This BED file is automatically used by BuildReference()
if
MappabilityRef
is not specified.
Mappability_GenReads
: Generates synthetic reads from a
genome FASTA file, for mappability calculations.
Mappability_CalculateExclusions
: Generate a BED file defining
low mappability regions, using reads generated by
Mappability_GenReads()
, aligned to the genome.
BuildReference
# (1a) Creates genome resource files ref_path <- file.path(tempdir(), "Reference") GetReferenceResource( reference_path = ref_path, fasta = chrZ_genome(), gtf = chrZ_gtf() ) # (1b) Systematically generate reads based on the NxtIRF example genome: Mappability_GenReads( reference_path = ref_path ) ## Not run: # (2) Align the generated reads using Rsubread: # (2a) Build the Rsubread genome index: setwd(ref_path) Rsubread::buildindex(basename = "./reference_index", reference = chrZ_genome()) # (2b) Align the synthetic reads using Rsubread::subjunc() Rsubread::subjunc( index = "./reference_index", readfile1 = file.path(ref_path, "Mappability", "Reads.fa"), output_file = file.path(ref_path, "Mappability", "AlignedReads.bam"), useAnnotation = TRUE, annot.ext = chrZ_gtf(), isGTF = TRUE ) # (3) Analyse the aligned reads in the BAM file for low-mappability regions: Mappability_CalculateExclusions( reference_path = ref_path, aligned_bam = file.path(ref_path, "Mappability", "AlignedReads.bam") ) # (4) Build the NxtIRF reference using the calculated Mappability Exclusions BuildReference(ref_path) # NB the default is to search for the BED file generated by # `Mappability_CalculateExclusions()` in the given reference_path ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.