extractPatterns | R Documentation |
This function extracts methylation patterns (epialleles) for a given genomic region of interest.
extractPatterns(
bam,
bed,
bed.row = 1,
zero.based.bed = FALSE,
match.min.overlap = 1,
extract.context = c("CG", "CHG", "CHH", "CxG", "CX"),
min.context.freq = 0.01,
clip.patterns = FALSE,
strand.offset = c(CG = 1, CHG = 2, CHH = 0, CxG = 0, CX = 0)[extract.context],
highlight.positions = c(),
...,
verbose = TRUE
)
bam |
BAM file location string OR preprocessed output of
|
bed |
Browser Extensible Data (BED) file location string OR object of
class |
bed.row |
single non-negative integer specifying what 'bed' region should be included in the output (default: 1). |
zero.based.bed |
boolean defining if BED coordinates are zero based (default: FALSE). |
match.min.overlap |
integer for the smallest overlap between read's and
BED/ |
extract.context |
string defining cytosine methylation context used to report:
|
min.context.freq |
real number in the range [0;1] (default: 0.01). Genomic positions that are covered by smaller fraction of patterns (e.g., with erroneous context) won't be included in the report. |
clip.patterns |
boolean if patterns should not extend over the edge of 'bed' region (default: FALSE). |
strand.offset |
single non-negative integer specifying the offset of bases at the reverse (-) strand compared to the forward (+) strand. Allows to "merge" genomic positions when methylation is symmetric (in CG and CHG contexts). By default, equals 1 for 'extract.context'=="CG", 2 for "CHG", or 0 otherwise. |
highlight.positions |
integer vector with genomic positions of bases to include in every overlapping pattern. Allows to visualize the distribution of single-nucleotide variations (SNVs) among methylation patterns. 'highlight.positions' takes precedence if any of these positions overlap with within-the-context positions of methylation pattern. |
... |
other parameters to pass to the
|
verbose |
boolean to report progress and timings (default: TRUE). |
The function matches reads (for paired-end sequencing alignment files - read
pairs as a single entity) to the genomic
region provided in a BED file/GRanges
object, extracts
methylation statuses of bases within those reads, and returns a data frame
which can be used for further analysis and/or plotting of DNA methylation
patterns by plotPatterns
function.
data.table
object containing
per-read (pair) base methylation information for the genomic region of
interest. The report columns are:
seqnames – read (pair) reference sequence name
strand – read (pair) strand
start – start of the read (pair)
end – end of the read (pair)
nbase – number of within-the-context bases for this read (pair)
beta – beta value of this read (pair), calculated as a ratio of the number of methylated within-the-context bases to the total number of within-the-context bases
pattern – hex representation of 64-bit FNV-1a hash calculated for all reported base positions and bases in this read (pair). This hash value depends only on included genomic positions and their methylation call string chars (hHxXzZ) or nucleotides (ACGT, for highlighted bases only), thus it is expected to be unique for every methylation pattern, although equal for identical methylation patterns independently on read (pair) start, end, or strand (when correct 'strand.offset' is given)
... – columns for each genomic position that hold corresponding methylation call string char, or NA if position is not present in the read (pair)
plotPatterns
for pretty plotting of the output,
preprocessBam
for preloading BAM data,
generateCytosineReport
for methylation statistics at the level
of individual cytosines, generateBedReport
for genomic
region-based statistics, generateVcfReport
for evaluating
epiallele-SNV associations, generateBedEcdf
for analysing the
distribution of per-read beta values, and 'epialleleR' vignettes for the
description of usage and sample data.
# amplicon data
amplicon.bam <- system.file("extdata", "amplicon010meth.bam",
package="epialleleR")
amplicon.bed <- system.file("extdata", "amplicon.bed",
package="epialleleR")
# extract patterns
patterns <- extractPatterns(bam=amplicon.bam, bed=amplicon.bed, bed.row=3)
# and then plot them
plotPatterns(patterns)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.