Quantify methylation of cytosines from bisulfite sequencing data.
1 2 3 4 5
report results combined for C's
, the default) or individually for single
reportLevel=“alignment”). The latter imposes
further restrictions on some arguments (see ‘Details’).
cytosine quantification mode, one of:
source of bam files; can be either “genome”
(then the alignments against the genome are used) or the name of an
auxiliary target sequence (then alignments against this target
sequence will be used). The auxiliary name must correspond to the
name contained in the auxiliary file refered by the
minimal mapping quality of alignments to be included when
counting (mapping quality must be greater than or equal to
maximal mapping quality of alignments to be included when
counting (mapping quality must be less than or equal to
a cluster object to be used for parallel processing of multiple files (see ‘Details’).
qMeth can be used on a
qProject object from a bisulfite
sequencing experiment (sequencing of bisulfite-converted DNA), such as
the one returned by
qAlign when its parameter
set to a different value than “no”.
qMeth quantifies DNA methylation by counting total and
methylated events for individual cytosines, using the alignments that
have been generated in converted (three-letter) sequence space for
qAlign. A methylated event corresponds to a C/C
match in the alignment, an unmethylated event to a T/C mismatch (or G/G
matches and A/G mismatches on the opposite strand). For paired-end
samples, the part of the left fragment alignment that overlaps
with the right fragment alignment is ignored, preventing the
use of redundant information coming from the same molecule.
Both directed (
bisulfite=“dir”) and undirected
bisulfite=“undir”) experimental protocols are supported
By default, results are returned per C nucleotide. If
reportLevel=“alignment”, results are reported separately
for individual alignments. In that case,
query has to be a
GRanges object with exactly one region,
mode has to be
either “CpG” or “allC”, the arguments
keepZero have no effect and allele-specific projects are
treated in the same way as normal (non-allele specific) projects.
Using the parameter
mode, quantification can be limited to
cytosines in CpG context, and counts obtained for the two cytosines on
opposite strands within a single CpG can be combined (summed).
The quantification of methylation for all cytosines in the query region(s)
mode=“allC”) should be done with care, especially for
large query regions, as the return value may require a large amount of
mode is set to “var”,
qMeth only counts reads
from the strand opposite of the cytosine and reports total and
matching alignments. For a position identical to the reference
sequence, only matches (and very few sequencing errors) are
expected, independent on the methylation state of the cytosine. A
reduced fraction of alignments matching the reference are indicative
of sequence variations in the sequenced sample.
mapqMax allow to select alignments
based on their mapping qualities.
take integer values between 0 and 255 and equal to
log10 Pr(mapping position is wrong), rounded to the nearest
integer. A value 255 indicates that the mapping quality is not available.
If an object that inherits from class
cluster is provided to
clObj argument, for example an object returned by
makeCluster from package parallel,
the quantification task is split into multiple chunks and processed in
clusterApplyLB from package
parallel. Not all tasks will be efficiently parallelized: For
example, a single query region and a single (group of) bam files will
not be split into multiple chunks.
GRanges object if
TRUE, otherwise a
Each row contains the coordinates of individual cytosines for
FALSE or query regions
In addition to the coordinates columns (or
strand slots for
each row contains per bam file:
Two values (total and methylated events, with suffixes _T and _M), or
qProject object was created including a SNP table,
six values (total and methylated events for Reference, Unknown and
Alternative genotypes, with suffixed _TR, _TU, _TA, _MR, _MU and _MA).
In the latter case, C's or CpG's that overlap with SNPs in the table
TRUE, groups of bam files with
identical sample name are combined (summed) and will be represented by
a single set of total and methylated count columns.
mode=“var”, the _T and _M columns correspond to total
and matching alignments overlapping the guanine paired to the
list with one
element per bam file or sample (depending on
collapseBySample). Each list element is another list with the
aid: character vector with unique alignment identifiers
Cid: integer vector with genomic coordinate of C base
strand: character vector with the strand of the C base
meth: integer vector with methylation state for
alignment and C defined by
Cid. The values are
1 for methylated or 0 for unmethylated states.
Anita Lerch, Dimos Gaidatzis and Michael Stadler
1 2 3 4 5 6 7 8 9 10 11 12
# copy example data to current working directory file.copy(system.file(package="QuasR", "extdata"), ".", recursive=TRUE) # create alignments sampleFile <- "extdata/samples_bis_single.txt" genomeFile <- "extdata/hg19sub.fa" proj <- qAlign(sampleFile, genomeFile, bisulfite="dir") proj # calculate methylation states meth <- qMeth(proj, mode="CpGcomb") meth
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.