Identifying Differentially Methylated Regions (DMRs)

Description

Identifying differentially methylated regions (DMRs) for pairwise or multiple-sample comparisons.

Usage

## S4 method for signature 'methylPipe,BSdataSet'
findDMR(object, Nproc=NULL, ROI=NULL,
pmdGRanges=NULL, MCClass='mCG', dmrSize=10, dmrBp=1000, binsize=0,
eprop=0.3, coverage=1, Pvalue=NULL, SNPs=NULL)

Arguments

object

An object of class BSdataSet

Nproc

numeric; the number of processors to use; one chromosome is run on each processor

ROI

GRanges; either NULL or a GRanges object of genomic regions of interest within which DMRs are identified

pmdGRanges

a GRanges object containing the genomic coordinates of Partially Methylated Domains that will be masked

MCClass

character; the mC sequence context to be considered, one of 'all', 'mCG', 'mCHG' or 'mCHH'

dmrSize

numeric; the number of consecutive mC to be considered simultaneously; at least 5

dmrBp

numeric; the maximum number of bp that may contain the dmrSize mC

binsize

numeric; the size of the bin used for smoothing the methylation levels, useful for nonCG methylation in human

eprop

numeric; the difference between the maximum and minimum methylation level across samples is computed for each mC (or for each bin), and only mC (or bins) with a difference greater than eprop are considered

coverage

numeric; the minimum number of total reads at a given cytosine genomic position

Pvalue

numeric; if provided, only mC with a significant p-value are selected

SNPs

GRanges; if SNP information is provided, the cytosines overlapping those positions are removed from the DMR computation
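
To make the meaning of eprop, dmrSize and dmrBp more concrete, here is a minimal base-R sketch on invented data. It is not the methylPipe implementation: the toy positions and levels, the variable names, and the way the window span is measured (last position minus first position) are all assumptions made for illustration, and the two parameters are shown independently of each other.

```r
# Toy data (invented): genomic positions of 6 cytosines and their
# methylation levels (fraction of methylated reads) in two samples.
pos  <- c(100, 180, 260, 900, 2500, 2600)
meth <- cbind(A = c(0.90, 0.85, 0.10, 0.50, 0.95, 0.20),
              B = c(0.20, 0.80, 0.70, 0.45, 0.30, 0.90))

# eprop: keep only cytosines whose (max - min) methylation level
# across samples is greater than the threshold.
eprop    <- 0.3
spread   <- apply(meth, 1, max) - apply(meth, 1, min)
kept_pos <- pos[spread > eprop]

# dmrSize / dmrBp: a window of dmrSize consecutive cytosines qualifies
# only if it fits within dmrBp base pairs (assumed span: last - first).
# dmrSize is set to 3 only because the toy data is tiny; findDMR itself
# requires at least 5.
dmrSize <- 3
dmrBp   <- 1000
n_win   <- length(pos) - dmrSize + 1
starts  <- pos[seq_len(n_win)]
ends    <- pos[seq_len(n_win) + dmrSize - 1]
window_ok <- (ends - starts) <= dmrBp

print(kept_pos)
print(data.frame(start = starts, end = ends, ok = window_ok))
```

In this toy case the cytosines at positions 180 and 900 are dropped by the eprop filter, and only the first two windows fit within dmrBp.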

Details

Typically, for nonCG methylation in human, a dmrSize of 50, a dmrBp of 50000 and a binsize of 1000 are used. For CpG methylation in human, and for both CpG and nonCG methylation in plants, the default settings are usually fine.

Partially Methylated Domains (PMDs) are usually found in differentiated cells and can constitute up to one third of the genome (Lister R et al, Nature 2009). DMRs are usually not selected within those regions, especially when comparing differentiated and pluripotent cells.

eprop is used to speed up the analysis by discarding mC (or bins) with little variation across samples before testing.

Depending on the number of samples, different tests are used to compare the methylation levels (percentage of methylated reads for each mC). With two samples, the non-parametric Wilcoxon test is used. With more than two samples, the non-parametric Kruskal-Wallis test is used.

See consolidateDMRs to further process and finalize the list of DMRs.
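
The choice of test described above can be sketched with the standard functions from the stats package. The methylation levels below are invented, and the unpaired form of the Wilcoxon test is an assumption; this page does not say how findDMR calls the tests internally.

```r
# Invented methylation levels for 10 cytosines in one candidate region,
# measured in three samples.
s1 <- c(0.91, 0.88, 0.95, 0.90, 0.86, 0.93, 0.89, 0.97, 0.92, 0.85)
s2 <- c(0.21, 0.35, 0.18, 0.40, 0.27, 0.30, 0.15, 0.33, 0.25, 0.38)
s3 <- c(0.55, 0.61, 0.48, 0.52, 0.66, 0.58, 0.50, 0.63, 0.57, 0.45)

# Two samples: non-parametric Wilcoxon rank-sum test (unpaired, assumed).
p_two <- wilcox.test(s1, s2)$p.value

# More than two samples: non-parametric Kruskal-Wallis test.
p_multi <- kruskal.test(list(s1, s2, s3))$p.value

print(c(two_samples = p_two, three_samples = p_multi))
```

Both p-values are very small here because the toy samples do not overlap at all; real regions will usually be far less clear-cut.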

Value

A GRanges object of DMRs with the metadata columns pValue, MethDiff_Perc and log2Enrichment. When two samples are compared, MethDiff_Perc is the difference in percentage methylation between the two conditions, whereas log2Enrichment is the log2 ratio between the sample means.
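
As a rough illustration of the two effect-size columns, the following base-R sketch computes a percentage difference and a log2 ratio from invented per-cytosine methylation levels. The direction of the comparison (first sample relative to second) and the use of simple means over the region are assumptions, not a statement of how methylPipe derives these values.

```r
# Invented methylation levels (fraction of methylated reads) for the
# cytosines of one region in two samples.
meth_a <- c(0.80, 0.90, 0.70, 0.80)
meth_b <- c(0.20, 0.10, 0.30, 0.20)

mean_a <- mean(meth_a)
mean_b <- mean(meth_b)

# Difference in percentage methylation (assumed direction: a minus b).
meth_diff_perc <- 100 * (mean_a - mean_b)

# log2 ratio between the sample means (assumed direction: a over b).
log2_enrichment <- log2(mean_a / mean_b)

print(c(MethDiff_Perc = meth_diff_perc, log2Enrichment = log2_enrichment))
```

With these toy values the difference is about 60 percentage points and the log2 ratio is about 2, that is, a fourfold difference in mean methylation.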

Author(s)

Mattia Pelizzola, Kamal Kishore

See Also

consolidateDMRs

Examples

library(methylPipe)
library(BSgenome.Hsapiens.UCSC.hg18)

# Regions without sequencing coverage on chr20
uncov_GR <- GRanges(Rle('chr20'), IRanges(c(14350,69251,84185), c(18349,73250,88184)))

# Build a BSdata object for each sample from the example files shipped with methylPipe
H1data <- system.file('extdata', 'H1_chr20_CG_10k_tabix_out.txt.gz', package='methylPipe')
H1.db <- BSdata(file=H1data, uncov=uncov_GR, org=Hsapiens)
IMR90data <- system.file('extdata', 'IMR90_chr20_CG_10k_tabix_out.txt.gz', package='methylPipe')
IMR90.db <- BSdata(file=IMR90data, uncov=uncov_GR, org=Hsapiens)

# Combine the two samples into a BSdataSet
H1.IMR90.set <- BSdataSet(org=Hsapiens, group=c("C","E"), IMR90=IMR90.db, H1=H1.db)

# Load the regions of interest (provides the GR_chr20 object)
gr_file <- system.file('extdata', 'GR_chr20.Rdata', package='methylPipe')
load(gr_file)

# Identify DMRs in the CpG context within the regions of interest
DMRs <- findDMR(object=H1.IMR90.set, Nproc=1, ROI=GR_chr20, MCClass='mCG',
                dmrSize=10, dmrBp=1000, eprop=0.3)
head(DMRs)