callDMR: Function to detect differentially methylated regions (DMR)...


View source: R/DMR.R

Description

This function takes the results from the DML testing procedure ('DMLtest' function) and calls DMRs. Regions with statistically significant CpG sites are detected as DMRs. Nearby DMRs are merged into longer ones. Some restrictions, including the minimum length and the minimum number of CpG sites, are applied.

Usage

callDMR(DMLresult, delta=0, p.threshold=1e-5,
        minlen=50, minCG=3, dis.merge=100, pct.sig=0.5)

Arguments

DMLresult

A data frame representing the results of DML detection. This should be the result returned from the 'DMLtest' or 'DMLtest.multiFactor' function.

delta

A threshold for defining DMRs. In the DML detection procedure, a hypothesis test that the two group means are equal is conducted at each CpG site. If 'delta' is specified, the function computes the posterior probability that the difference of the means is greater than delta, and then constructs DMRs based on that. This only works when the test results are from 'DMLtest', which is for two-group comparison.

p.threshold

A p-value threshold for calling DMRs. Loci with p-values less than this threshold are selected and joined to form the DMRs. See 'Details' for more information.

minlen

Minimum length (in basepairs) required for DMR. Default is 50 bps.

minCG

Minimum number of CpG sites required for DMR. Default is 3.

dis.merge

When two DMRs are very close to each other and the distance between them (in bps) is less than this number, they will be merged into one. Default is 100 bps.

pct.sig

In every DMR, the proportion of CpG sites with significant p-values (less than p.threshold) must be greater than this threshold. Default is 0.5. This is mainly used to correct for the effects of merging nearby DMRs.
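
To make the interplay of 'p.threshold', 'dis.merge', 'minlen', 'minCG' and 'pct.sig' concrete, here is a minimal base-R sketch of the steps described above. It is a toy illustration under simplifying assumptions (a single chromosome, sorted positions, no large gaps inside a run of significant sites) and is not the DSS implementation; the function name call_regions_sketch and the input values are made up.

```r
# Toy illustration only: NOT the DSS implementation.
# Hypothetical helper: call regions from per-CpG positions and p-values
# on a single chromosome, with 'pos' sorted in increasing order.
call_regions_sketch <- function(pos, pval, p.threshold = 1e-5,
                                minlen = 50, minCG = 3,
                                dis.merge = 100, pct.sig = 0.5) {
  sig <- pval < p.threshold
  if (!any(sig)) return(data.frame())

  # Step 1: runs of consecutive significant CpG sites are candidate regions.
  r <- rle(sig)
  ends <- cumsum(r$lengths)
  starts <- ends - r$lengths + 1
  first <- starts[r$values]  # index of the first CpG in each run
  last <- ends[r$values]     # index of the last CpG in each run

  # Step 2: merge neighbouring candidates that are closer than dis.merge bp.
  if (length(first) > 1) {
    gap <- pos[first[-1]] - pos[last[-length(last)]]
    keep_break <- gap >= dis.merge
    first <- first[c(TRUE, keep_break)]
    last <- last[c(keep_break, TRUE)]
  }

  # Step 3: summarise each region and apply the remaining restrictions.
  out <- data.frame(start = pos[first], end = pos[last])
  out$length <- out$end - out$start + 1
  out$nCG <- last - first + 1
  out$frac.sig <- mapply(function(a, b) mean(sig[a:b]), first, last)
  out[out$length >= minlen & out$nCG >= minCG & out$frac.sig > pct.sig, ]
}

# Made-up positions and p-values, for demonstration only.
pos <- c(100, 120, 150, 170, 230, 250, 280, 1000, 1020)
pval <- c(1e-8, 1e-7, 1e-6, 0.5, 1e-9, 1e-6, 1e-7, 0.4, 1e-8)
res <- call_regions_sketch(pos, pval)
res
```

With these toy inputs the two runs at 100-150 and 230-280 are 80 bp apart, so they are merged into one region containing 7 CpG sites, 6 of which are significant. The isolated site at 1020 is dropped by 'minlen' and 'minCG'. Lowering 'dis.merge' to 50 would keep the two runs as separate regions.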

Details

The choices of 'delta' and 'p.threshold' are somewhat arbitrary. The default p-value threshold for calling DMRs is 1e-5. The statistical test at the locus level is less powerful when smoothing is NOT applied, so users can consider using a less stringent criterion, such as 0.001, in order to get a satisfactory number of DMRs. This function is reasonably fast since the computationally intensive part is in 'DMLtest'. Users can try different p.threshold values to obtain satisfactory results.

'delta' is only supported when the experiment is a two-group comparison. This is because in a multifactor design, the estimated coefficients in the regression are based on a GLM framework (loosely speaking), so they do not have a clear interpretation as methylation level differences. So when the input DMLresult is from DMLtest.multiFactor, 'delta' cannot be specified.

When a 'delta' value is specified, the posterior probability (pp) of each CpG site being a DML is computed. Then p.threshold is applied to 1-pp, i.e., sites with 1-pp<p.threshold are deemed significant. In this case, the criterion for DMR calling is more stringent, and users might consider using a more liberal p.threshold in order to get more regions.
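
The 1-pp rule can be illustrated with a small base-R sketch. It assumes a normal approximation for the true methylation difference at each site, which may differ from what DSS does internally; the function post_prob_sketch, the vectors diff and diff.se, and their values are hypothetical.

```r
# Illustrative only; assumes a normal approximation, which may differ from DSS.
# 'diff' and 'diff.se' are hypothetical per-CpG difference estimates and
# their standard errors.
post_prob_sketch <- function(diff, diff.se, delta) {
  # P(true difference > delta) + P(true difference < -delta)
  pnorm(diff - delta, sd = diff.se) + pnorm(-diff - delta, sd = diff.se)
}

# Made-up estimates for four CpG sites.
diff <- c(0.40, 0.12, -0.35, 0.02)
diff.se <- c(0.05, 0.05, 0.04, 0.05)

pp <- post_prob_sketch(diff, diff.se, delta = 0.1)

# The threshold is applied to 1 - pp rather than to the p-value.
p.threshold <- 1e-5
sig <- (1 - pp) < p.threshold
sig
```

In this toy example only the first and third sites are called significant. The second site has an estimated difference above delta, but its uncertainty keeps 1-pp well above the threshold, which shows why a more liberal p.threshold may be needed when 'delta' is used.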

Value

A data frame for DMRs. Each row is for a DMR. Rows are sorted by "areaStat", which is the sum of test statistics of all CpG sites in the region. The columns are:

chr

Chromosome number.

start, end

Genomic coordinates.

length

Length of the DMR, in bps.

nCG

Number of CpG sites contained in the DMR.

meanMethy1, meanMethy2

Average methylation levels in two conditions.

diff.Methy

The difference in the methylation levels between two conditions.

areaStat

The sum of the test statistics of all CpG sites within the DMR.
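
Because the result is an ordinary data frame, it can be filtered and reordered with base R. The sketch below uses a made-up data frame with the columns listed above (it is not real callDMR output), and it assumes diff.Methy is meanMethy1 minus meanMethy2; the cut-offs are arbitrary.

```r
# Made-up values in the documented column layout; not real callDMR output.
# Assumption: diff.Methy = meanMethy1 - meanMethy2.
dmrs_toy <- data.frame(
  chr = c("chr21", "chr21", "chr22"),
  start = c(1000, 5000, 2000),
  end = c(1200, 5080, 2500),
  length = c(201, 81, 501),
  nCG = c(8, 4, 20),
  meanMethy1 = c(0.80, 0.35, 0.20),
  meanMethy2 = c(0.40, 0.30, 0.75),
  diff.Methy = c(0.40, 0.05, -0.55),
  areaStat = c(45.2, 12.1, -130.7)
)

# Keep regions with a large methylation difference and enough CpG sites
# (both cut-offs are arbitrary choices for this example).
strong <- dmrs_toy[abs(dmrs_toy$diff.Methy) >= 0.2 & dmrs_toy$nCG >= 5, ]

# Rank the remaining regions by the magnitude of areaStat.
strong <- strong[order(abs(strong$areaStat), decreasing = TRUE), ]
strong
```

The same subsetting should work on a real result from callDMR, provided it carries the columns described above.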

Author(s)

Hao Wu <hao.wu@emory.edu>

See Also

DMLtest, callDML

Examples

## Not run: 
require(bsseq)
require(bsseqData)
data(BS.cancer.ex)

## take a small portion of data and test
BSobj <- BS.cancer.ex[140000:150000,]
dmlTest <- DMLtest(BSobj, group1=c("C1", "C2", "C3"), group2=c("N1","N2","N3"),
   smoothing=TRUE, smoothing.span=500)

## call DMR based on test results
dmrs <- callDMR(dmlTest)
head(dmrs)

## or one can specify a threshold for difference in methylation level
dmrs2 <- callDMR(dmlTest, delta=0.1)
head(dmrs2)

## visualize one DMR
showOneDMR(dmrs[1,], BSobj)

## from multifactor design - using a loose threshold to demonstrate

data(RRBS)
DMLfit = DMLfit.multiFactor(RRBS, design, ~case+cell+case:cell)
DMLtest.cell = DMLtest.multiFactor(DMLfit, coef="cellrN")
dmr = callDMR(DMLtest.cell, p.threshold=0.05) 
dmr

## End(Not run)

DSS documentation built on Nov. 8, 2020, 7:44 p.m.