dmrcate: DMR identification


Description

The main function of this package. Computes a kernel estimate against a null comparison to identify significantly differentially (or variably) methylated regions in hg19.

Usage

dmrcate(object, 
        lambda = 1000,
        C = 2,
        p.adjust.method = "BH", 
        pcutoff = "limma", 
        consec = FALSE, 
        conseclambda = 10, 
        betacutoff = NULL
        ) 

Arguments

object

An object of class "annot", created by cpg.annotate.

lambda

Gaussian kernel bandwidth for smoothed-function estimation. Also informs DMR bookend definition: significant probes separated by a gap >= lambda are placed in separate DMRs. Support is truncated at 5*lambda. Default is 1000 nucleotides. See Details for further information.

C

Scaling factor for bandwidth. Gaussian kernel is calculated where lambda/C = sigma. Empirical testing shows that when lambda=1000, near-optimal prediction of sequencing-derived DMRs is obtained when C is approximately 2, i.e. 1 standard deviation of Gaussian kernel = 500 base pairs. Cannot be < 0.2.

p.adjust.method

Method for p-value adjustment from the significance test. Default is "BH" (Benjamini-Hochberg).

pcutoff

p-value cutoff to determine DMRs. Default is automatically determined by the number of significant probes returned by limma for that contrast, but can be set manually with a numeric value.

consec

Use DMRcate in consecutive probe mode. Treats CpG sites as equally spaced.

conseclambda

Bandwidth in probes (rather than nucleotides) to use when consec=TRUE. When specified, lambda simply becomes the minimum distance separating DMRs.

betacutoff

Optional filter; removes any region from the results that does not have at least one CpG site with a beta fold change exceeding this value.
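
As a small illustration of the p.adjust.method argument, the sketch below applies the Benjamini-Hochberg correction with base R's p.adjust. It assumes that p.adjust.method takes the same method names as p.adjust, which this page suggests through the "BH" default but does not state outright. The p-values are made up for the example and are not DMRcate output.

```r
# Toy p-values, invented for illustration only
raw_p <- c(0.001, 0.008, 0.039, 0.041, 0.20)

# Benjamini-Hochberg adjustment, as named by the "BH" default
adj_p <- p.adjust(raw_p, method = "BH")

# Flag which adjusted p-values pass a 0.05 threshold
sig <- adj_p < 0.05
print(data.frame(raw_p, adj_p, sig))
```

Note that two raw p-values below 0.05 no longer pass after adjustment, which is the behaviour a multiple-testing correction is meant to provide.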

Details

The values of lambda and C should be chosen with care. We recommend that half a kilobase represent 1 standard deviation of support (lambda=1000 and C=2). If lambda is too small or C too large then the kernel estimator will not have enough support to significantly differentiate the weighted estimate from the null distribution. If lambda is too large then dmrcate will report very long DMRs spanning multiple gene loci, and the large amount of support will likely give Type I errors. If you are concerned about Type I errors we recommend using the default value of pcutoff, although this will return no DMRs if no DM probes are returned by limma either.
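
To make the relationship between lambda, C and the kernel's standard deviation concrete, here is a minimal sketch in base R. The helper kernel_weight is hypothetical and is not part of DMRcate; it assumes an unnormalised Gaussian weight and applies the 5*lambda truncation described under the lambda argument. The exact form used internally by dmrcate may differ.

```r
lambda <- 1000        # bandwidth in nucleotides (documented default)
C      <- 2           # scaling factor (documented default)
sigma  <- lambda / C  # one standard deviation of the kernel, in base pairs

# Hypothetical helper: unnormalised Gaussian weight at distance d (bp),
# set to zero outside the truncated support of 5 * lambda
kernel_weight <- function(d, lambda, C) {
  sigma <- lambda / C
  w <- exp(-d^2 / (2 * sigma^2))
  w[abs(d) > 5 * lambda] <- 0
  w
}

# Weights at a few distances from the probe being smoothed
d <- c(0, 250, 500, 1000, 6000)
print(round(kernel_weight(d, lambda, C), 4))
```

With the defaults, a probe 500 bp away sits at one standard deviation, and anything beyond 5000 bp contributes nothing. Shrinking lambda or raising C narrows the kernel, which is the loss of support the paragraph above warns about.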

Many gene loci have lengths reaching into the hundreds of thousands of base pairs, so it is quite possible that multiple significant regions will have identical values in results$gene_assoc. This is fine; these regions are distinct in that they are at the very least lambda nucleotides apart, and reporting them separately is preferable to attempting to collapse them into a super-DMR by increasing lambda.
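
The separation rule can be sketched in a few lines of base R. The helper group_by_gap is hypothetical and only illustrates the "gaps >= lambda" rule from the lambda argument for positions on a single chromosome; it is not the code dmrcate runs, and the positions are invented.

```r
# Hypothetical helper: give each significant probe position a region index,
# starting a new region whenever the gap to the previous probe is >= lambda
group_by_gap <- function(pos, lambda = 1000) {
  pos <- sort(pos)
  cumsum(c(TRUE, diff(pos) >= lambda))
}

# Made-up positions (bp) of significant probes on one chromosome
pos <- c(100, 600, 1400, 2400, 2500, 5000)
region <- group_by_gap(pos, lambda = 1000)
print(data.frame(pos, region))
```

Here the probes fall into three regions, even though all of them could plausibly lie within one long gene and so share a gene_assoc value.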

Value

A list containing 2 data frames (input and results) and a numeric value (cutoff). input contains the contents of the annot object, plus calculated p-values:

results contains an annotated data.frame of significant regions, ranked by minpval:

cutoff is the significance p-value cutoff provided in the call to dmrcate.

Author(s)

Tim J. Peters <Tim.Peters@csiro.au>, Mike J. Buckley <Mike.Buckley@csiro.au>, Tim Triche Jr. <tim.triche@usc.edu>

References

Peters T.J., Buckley M.J., Statham A., Pidsley R., Samaras K., Lord R.V., Clark S.J. and Molloy P.L. (2015) De novo identification of differentially methylated regions in the human genome. Epigenetics & Chromatin, 8:6, doi:10.1186/1756-8935-8-6

Wand, M.P. & Jones, M.C. (1995) Kernel Smoothing. Chapman & Hall.

Duong T. (2013) Local significant differences from nonparametric two-sample tests. Journal of Nonparametric Statistics, 25(3), 635-645.

Examples

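## Not run: 
library(DMRcate)
data(dmrcatedata)
# Convert beta values to M-values
myMs <- logit2(myBetas)
# Filter out probes close to SNPs
myMs.noSNPs <- rmSNPandCH(myMs, dist=2, mafcut=0.05)
# Derive patient and type labels from the sample column names
patient <- factor(sub("-.*", "", colnames(myMs)))
type <- factor(sub(".*-", "", colnames(myMs)))
design <- model.matrix(~patient + type) 
# coef gives the column of the design matrix to test
myannotation <- cpg.annotate(myMs.noSNPs, analysis.type="differential",
    design=design, coef=39)
# Call DMRs with the default bandwidth
dmrcoutput <- dmrcate(myannotation, lambda=1000)

## End(Not run)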

timpeters82/DMRcate-release documentation built on May 31, 2019, 2:29 p.m.