dmpClustering: DMP Clustering

dmpClustering    R Documentation

DMP Clustering

Description

Given a 'pDMP' object carrying DMPs obtained in the Methyl-IT downstream analysis, the function 'dmpClustering' builds clusters of DMPs, which can be further tested with the countTest2 function to identify differentially methylated regions (DMRs).

Usage

dmpClustering(
  dmps,
  win.size = NULL,
  step.size = NULL,
  minNumDMPs = 1,
  maxClustDist = NULL,
  method = c("relaxed", "fixed.int"),
  ignore.strand = TRUE,
  verbose = FALSE
)

Arguments

dmps

An object of the 'pDMP' class, as returned by the selectDIMP function, or simply a GRanges object carrying DMP coordinates.

win.size

An integer. The size of the genomic windows/intervals. Default: win.size = NULL (as given in Usage).

step.size

An integer. The step at which the regions/windows are defined. Default: step.size = NULL (as given in Usage).

minNumDMPs

Minimum number of DMPs inside each cluster. Default: minNumDMPs = 1.

maxClustDist

Clusters separated by a distance of less than 'maxClustDist' positions are merged. Default: maxClustDist = NULL.

method

Two approaches to clustering DMPs are implemented:

"relaxed":

DMP ranges separated by a distance of less than 'maxClustDist' are merged, and ranges with fewer than 'minNumDMPs' DMPs are removed.

"fixed.int":

A partition of the ranges covered by the DMPs is built at fixed intervals of size 'win.size' and at a fixed step 'step.size'. Next, ranges separated by a distance of less than 'maxClustDist' are merged, and ranges with fewer than 'minNumDMPs' DMPs are removed.
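
The two approaches can be illustrated with a minimal base-R sketch on a single chromosome. This is not the package's implementation: the helper names 'relaxed_clusters' and 'fixed_int_counts' are hypothetical, "distance" is assumed to be the difference between consecutive DMP coordinates, the windows are assumed to start at the first DMP, and the merging step of "fixed.int" is not reproduced.

```r
## Hypothetical helpers for illustration only; they are NOT part of
## MethylIT.utils and operate on positions from a single chromosome.

## "relaxed": merge neighbouring DMPs, then drop small clusters.
relaxed_clusters <- function(pos, maxClustDist, minNumDMPs = 1) {
  stopifnot(length(pos) > 0)
  pos <- sort(unique(pos))
  ## A new cluster starts wherever the distance to the previous DMP is
  ## not less than maxClustDist (assumed definition of "distance").
  id <- cumsum(c(TRUE, diff(pos) >= maxClustDist))
  res <- data.frame(
    start = as.integer(tapply(pos, id, min)),
    end   = as.integer(tapply(pos, id, max)),
    dmps  = as.integer(tapply(pos, id, length))
  )
  res[res$dmps >= minNumDMPs, , drop = FALSE]
}

## "fixed.int": count DMPs in windows of size win.size placed every
## step.size positions (windows assumed to start at the first DMP).
fixed_int_counts <- function(pos, win.size, step.size = win.size,
                             minNumDMPs = 1) {
  stopifnot(length(pos) > 0)
  pos <- sort(unique(pos))
  starts <- seq(min(pos), max(pos), by = step.size)
  ends <- starts + win.size - 1
  dmps <- vapply(seq_along(starts),
                 function(i) sum(pos >= starts[i] & pos <= ends[i]),
                 integer(1))
  res <- data.frame(start = starts, end = ends, dmps = dmps)
  res[res$dmps >= minNumDMPs, , drop = FALSE]
}

pos <- c(1, 2, 3, 10, 11, 30)
relaxed <- relaxed_clusters(pos, maxClustDist = 3, minNumDMPs = 2)
fixed <- fixed_int_counts(pos, win.size = 4, step.size = 4, minNumDMPs = 2)
```

With these toy positions, both sketches keep the dense groups around positions 1 to 3 and 10 to 11 and drop the isolated DMP at position 30. The "relaxed" cluster boundaries follow the DMPs themselves, whereas the "fixed.int" boundaries follow the window grid.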

ignore.strand

Same as in findOverlaps-methods.

verbose

If TRUE, prints the function log to stdout.

Details

The number of DMPs reported for each cluster is the number of sites inside the cluster where a DMP was found in at least one of the samples (control or treatment). That is, dmpClustering is only a tool to locate regions with a high density of DMPs across all the samples; it does not detect DMRs. It is assumed that the 'dmps' object contains only DMP coordinates, so every site provided is used in the computation.
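
The counting rule described above can be illustrated with a small base-R sketch. The sample names, positions, and region limits are invented for illustration, and the code only mimics the idea of counting sites rather than per-sample DMP calls; it does not call the package.

```r
## Hypothetical per-sample DMP positions on one chromosome
samples <- list(
  control1   = c(100, 105, 250),
  treatment1 = c(105, 110, 400),
  treatment2 = c(100, 110, 405)
)

## A site counts once if it is a DMP in at least one sample
sites <- sort(unique(unlist(samples, use.names = FALSE)))

## Number of DMP sites inside the (hypothetical) region 100 to 110
n_sites <- sum(sites >= 100 & sites <= 110)

## Number of per-sample DMP calls inside the same region, for contrast
n_calls <- sum(unlist(samples) >= 100 & unlist(samples) <= 110)
```

Here the region holds three distinct sites (100, 105 and 110) even though the samples together contribute six DMP calls to it; the per-cluster count described above corresponds to the former.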

Value

A GRanges object carrying the coordinates of the DMP clusters from all the samples and the number of DMPs in each of them.

Author(s)

Robersy Sanchez (https://github.com/genomaths).

Examples

## Packages providing GRanges(), Rle(), IRanges() and dmpClustering()
library(GenomicRanges)
library(MethylIT.utils)

## Create a GRanges object carrying DMPs. Notice that only the DMP
## coordinates are needed.
gr <- GRanges(seqnames = Rle(c('chr1', 'chr2', 'chr3', 'chr4'),
                             c(5, 5, 5, 5)),
              ranges = IRanges(start = 1:20, end = 1:20),
              strand = rep(c('+', '-'), 10))

## Simple DMP clustering ignoring the DNA strand
dmpClustering(dmps = gr, win.size = 4, step.size = 4, minNumDMPs = 2,
              method = "fixed.int")

## Now, the information on the DNA strand is included in the clustering
dmpClustering(dmps = gr, win.size = 4, step.size = 4, minNumDMPs = 2,
              method = "fixed.int", ignore.strand = FALSE)

## Next, as before, but clusters separated by a distance of less than
## 'maxClustDist = 2' are merged
dmpClustering(dmps = gr, win.size = 4, step.size = 4, minNumDMPs = 2,
              method = "fixed.int", maxClustDist = 2, ignore.strand = FALSE)

## Finally, the relaxed approach. Notice that only two parameter values
## ('minNumDMPs' and 'maxClustDist') are needed
dmpClustering(dmps = gr, minNumDMPs = 2, maxClustDist = 2,
              method = "relaxed", ignore.strand = FALSE)

genomaths/MethylIT.utils documentation built on July 4, 2023, 12:05 a.m.