getAMR: Search for aberrantly methylated regions

getAMR R Documentation

Search for aberrantly methylated regions

Description

'getAMR' returns a 'GRanges' object with all the aberrantly methylated regions (AMRs) for all samples in a data set.

Usage

getAMR(
  data.ranges,
  data.samples = NULL,
  ramr.method = c("IQR", "beta", "wbeta", "beinf"),
  iqr.cutoff = 5,
  pval.cutoff = 0.05,
  qval.cutoff = NULL,
  merge.window = 300,
  min.cpgs = 7,
  min.width = 1,
  exclude.range = NULL,
  cores = max(1, parallel::detectCores() - 1),
  verbose = TRUE,
  ...
)

Arguments

data.ranges

A 'GRanges' object with genomic locations and corresponding beta values included as metadata.

data.samples

A character vector with sample names (a subset of metadata column names). If 'NULL' (the default), then all samples (metadata columns) are included in the analysis.

ramr.method

A character scalar. When 'ramr.method' is "IQR" (the default), filtering based on the interquartile range is used, with 'iqr.cutoff' as the threshold. When it is "beta", "wbeta" or "beinf", filtering is based on fitting a non-weighted ('EnvStats::ebeta'), weighted ('ExtDist::eBeta') or zero-and-one inflated ('gamlss.dist::BEINF') beta distribution, respectively, with 'pval.cutoff' or 'qval.cutoff' (if not 'NULL') as the threshold. For "wbeta", weights correlate directly with bin contents (the number of values per bin) and inversely with the distance from the median value, thus narrowing the estimated distribution and emphasizing outliers.

iqr.cutoff

A single integer >= 1. Methylation beta values differing from the median value by more than 'iqr.cutoff' interquartile ranges are considered to be significant (the default: 5).
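
As a minimal base-R sketch of this idea (not the package's internal code), the IQR-normalised deviation of one CpG across samples could be computed as follows. The vector 'beta.values' and its contents are made up for illustration, and treating deviations in both directions via 'abs()' is an assumption.

```r
# Hypothetical beta values of a single CpG across eight samples
beta.values <- c(s1 = 0.10, s2 = 0.12, s3 = 0.11, s4 = 0.13,
                 s5 = 0.12, s6 = 0.11, s7 = 0.10, s8 = 0.95)
iqr.cutoff <- 5

# Deviation from the median, expressed in units of the interquartile range
xiqr <- (beta.values - median(beta.values)) / IQR(beta.values)

# Samples whose absolute normalised deviation reaches the cutoff
significant <- names(beta.values)[abs(xiqr) >= iqr.cutoff]
print(significant)
```

Only the sample with the outlying value is reported here; the other samples stay within about one IQR of the median.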

pval.cutoff

A numeric scalar (the default: 5e-2). Bonferroni correction of 'pval.cutoff' by the length of the 'data.samples' object is used to calculate 'qval.cutoff' if the latter is 'NULL'.
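
The Bonferroni correction described above amounts to a single division. The sketch below uses 100 hypothetical sample names to show the resulting threshold.

```r
pval.cutoff  <- 0.05
data.samples <- paste0("sample", 1:100)  # hypothetical sample names

# Bonferroni correction by the number of samples
qval.cutoff <- pval.cutoff / length(data.samples)
print(qval.cutoff)
```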

qval.cutoff

A numeric scalar. Used as a threshold for filtering based on fitting non-weighted or weighted beta distributions: all p-values lower than 'qval.cutoff' are considered to be significant. If 'NULL' (the default), it is calculated from 'pval.cutoff' by Bonferroni correction.

merge.window

A positive integer. All significant 'data.ranges' genomic locations (those that survived the filtering stage) within this distance of each other are merged to create AMRs (the default: 300).
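
The merging step can be illustrated with a small base-R sketch for positions on a single chromosome. This is only an approximation of the idea: the positions are made up, and the exact boundary rule (whether a gap equal to 'merge.window' still merges) is an assumption here, not taken from the package.

```r
# Hypothetical sorted positions of significant CpGs on one chromosome
pos <- c(100, 250, 400, 1200, 1350, 5000)
merge.window <- 300

# Start a new group whenever the gap to the previous CpG exceeds the window
group <- cumsum(c(TRUE, diff(pos) > merge.window))

# One row per candidate region, with the number of CpGs it contains
regions <- data.frame(
  start = as.vector(tapply(pos, group, min)),
  end   = as.vector(tapply(pos, group, max)),
  ncpg  = as.vector(table(group))
)
print(regions)
```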

min.cpgs

A single integer >= 1. All AMRs containing fewer than 'min.cpgs' significant genomic locations are filtered out (the default: 7).

min.width

A single integer >= 1 (the default). Only AMRs with the width of at least 'min.width' are returned.
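
A minimal sketch of how the 'min.cpgs' and 'min.width' thresholds could act together on candidate regions is given below. The data frame 'regions' is hypothetical, and the width is computed for closed intervals as in the 'GRanges' convention.

```r
# Hypothetical candidate regions after merging (coordinates are made up)
regions <- data.frame(
  start = c(1000, 5000, 9000),
  end   = c(1600, 5000, 9900),
  ncpg  = c(8L, 1L, 7L)
)
min.cpgs  <- 7
min.width <- 1

# Width of a closed interval [start, end]
regions$width <- regions$end - regions$start + 1

# Keep only regions that satisfy both thresholds
amrs <- regions[regions$ncpg >= min.cpgs & regions$width >= min.width, ]
print(amrs)
```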

exclude.range

A numeric vector of length two. If not 'NULL' (the default), all 'data.ranges' genomic locations with their median methylation beta value within the 'exclude.range' interval are filtered out.
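
For illustration, the effect of 'exclude.range' can be sketched in base R on a small made-up matrix of beta values (rows are CpGs, columns are samples). Treating the interval as inclusive at both ends is an assumption.

```r
# Hypothetical beta values: three CpGs (rows) in four samples (columns)
beta.values <- matrix(
  c(0.02, 0.03, 0.01, 0.04,
    0.45, 0.50, 0.55, 0.48,
    0.97, 0.99, 0.98, 0.96),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cg1", "cg2", "cg3"), paste0("s", 1:4))
)
exclude.range <- c(0.3, 0.7)

# Median beta value of every CpG across samples
med <- apply(beta.values, 1, median)

# Drop CpGs whose median falls inside the excluded interval
keep <- !(med >= exclude.range[1] & med <= exclude.range[2])
print(names(keep)[keep])
```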

cores

A single integer >= 1. Number of processes for parallel computation (the default: all but one core). Results of parallel processing are fully reproducible when the same seed is used (thanks to doRNG).

verbose

A logical scalar: whether to report progress and timings (the default: TRUE).

...

Further arguments to be passed to 'EnvStats::ebeta' or 'ExtDist::eBeta' functions.

Details

In the provided data set, 'getAMR' compares the methylation beta values of each sample with those of the other samples to identify rare long-range methylation aberrations (epimutations). For 'ramr.method=="IQR"': for every genomic location (CpG) in 'data.ranges', the IQR-normalized deviation from the median value is calculated, and all CpGs with a normalized deviation not smaller than 'iqr.cutoff' are retained. For 'ramr.method=="beta"', '"wbeta"' or '"beinf"': parameters of the corresponding distribution are estimated by means of the 'EnvStats::ebeta' (beta distribution), 'ExtDist::eBeta' (weighted beta distribution) or 'gamlss.dist::BEINF' (zero-and-one inflated beta distribution) functions, respectively. These parameters are then used to calculate p-values, and all CpGs with p-values not greater than 'qval.cutoff' are retained. A further filtering step then excludes all CpGs whose median beta value lies within 'exclude.range'. Next, the retained (significant) CpGs are merged within a window of 'merge.window', and a final filter is applied to the AMR genomic ranges (by 'min.cpgs' and 'min.width').
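
The beta-distribution branch can be sketched in base R for a single CpG. This is a rough sketch under stated assumptions, not the package's code: a method-of-moments fit stands in for 'EnvStats::ebeta' to keep the example dependency-free, the p-value is taken as the smaller of the two tail probabilities, and the beta values are made up.

```r
# Hypothetical beta values of one CpG: 20 typical samples and one outlier
beta.values <- c(rep(c(0.10, 0.11, 0.12, 0.13), 5), 0.60)

# Method-of-moments estimates of the two shape parameters
# (a stand-in for a proper estimator, used here only for illustration)
m <- mean(beta.values)
v <- var(beta.values)
k <- m * (1 - m) / v - 1
shape1 <- m * k
shape2 <- (1 - m) * k

# Tail probability of each observation under the fitted distribution
# (assumption: the smaller of the lower and upper tails)
p.lower <- pbeta(beta.values, shape1, shape2)
pval <- pmin(p.lower, 1 - p.lower)

# Bonferroni-corrected threshold, then the retained observations
qval.cutoff <- 0.05 / length(beta.values)
significant <- which(pval <= qval.cutoff)
print(signif(pval, 3))
```

Because the outlier itself inflates the variance estimate of an unweighted fit, its p-value is less extreme than one might expect; this is the effect that the weighting used by "wbeta" is meant to reduce.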

Value

The output is a 'GRanges' object that contains all the aberrantly methylated regions (AMRs) for all 'data.samples' samples in the 'data.ranges' object. The following metadata columns may be present:

  • 'revmap' – integer list of significant CpGs ('data.ranges' genomic locations) that are included in this AMR region

  • 'ncpg' – number of significant CpGs within this AMR region

  • 'sample' – identifier of the sample to which the corresponding AMR belongs

  • 'dbeta' – average deviation of beta values for significant CpGs from their corresponding median values

  • 'pval' – geometric mean of p-values for significant CpGs

  • 'xiqr' – average IQR-normalised deviation of beta values for significant CpGs from their corresponding median values
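
As a small illustration of how such per-region summaries relate to per-CpG values, the sketch below computes 'ncpg', 'dbeta' and 'pval' for one hypothetical AMR. The input vectors are made up; the geometric mean is computed on the log scale.

```r
# Hypothetical per-CpG values for the significant CpGs of one AMR
dbeta.cpg <- c(0.40, 0.50, 0.60)   # deviations from the median beta value
pval.cpg  <- c(1e-4, 1e-6, 1e-5)   # p-values of the same CpGs

amr <- list(
  ncpg  = length(pval.cpg),           # number of significant CpGs
  dbeta = mean(dbeta.cpg),            # average deviation
  pval  = exp(mean(log(pval.cpg)))    # geometric mean of p-values
)
str(amr)
```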

See Also

plotAMR for plotting AMRs, getUniverse for info on enrichment analysis, simulateAMR and simulateData for the generation of simulated test data sets, and 'ramr' vignettes for the description of usage and sample data.

Examples

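  # load the example data set shipped with the package
  data(ramr)
  # search for AMRs by fitting a non-weighted beta distribution
  getAMR(ramr.data, ramr.samples, ramr.method="beta",
         min.cpgs=5, merge.window=1000, qval.cutoff=1e-3, cores=2)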

BBCG/ramr documentation built on Dec. 17, 2024, 3:49 p.m.