find_dames: Find DAMEs

Description

This function finds Differential Allele-specific MEthylated regions (DAMEs). It uses the regionFinder function from bumphunter, and assigns p-values either empirically or using the Simes method.

Usage

find_dames(
  sa,
  design,
  coef = 2,
  contrast = NULL,
  smooth = TRUE,
  Q = 0.5,
  pvalAssign = "simes",
  maxGap = 20,
  verbose = TRUE,
  maxPerms = 10,
  method = "ls",
  trend = FALSE,
  ...
)

Arguments

sa

A SummarizedExperiment containing ASM values, where each row corresponds to a tuple/site and each column to a sample/replicate.

design

A design matrix created with model.matrix.

coef

Column in design specifying the parameter to estimate. Default = 2.

contrast

A contrast matrix, generated with makeContrasts. Default = NULL.

smooth

Whether smoothing should be applied to the t-statistics. Default = TRUE.

Q

The quantile used to obtain the cutoff value K. K is the Qth quantile of the absolute values of the given (smoothed) t-statistics. Only used if pvalAssign = 'empirical'. Default = 0.5.
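
As a rough illustration of how Q maps to the cutoff K, the sketch below takes the Qth quantile of the absolute values of some made-up t-statistics. The vector tstats and the name above_cutoff are hypothetical and are not part of find_dames; the function itself may compute K differently in detail.

```r
# Hypothetical t-statistics for eight sites (made-up values for illustration).
tstats <- c(-3.1, 0.2, 1.5, -0.4, 2.8, 0.9, -1.2, 0.1)
Q <- 0.5

# K is the Qth quantile of the absolute t-statistics.
K <- unname(quantile(abs(tstats), probs = Q))

# Sites whose absolute statistic exceeds the cutoff.
above_cutoff <- abs(tstats) > K
```

A larger Q raises K, so fewer sites pass the cutoff and the resulting regions tend to be shorter.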

pvalAssign

Method used to assign p-values, either 'simes' (default) or 'empirical'. The latter performs up to maxPerms permutations to calculate null statistics, and runs regionFinder.

maxGap

Maximum gap between CpGs in a cluster (in bp). NOTE: Regions can be as small as 1 bp. Default = 20.
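
The effect of maxGap can be sketched in base R by starting a new cluster whenever two consecutive sites are more than maxGap bp apart. This is a simplified stand-in for the clustering step, not the bumphunter implementation, and the positions in pos are made up.

```r
# Hypothetical CpG positions (bp) on one chromosome, sorted.
pos <- c(100, 110, 125, 200, 215, 400)
maxGap <- 20

# Start a new cluster whenever the gap to the previous site exceeds maxGap.
cluster_id <- cumsum(c(TRUE, diff(pos) > maxGap))

# Range of each cluster; a single-site cluster spans 1 bp.
cluster_start <- tapply(pos, cluster_id, min)
cluster_end <- tapply(pos, cluster_id, max)
cluster_width <- cluster_end - cluster_start + 1
```

Here the isolated site at position 400 forms its own cluster of width 1 bp, which is how a region can be as small as a single base.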

verbose

Whether the function should be verbose. Default = TRUE.

maxPerms

Maximum number of permutations generated. Only used if pvalAssign = 'empirical'. Default = 10.
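
With few samples, the number of distinct ways to relabel the groups is small, which limits how many permutations are possible at all. The sketch below counts the relabelings for the group sizes used in the Examples section (3 CRC, 2 NORM). How find_dames enumerates permutations internally is not stated here, so this is only an illustration; crc_idx and n_relabelings are hypothetical names.

```r
# Group labels with the sizes used in the Examples section.
grp <- c("CRC", "CRC", "CRC", "NORM", "NORM")

# Each column of combn() picks which samples are labelled CRC.
crc_idx <- combn(length(grp), sum(grp == "CRC"))

# Number of distinct relabelings, including the observed one.
n_relabelings <- ncol(crc_idx)
```

For these group sizes there are choose(5, 3) = 10 relabelings, so an empirical p-value from such a design can only take a coarse set of values.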

method

The method to be used in limma's lmFit. The default is set to 'ls' but can also be set to 'robust', which is recommended on a real data set.

trend

Passed to eBayes. Should an intensity-trend be allowed for the prior variance? Default is that the prior variance is constant, i.e. FALSE.

...

Arguments passed to get_tstats.

Details

The Simes method has higher power to detect DAMEs, but the consistency of the signal across a region is better controlled with the empirical method. The empirical method uses regionFinder and getSegments to find regions with t-statistics above a cutoff (controlled with the parameter Q), whereas the 'simes' option first detects clusters of CpG sites/tuples and then tests whether at least one differential site/tuple is present in each cluster.
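
To make the cluster-level test concrete, the sketch below applies the standard Simes combination to the site-level p-values of one cluster: with sorted p-values p(1) <= ... <= p(n), the combined p-value is the minimum of n * p(i) / i. The helper simes_pvalue and the values in p_cluster are hypothetical; the exact variant used inside find_dames is not described here and may differ.

```r
# Standard Simes combination of the p-values within one cluster.
simes_pvalue <- function(p) {
  n <- length(p)
  min(n * sort(p) / seq_len(n))
}

# Hypothetical site-level p-values for a cluster of four sites.
p_cluster <- c(0.001, 0.20, 0.45, 0.80)
p_simes <- simes_pvalue(p_cluster)
```

A single strongly differential site is enough to give the cluster a small p-value, which is consistent with the higher power, and the weaker control over signal consistency, described above.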

We recommend trying out different maxGap and Q parameters, since the size and the effect-size of obtained DAMEs change with these parameters.

Value

A data frame of detected DAMEs ordered by the p-value. Each row is a DAME and the following information is provided in the columns (some column names change depending on the pvalAssign choice):

  • chr: on which chromosome the DAME is found.

  • start: The start position of the DAME.

  • end: The end position of the DAME.

  • pvalSimes: p-value calculated with the Simes method.

  • pvalEmp: Empirical p-value obtained from permuting covariate of interest.

  • sumTstat: Sum of t-stats per segment/cluster.

  • meanTstat: Mean of t-stats per segment/cluster.

  • segmentL: Size of segmented cluster (from getSegments).

  • clusterL: Size of original cluster (from clusterMaker).

  • FDR: p-value adjusted with the Benjamini-Hochberg method (from p.adjust).

  • numup: Number of sites with ASM increase in cluster (only for Simes).

  • numdown: Number of sites with ASM decrease in cluster (only for Simes).
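
The FDR column can be reproduced from a vector of region p-values with p.adjust. The values in p_region below are made up for illustration and do not come from find_dames output.

```r
# Hypothetical region-level p-values, already ordered as in the output.
p_region <- c(0.001, 0.01, 0.03, 0.20, 0.50)

# Benjamini-Hochberg adjustment, as named in the FDR column description.
fdr <- p.adjust(p_region, method = "BH")

# Regions passing a 5% FDR threshold.
significant <- fdr <= 0.05
```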

Examples

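library(DAMEfinder)
# Load the example tuple counts and calculate ASM scores
data(readtuples_output)
ASM <- calc_asm(readtuples_output)
# 3 CRC and 2 NORM samples, with NORM as the reference level
grp <- factor(c(rep('CRC',3),rep('NORM',2)), levels = c('NORM', 'CRC'))
mod <- model.matrix(~grp)
# Detect DAMEs with the default Simes p-value assignment
dames <- find_dames(ASM, mod, verbose = FALSE)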


markrobinsonuzh/DAMEfinder documentation built on April 7, 2023, 6:37 a.m.