CellDMC: A function that allows the identification of differentially...

View source: R/CellDMC.R

CellDMC R Documentation

A function that allows the identification of differentially methylated cell-types in Epigenome-Wide Association Studies (EWAS)

Description

An outstanding challenge of Epigenome-Wide Association Studies performed in complex tissues is the identification of the specific cell-type(s) responsible for the observed differential methylation. CellDMC is a novel statistical algorithm that identifies not only differentially methylated positions, but also the specific cell-type(s) driving the methylation change.
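As a rough illustration of the idea, the cited paper describes a linear model with interaction terms between the phenotype and each cell-type fraction. The sketch below fits such a model to a single simulated CpG using only base R. It is not the package's implementation; the names `f1`, `f2`, `beta` and the simulated effect sizes are assumptions made purely for illustration.

```r
# Toy illustration only: simulated data, not the CellDMC implementation.
set.seed(1)
n     <- 40
f1    <- runif(n, 0.2, 0.8)          # fraction of hypothetical cell-type 1
f2    <- 1 - f1                      # fraction of hypothetical cell-type 2
pheno <- rep(c(0, 1), each = n / 2)  # binary phenotype

# Simulate a CpG whose methylation changes by +0.2 in cell-type 1 only.
beta <- 0.3 * f1 + 0.6 * f2 + 0.2 * pheno * f1 + rnorm(n, sd = 0.01)

# One interaction term per cell-type; no intercept because f1 + f2 = 1.
fit <- lm(beta ~ 0 + f1 + f2 + f1:pheno + f2:pheno)
est <- coef(summary(fit))
print(round(est[c("f1:pheno", "f2:pheno"), ], 4))
```

In this simulation the `f1:pheno` term should recover the change placed in cell-type 1, while the `f2:pheno` term should stay close to zero.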

Usage

CellDMC(
  beta.m,
  pheno.v,
  frac.m,
  adjPMethod = "fdr",
  adjPThresh = 0.05,
  cov.mod = NULL,
  sort = FALSE,
  mc.cores = 1
)

Arguments

beta.m

A beta value matrix with rows labeling the CpGs and columns labeling samples.

pheno.v

A phenotype vector. CellDMC can handle both binary and continuous/ordinal phenotypes. NA is not allowed in pheno.v.

frac.m

A matrix containing the fractions of each cell-type. Each row labels a sample, in the same order as the columns of beta.m. Each column labels a cell-type. Column names, which are the names of the cell-types, are required. The rowSums of frac.m should be 1 or close to 1.

adjPMethod

The method used to adjust p values. It can be any method accepted by p.adjust.

adjPThresh

A numeric value; the default is 0.05. This threshold is used to call DMCTs. For each cell-type separately, CpGs with an adjusted p value below this threshold are reported as DMCTs (-1 or 1) in the 'dmct' matrix of the returned list.

cov.mod

A design matrix from model.matrix, which contains the other covariates to adjust for. For example, input model.matrix(~ gender, data = pheno.df) to adjust for gender. Do not put cell-type fractions here!

sort

The default is FALSE. If TRUE, each data.frame in the coe list is sorted by the p value of each CpG. The order of rows in 'dmct' does not change, because the sorted order differs between cell-types.

mc.cores

The number of cores to use, i.e. at most how many threads will run simultaneously. The default is 1, which means no parallelization.
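The input requirements above can be checked before calling CellDMC. The following is a minimal sketch in base R, run on simulated inputs. The helper `check_celldmc_inputs`, its `tol` cut-off for "close to 1", the data frame `pheno.df`, and all simulated values are assumptions introduced for illustration and are not part of EpiDISH.

```r
# Hypothetical helper, not part of EpiDISH: checks the documented input requirements.
check_celldmc_inputs <- function(beta.m, pheno.v, frac.m, tol = 0.05) {
  stopifnot(
    is.matrix(beta.m), is.matrix(frac.m),
    !anyNA(pheno.v),                      # NA is not allowed in pheno.v
    length(pheno.v) == ncol(beta.m),      # one phenotype value per sample
    nrow(frac.m) == ncol(beta.m),         # one row of fractions per sample
    !is.null(colnames(frac.m)),           # cell-type names are required
    all(abs(rowSums(frac.m) - 1) <= tol)  # tol is an assumed cut-off for "close to 1"
  )
  invisible(TRUE)
}

# Simulated inputs: 5 CpGs, 6 samples, 2 cell-types.
set.seed(1)
beta.m <- matrix(runif(30), nrow = 5,
                 dimnames = list(paste0("cg", 1:5), paste0("S", 1:6)))
epi <- runif(6, 0.2, 0.8)
frac.m <- cbind(Epi = epi, Fib = 1 - epi)
rownames(frac.m) <- colnames(beta.m)
pheno.v <- rep(c(0, 1), each = 3)
ok <- check_celldmc_inputs(beta.m, pheno.v, frac.m)

# Covariates go into cov.mod as a design matrix; pheno.df is a made-up data frame.
pheno.df <- data.frame(gender = factor(c("F", "M", "F", "M", "F", "M")),
                       age = c(41, 55, 38, 62, 47, 50))
cov.mod <- model.matrix(~ gender + age, data = pheno.df)
print(colnames(cov.mod))
```

The resulting `cov.mod` holds only the covariates; the cell-type fractions stay in `frac.m`, as the cov.mod description requires.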

Value

A list with the following two items.

dmct A matrix indicating whether the input CpGs are DMCTs and DMCs. The first column tells whether a CpG is a DMC: the value is 1 if the CpG is called as a DMC and 0 otherwise. The following columns give the DMCTs for each cell-type. If a CpG is a DMCT, the value is 1 (hypermethylated in cases compared to controls) or -1 (hypomethylated in cases compared to controls). Otherwise, the value is 0 (non-DMCT). The rows of this matrix are in the same order as those of the input beta.m.

coe A list of data frames, one for each cell-type in frac.m. Each data frame contains all CpGs in the input beta.m, together with the estimated DNAm change (Estimate), standard error (SE), estimated t statistic (t), raw p value (p), and multiple-hypothesis-corrected p value (adjP).
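To make the structure of the two items concrete, the sketch below builds small stand-in objects in base R and summarizes them. All values are invented, and the single cell-type `Epi` is hypothetical. The rule used for the DMC column (1 when the CpG is a DMCT in at least one cell-type) is an assumption for this toy object, not a statement about the package's internals.

```r
# Made-up objects that mimic the documented return value; no real data.
cpgs  <- paste0("cg", 1:6)
p.raw <- c(0.0004, 0.20, 0.03, 0.0001, 0.60, 0.01)  # invented raw p values

# adjP computed with p.adjust, using the default adjPMethod = "fdr".
coe.epi <- data.frame(Estimate = c(0.12, 0.01, -0.05, -0.20, 0.00, 0.08),
                      p = p.raw,
                      adjP = p.adjust(p.raw, method = "fdr"),
                      row.names = cpgs)

# Encode calls as described for 'dmct': sign of the estimate if adjP < threshold.
adjPThresh <- 0.05
epi.call <- ifelse(coe.epi$adjP < adjPThresh, sign(coe.epi$Estimate), 0)
dmct <- cbind(DMC = as.numeric(epi.call != 0), Epi = epi.call)  # DMC rule assumed
rownames(dmct) <- cpgs

# Count hyper- and hypomethylated DMCTs and list the significant CpGs.
n.hyper <- sum(dmct[, "Epi"] == 1)
n.hypo  <- sum(dmct[, "Epi"] == -1)
sig <- coe.epi[coe.epi$adjP < adjPThresh, ]
print(sig[order(sig$p), ])
```

With a real result, the same indexing would apply to the 'dmct' matrix and to each data frame in the coe list.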

References

Zheng SC, Breeze CE, Beck S, Teschendorff AE. Identification of differentially methylated cell-types in Epigenome-Wide Association Studies. Nat Methods (2018) 15: 1059-1066 doi:10.1038/s41592-018-0213-x.

Examples

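library(EpiDISH)
# Load the reference matrix and the simulated beta value matrix.
data(centEpiFibIC.m)
data(DummyBeta.m)
# Estimate cell-type fractions with the RPC method.
out.l <- epidish(DummyBeta.m, centEpiFibIC.m, method = 'RPC')
frac.m <- out.l$estF
# Binary phenotype, coded 0 and 1, for the 10 samples.
pheno.v <- rep(c(0, 1), each = 5)
celldmc.o <- CellDMC(DummyBeta.m, pheno.v, frac.m)
# Please note that DummyBeta.m is a simulated (fake) beta value matrix.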



sjczheng/EpiDISH documentation built on Nov. 16, 2024, 11:54 a.m.