hepidish: Hierarchical EpiDISH (HEpiDISH)

hepidish R Documentation

Hierarchical EpiDISH (HEpiDISH)

Description

HEpiDISH is an iterative hierarchical procedure built on EpiDISH. It uses two distinct DNAm references: a primary reference for estimating the fractions of several cell types, and a separate, non-overlapping secondary reference for estimating the fractions of the subtypes of one of the cell types in the primary reference.
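
The hierarchical idea can be pictured as a rescaling: subtype fractions estimated with the secondary reference are multiplied by the fraction of their parent cell type estimated with the primary reference. The base R sketch below illustrates only that general scheme. It is an assumption about how such a combination can work, not a copy of the package internals. The helper name combine_hierarchical and all toy numbers are hypothetical, and the sketch assumes the subtype fractions in each row are relative proportions within the parent cell type.

```r
# Hypothetical helper, not part of EpiDISH: merge primary and secondary estimates.
# est1: samples x primary cell types
# est2: samples x subtypes of one primary cell type (rows assumed to sum to 1)
combine_hierarchical <- function(est1, est2, h.CT.idx) {
  stopifnot(nrow(est1) == nrow(est2))
  # Scale each sample's subtype proportions by that sample's parent fraction
  scaled2 <- est2 * est1[, h.CT.idx]
  # Replace the parent column with its rescaled subtypes
  cbind(est1[, -h.CT.idx, drop = FALSE], scaled2)
}

# Toy estimates for two samples (illustrative values only)
est1 <- matrix(c(0.5, 0.2, 0.3,
                 0.1, 0.1, 0.8),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("S1", "S2"), c("Epi", "Fib", "IC")))
est2 <- matrix(c(0.50, 0.50,
                 0.25, 0.75),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("S1", "S2"), c("B", "Mono")))

frac <- combine_hierarchical(est1, est2, h.CT.idx = 3)
```

In this toy case the parent column (IC) is replaced by its two subtypes, so each sample still has fractions that sum to the same total as before.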

Usage

hepidish(
  beta.m,
  ref1.m,
  ref2.m,
  h.CT.idx,
  method = c("RPC", "CBS", "CP"),
  maxit = 50,
  nu.v = c(0.25, 0.5, 0.75),
  constraint = c("inequality", "equality")
)

Arguments

beta.m

A data matrix with rows labeling the molecular features (using the same IDs as in the reference matrices) and columns labeling samples (e.g. primary tumour specimens). Missing values are not allowed, and all values should be positive or zero. For DNA methylation data, these are beta-values.
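
The requirements above can be checked before calling the function. The following base R sketch is one possible pre-flight check. The helper check_inputs and the toy matrices are hypothetical and not part of the package.

```r
# Hypothetical pre-flight check; not part of the EpiDISH package.
check_inputs <- function(beta.m, ref.m) {
  stopifnot(is.matrix(beta.m), is.matrix(ref.m))
  if (anyNA(beta.m)) stop("beta.m contains missing values")
  if (any(beta.m < 0)) stop("beta.m contains negative values")
  if (is.null(rownames(beta.m)) || is.null(rownames(ref.m)))
    stop("both matrices need feature IDs as rownames")
  shared <- intersect(rownames(beta.m), rownames(ref.m))
  if (length(shared) == 0) stop("no feature IDs shared with the reference")
  length(shared)  # number of features present in both matrices
}

# Toy data: three features in the data matrix, two of them in the reference
toy.beta <- matrix(c(0.1, 0.9, 0.5, 0.4, 0.6, 0.2), nrow = 3,
                   dimnames = list(c("cg01", "cg02", "cg03"), c("S1", "S2")))
toy.ref <- matrix(c(0.2, 0.8, 0.7, 0.3), nrow = 2,
                  dimnames = list(c("cg01", "cg03"), c("Epi", "IC")))

n.shared <- check_inputs(toy.beta, toy.ref)
```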

ref1.m

A matrix of primary reference 'centroids', i.e. representative molecular profiles, for a number of cell types. Rows label molecular features (e.g. CpGs) and columns label cell types. IDs need to be provided as rownames and colnames, respectively. Missing values are not allowed, and all values in this matrix should be positive or zero. For DNAm data, values should be beta-values.

ref2.m

Similar to ref1.m, but a matrix of secondary reference centroids. For example, if ref1.m contains reference centroids for epithelial cells, fibroblasts and total immune cells, ref2.m can contain centroids for subtypes of immune cells, such as B-cells, NK cells and monocytes.

h.CT.idx

An index indicating which cell type in ref1.m is the higher-order cell type of the subtypes in ref2.m. For example, if ref1.m contains reference centroids for epithelial cells, fibroblasts and total immune cells, and ref2.m contains subtypes of immune cells, then h.CT.idx should be 3, corresponding to immune cells in ref1.m.
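
Rather than hard-coding the position, the index can be looked up from the column names of the primary reference. The sketch below uses a toy matrix. The object toy.ref1 and its column names ("Epi", "Fib", "IC") are illustrative assumptions, so check colnames() of the reference actually in use.

```r
# Toy primary reference; values and column names are illustrative only
set.seed(1)
toy.ref1 <- matrix(runif(12), nrow = 4,
                   dimnames = list(paste0("cg", 1:4), c("Epi", "Fib", "IC")))

# Position of the parent cell type whose subtypes are described by ref2.m
h.CT.idx <- match("IC", colnames(toy.ref1))
stopifnot(!is.na(h.CT.idx))  # fail early if the column name is absent
```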

method

Choice of reference-based method ('RPC', 'CBS' or 'CP').

maxit

Only used in RPC mode. The maximum number of IWLS iterations.

nu.v

Only used in CBS mode. A vector of candidate nu values. nu is the parameter needed for nu-classification, nu-regression and one-classification in svm. The best estimation result among all candidate nu values is returned automatically.

constraint

Only used in CP mode. Either the 'inequality' or the 'equality' normalization constraint. The default is 'inequality' (i.e. the weights sum to a number less than or equal to 1), as implemented in Houseman et al. (2012).

Value

A matrix of the estimated fractions

References

Zheng SC, Webster AP, Dong D, Feber A, Graham DG, Sullivan R, Jevons S, Lovat LB, Beck S, Widschwendter M, Teschendorff AE. A novel cell-type deconvolution algorithm reveals substantial contamination by immune cells in saliva, buccal and cervix. Epigenomics (2018) 10: 925-940. doi: 10.2217/epi-2018-0037.

Teschendorff AE, Breeze CE, Zheng SC, Beck S. A comparison of reference-based algorithms for correcting cell-type heterogeneity in Epigenome-Wide Association Studies. BMC Bioinformatics (2017) 18: 105. doi: 10.1186/s12859-017-1511-5.

Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, Wiencke JK, Kelsey KT. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics (2012) 13: 86. doi: 10.1186/1471-2105-13-86.

Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, Hoang CD, Diehn M, Alizadeh AA. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods (2015) 12: 453-457. doi: 10.1038/nmeth.3337.

Examples

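library(EpiDISH)

# Load the primary reference, the secondary reference and the example data
data(centEpiFibIC.m)
data(centBloodSub.m)
data(DummyBeta.m)

# Column 3 of centEpiFibIC.m is the cell type whose subtypes are in centBloodSub.m
frac.m <- hepidish(beta.m = DummyBeta.m, ref1.m = centEpiFibIC.m,
                   ref2.m = centBloodSub.m, h.CT.idx = 3, method = 'RPC')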



sjczheng/EpiDISH documentation built on Nov. 13, 2022, 9:57 p.m.