EpiMix: The EpiMix function


EpiMix R Documentation

The EpiMix function

Description

EpiMix uses a model-based approach to identify functional changes in DNA methylation that affect gene expression.

Usage

EpiMix(
  methylation.data,
  gene.expression.data,
  sample.info,
  group.1,
  group.2,
  mode = "Regular",
  promoters = FALSE,
  correlation = "negative",
  met.platform = "HM450",
  genome = "hg38",
  cluster = FALSE,
  listOfGenes = NULL,
  filter = TRUE,
  raw.pvalue.threshold = 0.05,
  adjusted.pvalue.threshold = 0.05,
  numFlankingGenes = 20,
  roadmap.epigenome.groups = NULL,
  roadmap.epigenome.ids = NULL,
  chromatin.states = c("EnhA1", "EnhA2", "EnhG1", "EnhG2"),
  NoNormalMode = FALSE,
  cores = 1,
  MixtureModelResults = NULL,
  OutputRoot = "."
)

Arguments

methylation.data

Matrix of the DNA methylation data with CpGs in rows and samples in columns.

gene.expression.data

Matrix of the gene expression data with genes in rows and samples in columns.

sample.info

Dataframe that maps each sample to a study group. Should contain two columns: the first column (named 'primary') indicates the sample names, and the second column (named 'sample.type') indicates which study group each sample belongs to (e.g., “Cancer” vs. “Normal”, “Experiment” vs. “Control”). Sample names in the 'primary' column must coincide with the column names of methylation.data.

group.1

Character vector indicating the name(s) for the experiment group.

group.2

Character vector indicating the name(s) for the control group.

mode

Character string indicating the analytic mode used to model DNA methylation. Should be one of the following: 'Regular', 'Enhancer', 'miRNA' or 'lncRNA'. Default: 'Regular'. See details for more information.

promoters

Logical indicating whether to focus the analysis on CpGs associated with promoters (2000 bp upstream and 1000 bp downstream of the transcription start site). This parameter is only used in the Regular mode. Default: FALSE.

correlation

Character vector indicating the expected correlation between DNA methylation and gene expression. Can be either 'negative' or 'positive'. Default: 'negative'.

met.platform

Character string indicating the microarray type used to collect the DNA methylation data. The value should be one of 'HM27', 'HM450' or 'EPIC'. Default: 'HM450'.

genome

Character string indicating the genome build version to be used for CpG annotation. Should be either 'hg19' or 'hg38'. Default: 'hg38'.

cluster

Logical indicating whether to cluster CpG sites based on methylation levels using hierarchical clustering. Default: FALSE.

listOfGenes

Character vector used for filtering the genes to be evaluated.

filter

Logical indicating whether to use a linear regression filter to pre-filter for CpGs whose methylation correlates with gene expression. Used in the Regular mode. Default: TRUE.

raw.pvalue.threshold

Numeric value indicating the threshold of the raw P value for selecting the functional CpG-gene pairs. Default: 0.05.

adjusted.pvalue.threshold

Numeric value indicating the threshold of the adjusted P value for selecting the functional CpG-gene pairs. Default: 0.05.

numFlankingGenes

Numeric value indicating the number of flanking genes whose expression is to be evaluated for selecting the functional enhancers. Default: 20.

roadmap.epigenome.groups

(parameter used for the 'Enhancer' mode) Character vector indicating the tissue group(s) to be used for selecting the enhancers. See details for more information. Default: NULL.

roadmap.epigenome.ids

(parameter used for the 'Enhancer' mode) Character vector indicating the epigenome ID(s) to be used for selecting the enhancers. See details for more information. Default: NULL.

chromatin.states

(parameter used for the 'Enhancer' mode) Character vector indicating the chromatin states to be used for selecting the enhancers. To get the available chromatin states, please run the list.chromatin.states() function. Default: c('EnhA1', 'EnhA2', 'EnhG1', 'EnhG2').

NoNormalMode

Logical indicating if the methylation states found in the experiment group should be compared to the control group. Default: FALSE.

cores

Number of CPU cores to be used for computation. Default: 1.

MixtureModelResults

Pre-computed EpiMix results, used for generating the functional probe-gene pair matrix. Default: NULL.

OutputRoot

File path to store the EpiMix result object. Default: '.' (current directory).
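The input requirements described above (features in rows, samples in columns, and a sample.info dataframe with 'primary' and 'sample.type' columns) can be checked before calling EpiMix. The following is a minimal sketch in base R on made-up toy data; the probe IDs, gene names, sample names, and the use of setequal() to test that names "coincide" are illustrative assumptions, not part of the package.

```r
# Toy inputs (hypothetical values, for illustration only)
set.seed(1)
samples <- c("S1", "S2", "S3", "S4")

# CpGs in rows, samples in columns (beta values between 0 and 1)
methylation.data <- matrix(runif(12), nrow = 3,
                           dimnames = list(c("cg0001", "cg0002", "cg0003"), samples))

# Genes in rows, samples in columns
gene.expression.data <- matrix(rnorm(8), nrow = 2,
                               dimnames = list(c("GENE_A", "GENE_B"), samples))

# One row per sample: 'primary' holds sample names, 'sample.type' the study group
sample.info <- data.frame(primary = samples,
                          sample.type = c("Cancer", "Cancer", "Normal", "Normal"),
                          stringsAsFactors = FALSE)

# The two required columns must be present
stopifnot(all(c("primary", "sample.type") %in% colnames(sample.info)))

# Sample names in 'primary' must coincide with the methylation column names
stopifnot(setequal(sample.info$primary, colnames(methylation.data)))

# Samples per study group; these labels are what group.1 and group.2 refer to
group.sizes <- table(sample.info$sample.type)
print(group.sizes)
```

With inputs shaped like this, group.1 = 'Cancer' and group.2 = 'Normal' would name the experiment and control groups, as in the Examples section.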

Details

mode: EpiMix incorporates four alternative analytic modes for modeling DNA methylation: “Regular”, “Enhancer”, “miRNA” and “lncRNA”. Each mode targets DNA methylation analysis at a different class of genetic elements. The Regular mode models DNA methylation at proximal cis-regulatory elements of protein-coding genes. The Enhancer mode targets DNA methylation at distal enhancers. The miRNA and lncRNA modes focus on methylation of miRNA- and lncRNA-coding genes, respectively.

roadmap.epigenome.groups & roadmap.epigenome.ids:

Since enhancers are cell-type or tissue-type specific, EpiMix needs to know the reference tissues or cell types in order to select the proper enhancers. EpiMix takes its enhancers from the Roadmap Epigenomics project (Nature, PMID: 25693563), in which enhancers were identified by ChromHMM in over 100 tissue and cell types. Available epigenome groups (groups of related cell types) and epigenome IDs (individual cell types) can be obtained from the original publication (Nature, PMID: 25693563, figure 2). They can also be retrieved with the list.epigenomes() function. If both roadmap.epigenome.groups and roadmap.epigenome.ids are specified, EpiMix will select all the epigenomes from the combination of the inputs.
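As a hedged illustration of the "combination of the inputs" rule, read here as a set union, the sketch below merges the epigenomes requested by group with those requested by individual ID. The lookup table, its group names and IDs, and the select_epigenomes() helper are all hypothetical placeholders and not part of EpiMix; the real groups and IDs are those returned by list.epigenomes().

```r
# Hypothetical lookup table: which epigenome IDs belong to which group.
# Group names and IDs are placeholders, not real Roadmap annotations.
epigenome.table <- data.frame(
  group = c("GroupA", "GroupA", "GroupB", "GroupC"),
  id    = c("ID1", "ID2", "ID3", "ID4"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not an EpiMix function): union of the epigenomes
# requested by group and those requested by individual ID.
select_epigenomes <- function(groups = NULL, ids = NULL, lookup = epigenome.table) {
  from.groups <- lookup$id[lookup$group %in% groups]
  from.ids    <- lookup$id[lookup$id %in% ids]
  union(from.groups, from.ids)
}

# "ID2" is requested both through its group and by ID; it is selected once
selected <- select_epigenomes(groups = "GroupA", ids = c("ID2", "ID4"))
print(selected)
```

Under this reading, an epigenome that is named both through its group and by its ID is simply selected once.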

Value

The result from EpiMix is a list with the following components:

MethylationDrivers

CpG probes identified as differentially methylated by EpiMix.

NrComponents

The number of methylation states found for each driver probe.

MixtureStates

A list with the DM-values for each driver probe. Differential Methylation values (DM-values) are defined as the difference between the methylation mean of samples in one mixture component from the experiment group and the methylation mean in samples from the control group, for a given probe.

MethylationStates

Matrix with DM-values for all driver probes (rows) and all samples (columns).

Classifications

Matrix with integers indicating the mixture component to which each sample in the experiment group was assigned, for each probe.

Models

Beta mixture model parameters for each driver probe.

group.1

Sample names in group.1 (experiment group).

group.2

Sample names in group.2 (control group).

FunctionalPairs

Dataframe with the prevalence of differential methylation for each CpG probe in the sample population, and the fold change of mRNA expression and P values for each significant probe-gene pair.
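The DM-value definition given under MixtureStates can be made concrete with a small calculation. This is a sketch on invented beta values for a single probe, with hand-assigned mixture components; in EpiMix the component assignment comes from the fitted beta mixture model, which is not reproduced here.

```r
# Invented beta values for one probe (illustration only)
control.beta    <- c(0.20, 0.25, 0.15, 0.20)              # control group (group.2)
experiment.beta <- c(0.22, 0.18, 0.70, 0.80, 0.75, 0.21)  # experiment group (group.1)

# Hand-assigned mixture components for the experiment samples;
# EpiMix would obtain these from its beta mixture model.
component <- c(1, 1, 2, 2, 2, 1)

# DM-value per component: mean methylation of the component's samples
# in the experiment group minus the mean methylation of the control group
dm.values <- tapply(experiment.beta, component, mean) - mean(control.beta)
print(round(dm.values, 3))

# Per-sample DM-values, analogous to one row of the MethylationStates matrix
dm.per.sample <- unname(dm.values[as.character(component)])
print(round(dm.per.sample, 3))
```

In this toy probe, component 2 sits well above the control mean while component 1 is close to it, which is the kind of contrast a DM-value is meant to summarize.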

Examples


data(MET.data)
data(mRNA.data)
data(microRNA.data)
data(lncRNA.data)
data(LUAD.sample.annotation)

# Example #1: Regular mode
EpiMixResults <- EpiMix(methylation.data = MET.data,
                        gene.expression.data = mRNA.data,
                        sample.info = LUAD.sample.annotation,
                        group.1 = 'Cancer',
                        group.2 = 'Normal',
                        met.platform = 'HM450',
                        OutputRoot = tempdir())

# Example #2: Enhancer mode
EpiMixResults <- EpiMix(methylation.data = MET.data,
                        gene.expression.data = mRNA.data,
                        sample.info = LUAD.sample.annotation,
                        mode = 'Enhancer',
                        group.1 = 'Cancer',
                        group.2 = 'Normal',
                        met.platform = 'HM450',
                        roadmap.epigenome.ids = 'E096',
                        OutputRoot = tempdir())

# Example #3: miRNA mode
EpiMixResults <- EpiMix(methylation.data = MET.data,
                        gene.expression.data = microRNA.data,
                        sample.info = LUAD.sample.annotation,
                        mode = 'miRNA',
                        group.1 = 'Cancer',
                        group.2 = 'Normal',
                        met.platform = 'HM450',
                        OutputRoot = tempdir())

# Example #4: lncRNA mode
EpiMixResults <- EpiMix(methylation.data = MET.data,
                        gene.expression.data = lncRNA.data,
                        sample.info = LUAD.sample.annotation,
                        mode = 'lncRNA',
                        group.1 = 'Cancer',
                        group.2 = 'Normal',
                        met.platform = 'HM450',
                        OutputRoot = tempdir())


gevaertlab/EpiMix documentation built on July 20, 2023, 9:28 a.m.