For a detailed manual for this section please access these links:
Expected input is a CSV file with the following pattern:
The possible thresholds controls are:
The options Mean DNA methylation difference threshold and Log FC threshold are used to circle genes which pass the cut-offs previously defined (eg. mean methylation or FDR).
The possible highlight controls are:
This sub-menu will help the user to perform an integrative analysis of DNA methylation and Gene expression using the R/Bioconductor ELMER package [@yao2015inferring].
Recently, many studies suggest that enhancers play a major role as regulators of cell-specific phenotypes leading to alteration in transcriptomes related to diseases [@giorgio2015large; @groschel2014single; @sur2012mice; @yao2015demystifying]. In order to investigate regulatory enhancers that can be located at long distances upstream or downstream of target genes Bioconductor offer the Enhancer Linking by Methylation/Expression Relationship (ELMER) package. This package is designed to combine DNA methylation and gene expression data from human tissues to infer multi-level cis-regulatory networks. It uses DNA methylation to identify enhancers and correlates their state with the expression of nearby genes to identify one or more transcriptional targets. Transcription factor (TF) binding site analysis of enhancers is coupled with expression analysis of all TFs to infer upstream regulators. This package can be easily applied to TCGA public available cancer datasets and custom DNA methylation and gene expression datasets [@yao2015inferring].
ELMER analysis have 5 main steps:
Identify distal probes on HM450K or EPIC.
Identification of distal probes with significant differential DNA methylation (i.e. DMCs) in tumor vs. normal samples.
Identification of putative target gene(s) for differentially methylated distal probes.
Identification of enriched motifs within a set of probes in significant probe-gene pairs.
Identification of master regulator Transcription Factors (TF) for each enriched motif.
The ELMER input is a mee object that contains a DNA methylation matrix, a gene expression matrix, a probe information GRanges, the gene information GRanges and a data frame summarizing the data. It should be highlighted that samples without both the gene expression and DNA methylation data will be removed from the mee object.
To perform ELMER analyses, we need to populate a MultiAssayExperiment with a DNA methylation matrix or SummarizedExperiment object from HM450K or EPIC platform; a gene expression matrix or SummarizedExperiment object for the same samples; a matrix mapping DNA methylation samples to gene expression samples; and a matrix with sample metadata (i.e. clinical data, molecular subtype, etc.). If TCGA data are used, the last two matrices will be automatically generated. If using non-TCGA data, the matrix with sample metadata should be provided with at least a column with a patient identifier and another one identifying its group which will be used for analysis, if samples in the methylation and expression matrices are not ordered and with same names, a matrix mapping for each patient identifier their DNA methylation samples and their gene expression samples should be provided to the createMAE function. Based on the genome of reference selected, metadata for the DNA methylation probes, such as genomic coordinates, will be added from http://zwdzwd.github.io/InfiniumAnnotation [@zhou2016comprehensive]; and metadata for gene expression and annotation is added from Ensembl database [@yates2015ensembl] using biomaRt [@durinck2009mapping].
The steps required to create this object and perform ELMER analysis is described in this tutorial.
After preparing the data into a MAE object, we will execute the five ELMER steps for the hypo (distal probes hypomethylated in the group2) direction.
This box has all the available options for ELMER functions. Please see the ELMER vignette.
Also, a description of the data used by ELMER (such as the distal enhancer probes) is found in the ELMER.data vignette.
ELMER Identifies the enriched motifs for the distal probes which are significantly differentially methylated and linked to a putative target gene, it will plot the Odds Ratio (x axis) for each motif found. These motifs by default have a minimum incidence of 10 probes (that means at least 10 probes were associated with the motif) in the given probes set and the smallest lower boundary of 95% confidence interval for Odds Ratio of 1.1.
After finding the enriched motifs, ELMER identifies regulatory transcription factors (TFs) whose expression is associated with DNA methylation at motifs. ELMER automatically creates a TF ranking plot for each enriched motif. This plot shows the TF ranking plots based on the association score $(-log(P value))$ between TF expression and DNA methylation of the motif. We can see in Figure below that the top three TFs that are associated with a motif found.
In case, it identifies regulatory transcription factors (TFs), an object with the prefix ELMER_results will be created with the necessary data to visualize the results.
Select the R object (rda) file with ELMER results created in the analysis step (the one with prefix ELMER_results)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.