In gabrielodom/DMRcomparePaper: Comparing Different Analytical Tools for Identifying Differentially Methylated Regions (DMRs) in Illumina Methylation Arrays

DMRcompare

This project contains the analysis scripts used in the manuscript "An Evaluation of Supervised Methods for Identifying Differentially Methylated Regions in Illumina Methylation Arrays" by Mallik et al. Reference documentation for all functions in this project is in /docs/DMRcompare.pdf.

I. Download Public Methylation Dataset GSE41169

File: /inst/1_downloader2.5.R
Input: GEO accession number, criteria for selecting methylation datasets.
Output: betaVals_mat, a beta value matrix for the selected methylation samples. This file has rows = CpG ids, columns = sample ids. An example file is at /data/betaVals_mat.csv.

II. Perform Adjacent Site Clustering to Obtain Clusters of Adjacent CpG Probes (A-clusters)

The A-clustering algorithm described in Sofer et al. (2011) was used to identify clusters of adjacent CpGs.

File: /inst/1_Aclust_data_import.R
Input:
1. betaVals_mat: a beta value matrix of all CpGs on the array
2. cpgLocation_df: an annotation file that gives the locations of CpGs. This file has rows = CpG ids, columns = chromosome, location. An example file is at /data/cpgLocation_df.csv.
Output: startEndCpG_df, a beta value matrix for clusters of CpGs. This file has rows = CpG ids, columns = cluster number, chr, start of cluster, end of cluster, sample ids. An example file is at /data/startEndCpG_df.csv.
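
To illustrate how a cluster table with chromosome, start, and end columns can be derived from a CpG location annotation, here is a minimal base R sketch. It is not the A-clustering algorithm: it groups probes by genomic distance only, whereas the published method also uses the correlation between neighboring probes. The function name `GroupAdjacentCpGs()`, the column names, and the 200 bp gap are all illustrative assumptions, not part of this project.

```r
# Toy annotation: one row per CpG, with chromosome and position.
# Column names and the 200 bp gap below are illustrative assumptions.
cpgLocation_df <- data.frame(
  cpg = c("cg01", "cg02", "cg03", "cg04", "cg05", "cg06"),
  chr = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
  pos = c(100, 150, 220, 5000, 300, 350),
  stringsAsFactors = FALSE
)

# Hypothetical helper: distance-only grouping (NOT the A-clustering method).
GroupAdjacentCpGs <- function(loc_df, maxGap = 200) {
  # Sort by chromosome, then position, so neighbours sit in adjacent rows.
  loc_df <- loc_df[order(loc_df$chr, loc_df$pos), ]
  n <- nrow(loc_df)
  # A new cluster starts at the first row, at a chromosome change,
  # or when the distance to the previous CpG exceeds maxGap.
  newChr <- c(TRUE, loc_df$chr[-1] != loc_df$chr[-n])
  bigGap <- c(TRUE, diff(loc_df$pos) > maxGap)
  loc_df$cluster <- cumsum(newChr | bigGap)
  loc_df
}

clustered_df <- GroupAdjacentCpGs(cpgLocation_df)

# One row per cluster: chromosome, start, end, and number of CpGs.
clusterSummary_df <- do.call(rbind, lapply(
  split(clustered_df, clustered_df$cluster),
  function(x) data.frame(
    cluster = x$cluster[1], chr = x$chr[1],
    start = min(x$pos), end = max(x$pos), nCpGs = nrow(x)
  )
))
print(clusterSummary_df)
```

For the real analysis, use the script listed above; this sketch only shows the shape of the resulting table.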

III. Simulation Study

There are three main steps in the simulation study. See /docs/DMRcompare.pdf for details of each function.

1. Simulate differentially methylated clusters of CpGs.
   - File: SimulateData() in script file R/2_simulatedata.R
   - Main input: betaVals_mat, startEndCpG_df (file that indicates clusters of CpGs), treatment effects to be added to the clusters
   - Main output: simulated beta value matrix, where treatment effects were added to 500 randomly selected clusters of CpGs
2. Apply DMR-finding methods to the simulated datasets.
   - Files:
     - RunBumphunter() in script file R/3_RunBumphunter.R
     - RunDMRcate() in script file R/3_RunDMRcate.R
     - RunProbeLasso() in script file R/3_RunProbeLasso.R
     - The Comb-p method was implemented in Python. The corresponding shell script is exec/run_combp_working1.sh
   - Main output: significant DMRs identified by each of the methods. These functions are called by three wrapper functions:
     - WriteBumphunterResults() in script file R/4_simulate_and_save_Bumphunter_results.R
     - WriteDMRcateResults() in script file R/4_simulate_and_save_DMRcate_results.R
     - WriteProbeLassoResults() in script file R/4_simulate_and_save_ProbeLasso_results.R
3. Summarize results of DMR-finding methods.
   - Files:
     - ProcessBumphunterResults() in script file R/5_read_and_summarize_Bumphunter_results.R
     - ProcessDMRcateResults() in script file R/5_read_and_summarize_DMRcate_results.R
     - ProcessProbeLassoResults() in script file R/5_read_and_summarize_ProbeLasso_results.R
     - ProcessCombpResults() in script file R/5_standardize_and_summarize_Comb-p_results.R
   - Main output: These functions compare the significant DMRs identified by each method, evaluate whether they overlap with the true positive clusters where treatment effects were added, and then compute summary statistics including TP, FP, TN, FN, power, precision, and median number of CpGs in significant DMRs.
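
The first step above, adding treatment effects to randomly selected clusters, can be sketched in base R as follows. This is a minimal illustration, not the implementation of SimulateData(): the helper `AddTreatmentEffect()`, its arguments, and the choice to add a constant shift on the beta scale to one group of samples are assumptions made for the example.

```r
set.seed(1)

# Toy beta matrix: 8 CpGs (rows) x 6 samples (columns).
betaVals_mat <- matrix(
  runif(48, min = 0.2, max = 0.6), nrow = 8,
  dimnames = list(paste0("cg", 1:8), paste0("s", 1:6))
)
# Hypothetical cluster membership for each CpG (4 clusters of 2 CpGs).
clusterOfCpG <- rep(1:4, each = 2)

# Hypothetical helper: shift the treated samples in nTrue random clusters.
AddTreatmentEffect <- function(beta_mat, clusters, nTrue, delta, treatedCols) {
  # Pick the clusters that will carry a true effect.
  trueClusters <- sample(unique(clusters), nTrue)
  rowsToShift <- clusters %in% trueClusters
  # Shift the treated samples only, and keep beta values inside [0, 1].
  shifted <- beta_mat[rowsToShift, treatedCols, drop = FALSE] + delta
  beta_mat[rowsToShift, treatedCols] <- pmin(pmax(shifted, 0), 1)
  list(beta = beta_mat, trueClusters = sort(trueClusters))
}

sim <- AddTreatmentEffect(
  betaVals_mat, clusterOfCpG,
  nTrue = 2, delta = 0.1, treatedCols = 4:6
)
print(sim$trueClusters)
```

Returning the selected clusters alongside the matrix keeps the ground truth available for the later summary step, where calls are compared against the clusters that received an effect.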

IV. Table of Results

The following report tabulates True Positives (TP), False Positives (FP), False Negatives (FN), Power, Precision, Area under Precision-Recall curve (AuPR), Matthews' correlation coefficient (MCC), F1 Scores (F1) and Elapsed Time (in seconds) for each DMR detection tool on the simulated datasets:

docs/Method_compare_report_20180705.Rmd
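
As a reminder of how the count-based statistics relate to each other, the sketch below computes them from cluster-level truth and call labels using the standard textbook definitions. The helper `SummarizeCalls()` and the toy vectors are hypothetical; the exact definitions used in the report are documented in /docs/DMRcompare.pdf and may differ in detail.

```r
# Hypothetical cluster-level labels: for truth, TRUE means a treatment effect
# was added; for called, TRUE means a significant DMR overlapped the cluster.
truth  <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
called <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)

# Hypothetical helper using standard definitions of each statistic.
SummarizeCalls <- function(truth, called) {
  # as.numeric() avoids integer overflow in the MCC products for large counts.
  TP <- as.numeric(sum(truth & called))
  FP <- as.numeric(sum(!truth & called))
  FN <- as.numeric(sum(truth & !called))
  TN <- as.numeric(sum(!truth & !called))
  power     <- TP / (TP + FN)   # also called recall or sensitivity
  precision <- TP / (TP + FP)
  F1  <- 2 * precision * power / (precision + power)
  MCC <- (TP * TN - FP * FN) /
    sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
  c(TP = TP, FP = FP, FN = FN, TN = TN,
    power = power, precision = precision, F1 = F1, MCC = MCC)
}

metrics <- SummarizeCalls(truth, called)
print(round(metrics, 3))
```

Note that precision, F1, and MCC are undefined (NaN) when a method makes no calls at all, so a production version would need to handle that case explicitly.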

V. Plots of Precision and Power over Different Effect Sizes

docs/Method_compare_graphs_20180709.Rmd

VI. Plot of Precision and Recall Curves

- BuildPRcurve() in script file R/6_Build_Precision-Recall_Curve_List.R
- PlotPRCurve() in script file R/6_Plot_Precision-Recall_Curves.R
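
For readers unfamiliar with precision-recall curves, the sketch below builds the curve points from per-cluster scores and approximates the area under the curve with the trapezoid rule. It is a generic illustration, not the code in BuildPRcurve() or PlotPRCurve(): the helpers `BuildPRPoints()` and `TrapezoidAUC()`, the toy scores, and the use of trapezoidal integration are assumptions, and tied scores are not given special treatment.

```r
# Hypothetical per-cluster scores (for example, -log10 p-values) and truth.
score <- c(9.1, 7.4, 6.8, 5.0, 4.2, 3.3, 2.9, 1.5)
truth <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

# Hypothetical helper: one (recall, precision) point per score cutoff.
BuildPRPoints <- function(score, truth) {
  ord <- order(score, decreasing = TRUE)
  truth <- truth[ord]
  TP <- cumsum(truth)        # true positives among the top-k calls
  k  <- seq_along(truth)     # number of calls made at each cutoff
  data.frame(
    cutoff    = score[ord],
    recall    = TP / sum(truth),
    precision = TP / k
  )
}

# Trapezoid rule over the observed points. The area starts at the first
# observed recall value, not at recall = 0.
TrapezoidAUC <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

pr_df <- BuildPRPoints(score, truth)
aupr  <- TrapezoidAUC(pr_df$recall, pr_df$precision)
print(pr_df)
```

The points in `pr_df` can be drawn with `plot(pr_df$recall, pr_df$precision, type = "b")`. Linear interpolation between points is known to be somewhat optimistic in precision-recall space, so treat the trapezoid value as a rough approximation.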

VII. Plot of DMR Sizes

docs/Method_compare_graphs_20180709.Rmd

VIII. Venn Diagrams for Overlap of DMRs Identified by Each Method

PlotOverlaps() in script file R/6_Plot_DMR-Overlaps_Venn.R
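
The counts behind a two-set Venn diagram of DMRs can be obtained by testing genomic intervals for overlap. The base R sketch below shows the idea for two methods; it is not the implementation of PlotOverlaps(). The helper `OverlapsAny()`, the column names, and the toy regions are hypothetical.

```r
# Hypothetical significant DMRs from two methods.
dmrA_df <- data.frame(
  chr = c("chr1", "chr1", "chr2"),
  start = c(100, 900, 50), end = c(300, 1200, 400),
  stringsAsFactors = FALSE
)
dmrB_df <- data.frame(
  chr = c("chr1", "chr2", "chr3"),
  start = c(250, 500, 10), end = c(450, 700, 90),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each query region, does any subject region on the
# same chromosome share at least one base with it?
OverlapsAny <- function(query_df, subject_df) {
  vapply(seq_len(nrow(query_df)), function(i) {
    any(subject_df$chr == query_df$chr[i] &
        subject_df$start <= query_df$end[i] &
        subject_df$end >= query_df$start[i])
  }, logical(1))
}

aInB <- OverlapsAny(dmrA_df, dmrB_df)
bInA <- OverlapsAny(dmrB_df, dmrA_df)

# "shared" counts regions of A that overlap B; the count from B's side can
# differ when one region overlaps several regions in the other set.
vennCounts <- c(onlyA = sum(!aInB), onlyB = sum(!bInA), shared = sum(aInB))
print(vennCounts)
```

For genome-scale region sets, an interval-tree based tool such as the Bioconductor GenomicRanges package is a more efficient choice than this pairwise comparison.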

gabrielodom/DMRcomparePaper documentation built on May 25, 2019, 2:52 a.m.


I. Download Public Methylation Dataset GSE41169

II. Perform Adjacent Site Clustering to Obtain Clusters of Adjacent CpG Probes (A-clusters)

III. Simulation Study

IV. Table of Results

V. Plots of Precision and Power over Different Effect Sizes

VI. Plot of Precision and Recall Curves

VII. Plot of DMR Sizes

VIII. Venn Diagrams for Overlap of DMRs Identified by Each Method
