README.md

InfiniumDiffMetMotR

Version: 1.5.5

Description: This is an R package for analyzing transcription factor binding motif enrichment at differentially methylated regions, using Infinium Methylation BeadChip (Illumina) data.

Last Update: 2021-03-31

Depends: R (>= 2.10), Biobase (>= 2.5.5)

Author: Takahiro Suzuki

Updated by: takahiro.suzuki.aa@riken.jp

Install

1. Install Bioconductor packages

pkgs <- c("BSgenome.Hsapiens.UCSC.hg19", 
   "FDb.InfiniumMethylation.hg19", 
   "lumi", 
   "wateRmelon", 
   "IlluminaHumanMethylationEPICanno.ilm10b2.hg19",
   "IlluminaHumanMethylationEPICmanifest",
   "minfi",
   "methylumi",
   "GEOquery")


if (!requireNamespace("BiocManager", quietly = TRUE))
   install.packages("BiocManager")
BiocManager::install(pkgs, update = FALSE)

2. Install devtools

devtools is needed to install packages from GitHub.

install.packages("devtools")

3. Install InfiniumDiffMetMotR from GitHub

devtools::install_github("takahirosuzuki1980/InfiniumDiffMetMotR")

Examples of usage

TF binding Motif overrepresentation analysis

1. Load InfiniumDiffMetMotR

library("InfiniumDiffMetMotR")

2. Normalization

- raw signal text data (generated by GenomeStudio)

selDataMatrix <- lumiMethyNorm(fileName = "TableControl.txt", inputtype = "signal", sample_names = c("sample1", "sample2"))

- idat files

selDataMatrix <- lumiMethyNorm(idatpath=getwd(), inputtype = "idat", sample_names = c("sample1", "sample2"))

lumiMethyNorm generates three outputs: a Process_Result folder (reports of the normalization), processed_Mval.txt (a matrix of M-values), and sel_processed_Mval.txt (a matrix of M-values that excludes probes failing the detection p-value filter; the default cutoff is 0.01).
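To make the relationship between the two M-value files concrete, here is a minimal base R sketch on toy data. It assumes the conventional definition M = log2(beta / (1 - beta)) and assumes that a probe is dropped when its detection p-value exceeds the cutoff in any sample. The object names (`beta`, `detP`, `mval`, `sel_mval`) are hypothetical, and the internal steps of lumiMethyNorm may differ.

```r
# Toy beta values (methylation fraction) for 4 probes x 2 samples
beta <- matrix(c(0.10, 0.90, 0.50, 0.80,
                 0.20, 0.85, 0.50, 0.30),
               ncol = 2,
               dimnames = list(paste0("cg", 1:4), c("sample1", "sample2")))

# Toy detection p-values; cg3 fails in sample2
detP <- matrix(c(0.001, 0.000, 0.002, 0.000,
                 0.000, 0.003, 0.200, 0.001),
               ncol = 2, dimnames = dimnames(beta))

# Conventional M-value: logit of beta on a log2 scale
mval <- log2(beta / (1 - beta))

# Assumed filter: keep probes whose detection p-value is <= cutoff in every sample
cutoff <- 0.01
keep <- apply(detP <= cutoff, 1, all)
sel_mval <- mval[keep, , drop = FALSE]
```

Here `mval` plays the role of processed_Mval.txt and `sel_mval` the role of sel_processed_Mval.txt.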

- idat files from GEO

selDataMatrix <- getIdat2M(GEOid = "GSE100825", version = "EPIC", sampleNames=FALSE)

For a GEO entry, the getIdat2M function downloads the idat files directly and performs normalization and M-value computation from the GEO accession ID alone. If idat files from multiple array versions are registered in a GEO entry, you can choose which version(s) to analyze. If you choose multiple versions, the output object is a list of M-value data.frames.
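Because the return value is either a single data.frame or a list of data.frames, downstream code may want to handle both cases the same way. The following is a minimal sketch on toy objects; `as_mval_list` is a hypothetical helper, not part of the package, and the list element names ("450k", "EPIC") are assumed for illustration only.

```r
# Hypothetical helper: wrap a single data.frame in a list so both cases iterate the same way.
# The data.frame check comes first because a data.frame is itself a list.
as_mval_list <- function(x) {
  if (is.data.frame(x)) list(x) else x
}

# Toy stand-ins for the two possible shapes of the getIdat2M output
single <- data.frame(s1 = c(1.2, -0.4), s2 = c(0.8, -1.1),
                     row.names = c("cg1", "cg2"))
multi  <- list(`450k` = single, EPIC = single[1, , drop = FALSE])

# Number of probes per version, regardless of the input shape
n_single <- sapply(as_mval_list(single), nrow)
n_multi  <- sapply(as_mval_list(multi), nrow)
```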

3. Motif database construction

- Example 1: JASPAR_CORE, Hsapiens and Mmusculus

library("MotifDb")
targetDB <- "JASPAR_CORE"
targetORG <- c("Hsapiens", "Mmusculus")
motifDB <- query(MotifDb, targetDB)        # extract the "JASPAR_CORE" motifs
motifDB <- c(query(motifDB,targetORG[1]),query(motifDB,targetORG[2]))        # keep motifs of "Hsapiens" and "Mmusculus"

If you want to analyze a specific motif, select it (e.g. SPI1).

targetTF <- "SPI1"
motifDB2 <- query(motifDB,targetTF)       #Extraction of motifs for target TF(s)

Finally, convert the motif collection (motifDB, or motifDB2 if you selected a specific TF) to list format.

motifDBList <- as.list(motifDB)

- Example 2: IMAGE motif database

motifDBList <- IMAGE_PWMlist

If you want to analyze a specific motif, select it (e.g. SPI1).

targetTF <- "SPI1"
motifDBList <- IMAGE_PWMlist[grep(targetTF, names(IMAGE_PWMlist))]       #Extraction of motifs for target TF(s)

The motif list should be a list of PWMs in the following format:

  1 2 3         4         5         6          7          8          9
A 0 0 0 0.1189189 0.1027027 0.2972973 0.28648649 0.10270270 0.04864865
C 0 1 1 0.3837838 0.3081081 0.2378378 0.16216216 0.08648649 0.42162162
G 1 0 0 0.2486486 0.3297297 0.3621622 0.49189189 0.74054054 0.42702703
T 0 0 0 0.2486486 0.2594595 0.1027027 0.05945946 0.07027027 0.10270270
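As a sketch of how such a list could be assembled by hand, the following base R code builds one PWM from a toy count matrix and wraps it in a named list. The motif name `toy_motif`, the object name `myMotifList`, and the counts are made up for illustration. The layout follows the format shown above: rows named A, C, G, T, one column per motif position, and each column summing to 1.

```r
# Toy nucleotide counts: rows are A, C, G, T; columns are motif positions.
# matrix() fills column by column, so each group of four values is one position.
counts <- matrix(c( 0,  0, 10,  0,
                    0, 10,  0,  0,
                    2,  4,  3,  1),
                 nrow = 4,
                 dimnames = list(c("A", "C", "G", "T"), as.character(1:3)))

# Convert counts to per-position frequencies so that every column sums to 1
pwm <- sweep(counts, 2, colSums(counts), "/")

# A motif list is a named list of such matrices
myMotifList <- list(toy_motif = pwm)
```

A list built this way could then be passed wherever motifDBList is expected, assuming the package requires no further attributes on the matrices.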

4. Screening of enriched motifs

infile accepts either the file name "sel_processed_Mval.txt" or the selDataMatrix object; motifDBList is a motif list such as the motifDBList built in step 3.

MotScr(infile = selDataMatrix, motifDBList = motifDBList, cutoff = 2, p.cutoff = 0.05, outname="screening_result", ControlColnum=c(1,2), TreatmentColnum=c(3,4), MethylDemethyl="Demethyl", sampling=FALSE, version = "850")

When each group contains multiple samples, differentially methylated probes are identified using both Welch's t-test and the M-value difference. For a single-sample comparison, only the M-value difference (delta M) is used.
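The two criteria can be illustrated with a small base R sketch on simulated M-values. This is not the package's implementation: the direction of the difference (treatment minus control), the way the two criteria are combined, and all object names are assumptions made for illustration. `cutoff` and `p.cutoff` mirror the MotScr arguments, and four samples per group are used here so that the t-test is stable on toy data.

```r
set.seed(1)
# Simulated M-values: 100 probes, columns 1-4 control, columns 5-8 treatment
m <- matrix(rnorm(800, sd = 0.3), nrow = 100,
            dimnames = list(paste0("cg", 1:100),
                            c(paste0("c", 1:4), paste0("t", 1:4))))
m[1:5, 5:8] <- m[1:5, 5:8] - 4        # make the first 5 probes demethylated in treatment

ControlColnum   <- 1:4
TreatmentColnum <- 5:8
cutoff   <- 2
p.cutoff <- 0.05

# Delta M: difference of group means (treatment minus control; assumed direction)
deltaM <- rowMeans(m[, TreatmentColnum]) - rowMeans(m[, ControlColnum])

# Welch's t-test per probe (t.test uses var.equal = FALSE by default)
pval <- apply(m, 1, function(x) t.test(x[TreatmentColnum], x[ControlColnum])$p.value)

# Demethylated probes: large negative delta M and a significant test
demethylated <- names(which(deltaM <= -cutoff & pval < p.cutoff))
```

With a single sample per group no t-test is possible, which is why only the delta M criterion would apply in that case.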

If you want to extract the motif of interest (e.g. SPI1):

motif_names <- "SPI1"
motif <- motifDBList[grep(motif_names,names(motifDBList))]
MotScr(infile="sel_processed_Mval.txt", motifDBList = motif, cutoff = 2, p.cutoff = 0.001, outname="screening_result", ControlColnum=c(1,2), TreatmentColnum=c(3,4), MethylDemethyl="Demethyl", sampling=FALSE, version = "850")
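Note that `grep` matches substrings, so a short TF name can also pick up longer names that contain it. A sketch of a stricter pattern follows. The motif names below are made up to resemble MotifDb-style names, and the names in your own list may be formatted differently.

```r
# Made-up motif names for illustration
motif_names_all <- c("Hsapiens-JASPAR_CORE-SP1-MA0079.1",
                     "Hsapiens-JASPAR_CORE-SP100-XX0000.1",
                     "Hsapiens-JASPAR_CORE-SPI1-MA0080.1")

# Plain substring match: "SP1" also hits "SP100"
loose <- grep("SP1", motif_names_all, value = TRUE)

# Word-boundary match: only names where "SP1" stands alone between separators.
# Caveat: "_" counts as a word character, so "\\b" does not split on underscores.
strict <- grep("\\bSP1\\b", motif_names_all, value = TRUE, perl = TRUE)
```

Checking `names(motifDBList)` after subsetting is a cheap way to confirm that only the intended motifs were selected.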

5. Output files

- [outname]_DMP_position.txt: DMP positions
- [outname]_mot_analysis_result.txt: summary table
- [outname]_plot.pdf: all histograms, enrichment score plots, fold-change plots, and p-value plots
- [outname]_result.RData: R data file
- [outname]_sig_plots directory, containing [motif_name].pdf: enrichment score plots of significantly enriched motifs

Scatter plot

1. Read the M-value data

infile <- "sel_processed_Mval.txt"
selDataMatrix <- read.table(infile)

2. Labels of the plot

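ControlColnum <- 1      # control column; matches control_data in the plot call below
TreatmentColnum <- 8    # treatment column; matches treatment_data in the plot call below
main <- paste(colnames(selDataMatrix)[ControlColnum]," vs. ", colnames(selDataMatrix)[TreatmentColnum], sep="")
xlab <- paste("Control: ", colnames(selDataMatrix)[ControlColnum], sep="")
ylab <- paste("Treatment: ", colnames(selDataMatrix)[TreatmentColnum], sep="")
pdf("Scatter_Plot.pdf")    # open the PDF device before plotting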

3. Plot

scatterPlot(treatment_data=selDataMatrix[,8], control_data=selDataMatrix[,1], main=main, xlab=xlab, ylab=ylab, cutoff=2)
dev.off()
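For readers who want to see what such a plot conveys without the package, here is a rough base R equivalent on simulated data. It assumes that `cutoff` marks probes whose M-value difference between treatment and control is at least that large in absolute value. How scatterPlot actually uses or draws the cutoff is not described here, so the guide lines, colours, and output file name are illustrative only.

```r
set.seed(1)
control   <- rnorm(1000, sd = 2)                 # simulated control M-values
treatment <- control + rnorm(1000, sd = 0.3)     # treatment tracks control closely
treatment[1:20] <- treatment[1:20] - 4           # 20 probes lose methylation

cutoff  <- 2
changed <- abs(treatment - control) >= cutoff    # assumed meaning of the cutoff

out_file <- file.path(tempdir(), "Scatter_Plot_sketch.pdf")
pdf(out_file)
plot(control, treatment, pch = 16, cex = 0.4,
     col = ifelse(changed, "red", "grey50"),
     xlab = "Control", ylab = "Treatment", main = "Control vs. Treatment")
abline(a =  cutoff, b = 1, lty = 2)              # treatment = control + cutoff
abline(a = -cutoff, b = 1, lty = 2)              # treatment = control - cutoff
dev.off()
```

Points outside the two dashed lines are the probes that would pass a delta M filter of the same size.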


takahirosuzuki1980/InfiniumDiffMetMotR documentation built on March 31, 2021, 8:41 a.m.