peakDetection: Peak calling method, i. e. identification of genomic regions...

Description Usage Arguments Details Value Note Author(s) References See Also Examples

View source: R/peakDetection.R

Description

bPeaks uses a sliding window to scan the genomic sequence. Four criterion define interesting regions: 1) a high number of reads in the IP sample (T1 = IPthreshold); 2) a low number of reads in the control sample (T2 = controlThreshold); 3) a high value of log(IP/control) (T3 = ratioThreshold) and 4) a good sequencing coverage (IP and control samples) (T4 = averageThreshold). Parameters for peak detection, therefore, rely on four threshold values (T1, T2, T3 and T4). Graphical outputs and BED files are provided allowing the user to rapidly assess the relevance of the chosen parameters.

Usage

1
2
3
4
5
peakDetection(IPdata, controlData, chrName, windowSize = 150, windowOverlap = 50,
              outputName = "bPeaks_results", 
              baseLineIP = NULL, baseLineControl = NULL, 
              IPthreshold = 6, controlThreshold = 4, ratioThreshold = 2,
              averageThreshold = 0.7, peakDrawing = TRUE)

Arguments

IPdata

A dataframe with sequencing results of the IP sample. This dataframe has three columns (chromosome, position, number of sequences) and should have been created with the dataReading function

controlData

A dataframe with sequencing results of the control sample. This dataframe has three columns (chromosome, position, number of sequences) and should have been created with the dataReading function

chrName

Name of the chromosome to be scanned with the sliding windows (to compare IP and control signals and detect interesting regions)

windowSize

Size of the sliding window to scan chromosomes

windowOverlap

Size of the overlap between two successive windows

outputName

Name for output files created during bPeaks procedure

baseLineIP

Value of the mean genome-wide read depth (IP sample). This value is calculated using the baseLineCalc function

baseLineControl

Value of the mean genome-wide read depth (control sample). This value is calculated using the baseLineCalc function

IPthreshold

Threshold T1. Threshold to consider IP signal as sufficiently important to be interesting. Note that these threshold is a multiplicative parameter that will be combined with the calculated "baseLineIP" (see before) value. As an illustration, if the IPthreshold = 6, it means that to be selected, the IP signal should be GREATER than 6 * baseLineIP

controlThreshold

Threshold T2. Threshold to consider control signal as sufficiently low to be interesting. Note that these threshold is a multiplicative parameter that will be combined with the calculated "baseLineControl" (see before) value. As an illustration, if the controlThreshold = 2, it means that to be selected, the control signal should remain LOWER than 2 * baseLineControl

ratioThreshold

Threshold T3. Threshold to consider log2(IP/control) values as sufficiently important to be interesting

averageThreshold

Threshold T4. Threshold to consider (log2(IP) + log2(control)) / 2 as sufficiently important to be interesting. These parameter is important to ensure that the analyzed genomic region has enough sequencing coverage to be reliable. These threshold should be between [0, 1] and refers to the quantile value of the global distribution observed with the analyzed chromosome

peakDrawing

TRUE or FLASE. If TRUE, the function peakDrawing is called and PDF files with graphical representations of detected peaks are created.

Details

Detailed description of the bPeaks procedure together with tutorials can be found online:

http://bpeaks.gene-networks.net/.

Value

A matrix with genomic positions of the detected peaks. Summaries of parameter calculations and peak detection criteria are shown in PDF files (saved in the working directory).

Note

Detailed information and tutorials can be found online http://bpeaks.gene-networks.net/. Don't hesitate to contact us for further discussions.

Author(s)

Gaelle LELANDAIS

References

http://bpeaks.gene-networks.net/

See Also

dataReading baseLineCalc peakDrawing bPeaksAnalysis

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
# get library
library(bPeaks)

# get PDR1 data
data(dataPDR1)

# combine IP and control data
allData = cbind(dataPDR1$IPdata, dataPDR1$controlData)
colnames(allData) = c("chr", "pos", "IPsignal", "chr", "pos", "controlSignal")

print("**********************************************")
# calculate baseline IP and control values
lineIP    = baseLineCalc(allData$IPsignal)
print(paste("Baseline coverage value in IP sample : ", round(lineIP, 3)))
lineControl = baseLineCalc(allData$controlSignal)
print(paste("Baseline coverage value in control sample : ", round(lineControl, 3)))
print("**********************************************")
print("")

# get list of chromosomes
chromNames = unique(allData[,1])

# start peak detection on the first chromosome
print("**********************************************")
print(paste("Starting analysis of chromosome ", chromNames[1]))

# information for one chromosome
subData = allData[allData[,1] == chromNames[1],]

# only 10 kb are analyzed here (as an illustration)
vecIP      = subData[40000:50000,3]
vecControl = subData[40000:50000,6]

# smooth of the data
smoothedIP    = dataSmoothing(vecData = vecIP, widthValue = 20) 
smoothedControl = dataSmoothing(vecData = vecControl, widthValue = 20) 

# peak detection
detectedPeaks = peakDetection(IPdata = smoothedIP, controlData = smoothedControl, 
                chrName = as.character(chromNames[1]), 
                windowSize = 150, windowOverlap = 50,
                outputName = paste("bPeaks_example_", chromNames[1], sep = ""), 
                baseLineIP = lineIP, baseLineControl = lineControl,
                IPthreshold = 4, controlThreshold = 2,
                ratioThreshold = 1, averageThreshold = 0.5,
                peakDrawing = TRUE)

# print detected genomic positions
print(detectedPeaks)

bPeaks documentation built on May 1, 2019, 10:14 p.m.