Call broad peaks using fitted MOSAiCS-HMM model

Share:

Description

Call broad peaks using MosaicsHMM class object, which is a fitted MOSAiCS-HMM model.

Usage

1
2
3
4
5
mosaicsPeakHMM( object, ... )
## S4 method for signature 'MosaicsHMM'
mosaicsPeakHMM( object, FDR=0.05, decoding="posterior",
    binsize=NA, maxgap=0, minsize=0, thres=0, 
    parallel=FALSE, nCore=8 )

Arguments

object

Object of class MosaicsHMM, a fitted MOSAiCS model obtained using function mosaicsFitHMM.

FDR

False discovery rate. Default is 0.05. Not relevant when decoding="viterbi".

decoding

Approach to determine the undelying state. Possible values are "viterbi" (Viterbi algorithm) and "posterior" (posterior decoding). Default is "posterior".

binsize

Size of each bin. Value should be positive integer. If binsize=NA, mosaicsPeakHMM function calcuates the value from data. Default is NA.

maxgap

Initial nearby peaks are merged if the distance (in bp) between them is less than maxgap. Default is 0.

minsize

An initial peak is removed if its width is narrower than minsize. Default is 0.

thres

A bin within initial peak is removed if its ChIP tag counts are less than thres. Default is 0.

parallel

Utilize multiple CPUs for parallel computing using "parallel" package? Possible values are TRUE (utilize multiple CPUs) or FALSE (do not utilize multiple CPUs). Default is FALSE (do not utilize multiple CPUs).

nCore

Number of CPUs when parallel computing is utilized.

...

Other parameters to be passed through to generic mosaicsHMM.

Details

mosaicsFitHMM and mosaicsPeakHMM are developed to identify broad peaks such as histone modifications, using Hidden Markov Model (HMM) approach, as proposed in Chung et al. (2014). If you are interested in identifying narrow peaks such as transcription factor binding sites, please use mosaicsPeak instead of mosaicsFitHMM and mosaicsPeakHMM.

maxgap, minsize, and thres are for refining initial peaks called using specified decoding (and FDR if decoding="posterior"). If you use a bin size shorter than the average fragment length of the experiment, we recommend to set maxgap to the average fragment length and minsize to the bin size. If you set the bin size to the average fragment length or if bin size is larger than the average fragment length, set maxgap to the average fragment length and minsize to a value smaller than the average fragment length. See the vignette for further details.

Parallel computing can be utilized for faster computing if parallel=TRUE and parallel package is loaded. nCore determines number of CPUs used for parallel computing.

Value

Construct MosaicsPeak class object.

Author(s)

Dongjun Chung, Pei Fen Kuan, Rene Welch, Sunduz Keles

References

Kuan, PF, D Chung, G Pan, JA Thomson, R Stewart, and S Keles (2011), "A Statistical Framework for the Analysis of ChIP-Seq Data", Journal of the American Statistical Association, Vol. 106, pp. 891-903.

Chung, D, Zhang Q, and Keles S (2014), "MOSAiCS-HMM: A model-based approach for detecting regions of histone modifications from ChIP-seq data", Datta S and Nettleton D (eds.), Statistical Analysis of Next Generation Sequencing Data, Springer.

See Also

mosaicsFitHMM, extractReads, findSummit, adjustBoundary, filterPeak, MosaicsHMM, MosaicsPeak.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
## Not run: 
library(mosaicsExample)

constructBins( infile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878H3k4me3StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="./", 
    byChr=FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr=NULL, 
    PET=FALSE, fragLen=200, binSize=200, capping=0 )
constructBins( infile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878ControlStdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="./", 
    byChr=FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr=NULL, 
    PET=FALSE, fragLen=200, binSize=200, capping=0 )

binHM <- readBins( type=c("chip","input"),
    fileName=c( "./wgEncodeBroadHistoneGm12878H3k4me3StdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt",
    "./wgEncodeBroadHistoneGm12878ControlStdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt" ) )
binHM
plot(binHM)
plot( binHM, plotType="input" )

fitHM <- mosaicsFit( binHM, analysisType="IO", bgEst="rMOM" )
fitHM
plot(fitHM)

hmmHM <- mosaicsFitHMM( fitHM, signalModel = "2S", 
  init="mosaics", init.FDR = 0.05, parallel=TRUE, nCore=8 )
hmmHM
plot(hmmHM)

peakHM <- mosaicsPeakHMM( hmmHM, FDR = 0.05, decoding="posterior",
  thres=10, parallel=TRUE, nCore=8 )

peakHM <- extractReads( peakHM,
  chipFile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878H3k4me3StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"),
  chipFileFormat="bam", chipPET=FALSE, chipFragLen=200,
  controlFile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878ControlStdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
  controlFileFormat="bam", controlPET=FALSE, controlFragLen=200, parallel=TRUE, nCore=8 )
peakHM

peakHM <- findSummit( peakHM, parallel=TRUE, nCore=8 )
head(print(peakHM))
plot( peakHM, filename="./peakplot_HM.pdf" )

peakHM <- adjustBoundary( peakHM, parallel=TRUE, nCore=8 )
peakHM
head(print(peakHM))

peakHM <- filterPeak( peakHM, parallel=TRUE, nCore=8 )
peakHM
head(print(peakHM))

export( peakHM, type = "txt", filename = "./peakHM.txt" )
export( peakHM, type = "bed", filename = "./peakHM.bed" )
export( peakHM, type = "gff", filename = "./peakHM.gff" )
export( peakHM, type = "narrowPeak", filename = "./peakHM.narrowPeak" )
export( peakHM, type = "broadPeak", filename = "./peakHM.broadPeak" )

## End(Not run)