Class "MosaicsPeak"

Description

This class represents peak calling results.

Objects from the Class

Objects can be created by calls of the form new("MosaicsPeak", ...).

Slots

peakList:

Object of class "data.frame", representing peak list.

chrID:

Object of class "character", a vector of chromosome IDs.

coord:

Object of class "numeric", a vector of genomic coordinates.

tagCount:

Object of class "numeric", a vector of tag counts of ChIP sample.

mappability:

Object of class "numeric", a vector of mappability score.

gcContent:

Object of class "numeric", a vector of GC content score.

input:

Object of class "numeric", a vector of tag counts of matched control sample.

peakParam:

Object of class "MosaicsPeakParam", representing parameters for peak calling.

bdBin:

Object of class "numeric", representing a vector of bounded bins.

empFDR:

Object of class "numeric", representing empirical FDR.

postProb:

Object of class "numeric", representing posterior probability that a bin belongs to background.

tagLoaded:

Object of class "logical", representing whether read-level data is loaded.

tagData:

Object of class "TagData", representing read-level data.

seqDepth:

Object of class "numeric", a vector of sequencing depth of length 2, where the first and second elements correpond to sequencing depths of ChIP and control samples, respectively. If there is not control sample, the second element is set to NA.

Methods

export

signature(object = "MosaicsPeak"): export peak list into text files.

print

signature(x = "MosaicsPeak"): return peak list in data frame format.

plot

signature( x = "MosaicsPeak", y = "missing", filename=NA, peakNum=NA ): plot ChIP profile in each peak region. If file name is specified in filename, plots are exported to a PDF file named filename. Oterwise, ChIP profiles are plotted to standard output. If users are interested in specific peak regions, users can specify them as a vector in peakNum, where elements indicate which rows peak regions of interest are located. If peakNum is specified, only ChIP profiles in specified peak regions are plotted. Otherwise, ChIP profiles for all the peak regions are plotted.

show

signature(object = "MosaicsPeak"): provide brief summary of the object.

empFDR

signature(object = "MosaicsPeak"): return estimated empirical false discovery rate.

bdBin

signature(object = "MosaicsPeak"): return a vector of bin-level binary indicator of peak calling, where each element is 1 if the corresponding bin belongs to a peak and zero otherwise.

readCoverage

signature(object = "MosaicsPeak"): export a list of coverage matrices of ChIP and matched control samples, where each matrix consists of two columns, genomic coordinates and read count. Each element of this list is a list of length 2, which are matrices for each of ChIP and matched control samples. Elements of this list are sorted to match the rows of peak list. Users can use readCoverage method only after extractReads method is applied to the MosaicsPeak object.

read

signature(object = "MosaicsPeak"): return a list of read-level data of ChIP and matched control samples, where each element is GenomicRanges class. Each element of this list is a list of length 2, which are GenomicRanges objects for each of ChIP and matched control samples. Elements of this list are sorted to match the rows of peak list. Users can use read method only after extractReads method is applied to the MosaicsPeak object.

seqDepth

signature(object = "MosaicsPeak"): return a vector of sequencing depth, where the first and second elements are for ChIP and matched control sample, respectively. If matched control sample is not provided, the second element is NA. Users can use seqDepth method only after extractReads method is applied to the MosaicsPeak object.

postProb

signature( object = "MosaicsPeak", peakRegion=NULL, summaryStat="aveLogP", parallel=FALSE, nCore=8 ): return a data frame of peak-level posterior probabilities of being background (PP) for arbitrary peak regions, where the columns are chromosome ID, peak start position, and peak end position, and summary of PP. Peak regions of interest can be specified in peakRegion argument and a data frame of three columns (chromosome ID, peak start position, and peak end position) is expected. If peakRegion is not provided, a data frame of bin-level PP is provided, where columns are chromosome ID, genomic coordinate, and PP. Summary statistics can be chosen among "aveLogP" (average of -log10 transformed PP; default), "medianLogP" (median of -log10 transformed PP), "sumLogP" (sum of -log10 transformed PP), "logMinP" (-log10 transformation of minimum PP), "logAveP" (-log10 transformation of average PP), and "logMedianP" (-log10 transformation of median PP). Multiple CPUs can be used for parallel computing using "parallel" package if parallel=TRUE, where the number of cores to be used can be specified using nCore.

seqDepth

signature(object = "MosaicsPeak"): provide a vector of sequencing depth.

Author(s)

Dongjun Chung, Pei Fen Kuan, Rene Welch, Sunduz Keles

References

Kuan, PF, D Chung, G Pan, JA Thomson, R Stewart, and S Keles (2011), "A Statistical Framework for the Analysis of ChIP-Seq Data", Journal of the American Statistical Association, Vol. 106, pp. 891-903.

Chung, D, Zhang Q, and Keles S (2014), "MOSAiCS-HMM: A model-based approach for detecting regions of histone modifications from ChIP-seq data", Datta S and Nettleton D (eds.), Statistical Analysis of Next Generation Sequencing Data, Springer.

See Also

mosaicsPeak, mosaicsPeakHMM, export, extractReads, findSummit, adjustBoundary, filterPeak.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
showClass("MosaicsPeak")
## Not run: 
library(mosaicsExample)

# example analysis workflow for ChIP-seq data of transcription factor binding
# (STAT1 factor binding in GM12878 cell line, from ENCODE database)

generateWig( infile=system.file( file.path("extdata","wgEncodeSydhTfbsGm12878Stat1StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="./", 
    PET=FALSE, fragLen=200, span=200, capping=0, normConst=1 )

constructBins( infile=system.file( file.path("extdata","wgEncodeSydhTfbsGm12878Stat1StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="./", 
    PET=FALSE, fragLen=200, binSize=200, capping=0 )
constructBins( infile=system.file( file.path("extdata","wgEncodeSydhTfbsGm12878InputStdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="./", 
    PET=FALSE, fragLen=200, binSize=200, capping=0 )
    
binTFBS <- readBins( type=c("chip","input"),
    fileName=c( "./wgEncodeSydhTfbsGm12878Stat1StdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt",
    "./wgEncodeSydhTfbsGm12878InputStdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt" ) )
binTFBS
head(print(binTFBS))
plot(binTFBS)
plot( binTFBS, plotType="input" )

fitTFBS <- mosaicsFit( binTFBS, analysisType="IO", bgEst="rMOM" )
fitTFBS
plot(fitTFBS)

peakTFBS <- mosaicsPeak( fitTFBS, signalModel="2S", FDR=0.05, 
maxgap=200, minsize=50, thres=10 )
peakTFBS
head(print(peakTFBS))

peakTFBS <- extractReads( peakTFBS,
  chipFile=system.file( file.path("extdata","wgEncodeSydhTfbsGm12878Stat1StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"),
  chipFileFormat="bam", chipPET=FALSE, chipFragLen=200,
  controlFile=system.file( file.path("extdata","wgEncodeSydhTfbsGm12878InputStdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
  controlFileFormat="bam", controlPET=FALSE, controlFragLen=200, parallel=TRUE, nCore=8 )
peakTFBS

peakTFBS <- findSummit( peakTFBS, parallel=FALSE, nCore=8 )

export( peakTFBS, type = "txt", filename = "./peakTFBS.txt" )
export( peakTFBS, type = "bed", filename = "./peakTFBS.bed" )
export( peakTFBS, type = "gff", filename = "./peakTFBS.gff" )
export( peakTFBS, type = "narrowPeak", filename = "./peakTFBS.narrowPeak" )

# example analysis workflow for ChIP-seq data of histone modification
# (H3K4me3 modification in GM12878 cell line, from ENCODE database)

constructBins( infile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878H3k4me3StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="./", 
    byChr=FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr=NULL, 
    PET=FALSE, fragLen=200, binSize=200, capping=0 )
constructBins( infile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878ControlStdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="./", 
    byChr=FALSE, useChrfile=FALSE, chrfile=NULL, excludeChr=NULL, 
    PET=FALSE, fragLen=200, binSize=200, capping=0 )

binHM <- readBins( type=c("chip","input"),
    fileName=c( "./wgEncodeBroadHistoneGm12878H3k4me3StdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt",
    "./wgEncodeBroadHistoneGm12878ControlStdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt" ) )
binHM
plot(binHM)
plot( binHM, plotType="input" )

fitHM <- mosaicsFit( binHM, analysisType="IO", bgEst="rMOM" )
fitHM
plot(fitHM)

hmmHM <- mosaicsFitHMM( fitHM, signalModel = "2S", 
  init="mosaics", init.FDR = 0.05, parallel=TRUE, nCore=8 )
hmmHM
plot(hmmHM)

peakHM <- mosaicsPeakHMM( hmmHM, FDR = 0.05, decoding="posterior",
  thres=10, parallel=TRUE, nCore=8 )

peakHM <- extractReads( peakHM,
  chipFile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878H3k4me3StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"),
  chipFileFormat="bam", chipPET=FALSE, chipFragLen=200,
  controlFile=system.file( file.path("extdata","wgEncodeBroadHistoneGm12878ControlStdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
  controlFileFormat="bam", controlPET=FALSE, controlFragLen=200, parallel=TRUE, nCore=8 )
peakHM

peakHM <- findSummit( peakHM, parallel=TRUE, nCore=8 )
head(print(peakHM))
plot( peakHM, filename="./peakplot_HM.pdf" )

peakHM <- adjustBoundary( peakHM, parallel=TRUE, nCore=8 )
peakHM
head(print(peakHM))

peakHM <- filterPeak( peakHM, parallel=TRUE, nCore=8 )
peakHM
head(print(peakHM))

export( peakHM, type = "txt", filename = "./peakHM.txt" )
export( peakHM, type = "bed", filename = "./peakHM.bed" )
export( peakHM, type = "gff", filename = "./peakHM.gff" )
export( peakHM, type = "narrowPeak", filename = "./peakHM.narrowPeak" )
export( peakHM, type = "broadPeak", filename = "./peakHM.broadPeak" )

## End(Not run)