readBins: Import bin-level ChIP-seq data
In dongjunchung/mosaics: MOSAiCS (MOdel-based one and two Sample Analysis and Inference for ChIP-Seq)


Import and preprocess all or a subset of bin-level ChIP-seq data, including ChIP data, matched control data, mappability score, GC content score, and sequence ambiguity score.

readBins( type = c("chip", "input"), fileName = NULL,
    dataType = "unique", rounding = 100, parallel=FALSE, nCore=8 )

`type`	Character vector indicating data types to be imported. This vector can contain `"chip"` (ChIP data), `"input"` (matched control data), `"M"` (mappability score), `"GC"` (GC content score), and `"N"` (sequence ambiguity score). Currently, `readBins` permits only the following combinations: `c("chip", "input")`, `c("chip", "input", "N")`, `c("chip", "input", "M", "GC", "N")`, and `c("chip", "M", "GC", "N")`. Default is `c("chip", "input")`.
`fileName`	Character vector of file names, one for each element of `type`. `type` and `fileName` should have the same length, and corresponding elements of the two vectors should appear in the same order.
`dataType`	How reads were processed. Possible values are `"unique"` (only uniquely aligned reads were retained) or `"multi"` (reads aligned to multiple locations were also retained).
`rounding`	How are mappability score and GC content score rounded? Default is 100 and this indicates rounding of mappability score and GC content score to the nearest hundredth.
`parallel`	Use multiple CPUs for parallel computing via the `parallel` package? Possible values are `TRUE` (use multiple CPUs) or `FALSE` (do not use multiple CPUs). Default is `FALSE`.
`nCore`	Number of CPUs to use when parallel computing is enabled. Default is 8.
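
Because `type` and `fileName` are matched by position, one way to keep them aligned is to build both from a single named vector. The sketch below is only an illustration: the file names are hypothetical placeholders, and the call to `readBins` is guarded so that it runs only if the mosaics package is installed and the files exist.

```r
# Hypothetical file names; substitute the paths to your own bin-level files.
files <- c(
  chip  = "chip_bin.txt",
  input = "input_bin.txt",
  M     = "M_score.txt",
  GC    = "GC_score.txt",
  N     = "N_score.txt"
)

# Derive both arguments from the same vector so their order always agrees.
type <- names(files)
fileName <- unname(files)
stopifnot(length(type) == length(fileName))

# Import only when the package and all (hypothetical) files are available.
if (requireNamespace("mosaics", quietly = TRUE) && all(file.exists(fileName))) {
  binData <- mosaics::readBins(type = type, fileName = fileName)
}
```

Dropping or reordering an entry of `files` changes `type` and `fileName` together, so the two vectors cannot drift apart.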

Bin-level ChIP and matched control data can be generated from the aligned read files for your samples using the method constructBins. The mosaics package companion website, http://www.stat.wisc.edu/~keles/Software/mosaics/, provides preprocessed mappability score, GC content score, and sequence ambiguity score files for diverse reference genomes. Please check the website and the vignette for further details.

The imported data types constrain the analyses that can be run. If type=c("chip", "input") or c("chip", "input", "N"), only two-sample analysis without mappability and GC content is allowed. For type=c("chip", "input", "M", "GC", "N"), either one- or two-sample analysis can be done. If type=c("chip", "M", "GC", "N"), only one-sample analysis is permitted. See the help page of mosaicsFit.
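
The rules above can be summarized as a small lookup. This is a minimal sketch: allowedAnalysis is a hypothetical helper, not a function in mosaics, and treating the order of the elements of type as irrelevant is an assumption made only for this illustration.

```r
# Hypothetical helper (not part of mosaics): returns the analyses that the
# paragraph above says are permitted for a given 'type' vector.
allowedAnalysis <- function(type) {
  rules <- list(
    list(type = c("chip", "input"),
         analysis = "two-sample"),
    list(type = c("chip", "input", "N"),
         analysis = "two-sample"),
    list(type = c("chip", "input", "M", "GC", "N"),
         analysis = c("one-sample", "two-sample")),
    list(type = c("chip", "M", "GC", "N"),
         analysis = "one-sample")
  )
  for (rule in rules) {
    # Assumption: element order is ignored when matching a combination.
    if (length(type) == length(rule$type) && setequal(type, rule$type)) {
      return(rule$analysis)
    }
  }
  stop("Unsupported combination of type: ", paste(type, collapse = ", "))
}

allowedAnalysis(c("chip", "M", "GC", "N"))  # returns "one-sample"
```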

When the data contain multiple chromosomes, preprocessing can be sped up with parallel computing if parallel=TRUE and the parallel package is loaded. nCore determines the number of CPUs used for parallel computing.
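
A parallel import might look like the following sketch. The file names are hypothetical placeholders, and capping nCore at one less than the number of detected cores is a choice made for this illustration, not a requirement of readBins. The import itself is guarded so that it runs only if mosaics is installed and the files exist.

```r
library(parallel)  # the text above requires this package to be loaded

# detectCores() may return NA on some platforms; fall back to a single core.
nAvailable <- detectCores()
if (is.na(nAvailable)) nAvailable <- 1L

# Illustrative choice: leave one core free, never exceed the default of 8.
nCore <- max(1L, min(8L, nAvailable - 1L))

# Hypothetical file names; substitute the paths to your own bin-level files.
chipFile <- "chip_bin.txt"
inputFile <- "input_bin.txt"

if (requireNamespace("mosaics", quietly = TRUE) &&
    all(file.exists(c(chipFile, inputFile)))) {
  library(mosaics)
  binData <- readBins(type = c("chip", "input"),
                      fileName = c(chipFile, inputFile),
                      parallel = TRUE, nCore = nCore)
}
```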

Constructs a BinData class object.

Dongjun Chung, Pei Fen Kuan, Rene Welch, Sunduz Keles

Kuan, PF, D Chung, G Pan, JA Thomson, R Stewart, and S Keles (2011), "A Statistical Framework for the Analysis of ChIP-Seq Data", Journal of the American Statistical Association, Vol. 106, pp. 891-903.

Chung, D, Zhang Q, and Keles S (2014), "MOSAiCS-HMM: A model-based approach for detecting regions of histone modifications from ChIP-seq data", Datta S and Nettleton D (eds.), Statistical Analysis of Next Generation Sequencing Data, Springer.

constructBins, mosaicsFit, BinData.

## Not run: 
library(mosaicsExample)

constructBins( infile=system.file( file.path("extdata","wgEncodeSydhTfbsGm12878Stat1StdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="~/", 
    PET=FALSE, fragLen=200, binSize=200, capping=0 )
constructBins( infile=system.file( file.path("extdata","wgEncodeSydhTfbsGm12878InputStdAlnRep1_chr22_sorted.bam"), package="mosaicsExample"), 
    fileFormat="bam", outfileLoc="~/", 
    PET=FALSE, fragLen=200, binSize=200, capping=0 )
    
binTFBS <- readBins( type=c("chip","input"),
    fileName=c( "~/wgEncodeSydhTfbsGm12878Stat1StdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt",
    "~/wgEncodeSydhTfbsGm12878InputStdAlnRep1_chr22_sorted.bam_fragL200_bin200.txt" ) )

## End(Not run)