The pipeline for fitting an MBASIC model to sequencing data.
chipfile | A string vector for the ChIP files.
inputfile | A string vector for the matching input files. The length must be the same as 'chipfile'.
input.suffix | A string for the suffix of input files. If
target | A GenomicRanges object for the target intervals where the reads are mapped.
chipformat | A string specifying the type of the ChIP file. Currently two file types are allowed: 'BAM' or 'BED'. Default: 'BAM'.
inputformat | A string specifying the type of the input files. Currently two file types are allowed: 'BAM' or 'BED'. Default: 'BAM'.
fragLen | Either a single value or a 2-column matrix of the fragment lengths for the chip and input files. Default: 150.
pairedEnd | Either a boolean value or a 2-column boolean matrix for whether each file is a paired-end data set. Currently this function only allows 'BAM' files for paired-end data. Default: FALSE.
unique | A boolean value for whether only reads with distinct genomic coordinates or strands are mapped. Default: TRUE.
m.prefix | A string for the prefix of the mappability files.
m.suffix | A string for the suffix of the mappability files. See details for more information. Default: NULL.
gc.prefix | A string for the prefix of the GC files.
gc.suffix | A string for the suffix of the GC files. See details for more information. Default: NULL.
datafile | The file location to save or load the data matrix. See details.
J | The number of clusters to be fitted.
method | The fitting algorithm, either "mbasic" (default), which runs an Expectation-Maximization algorithm, or "madbayes", which runs a K-means-like algorithm.
... | Parameters for
This function executes three steps:
The first step uses the generateReadMatrices function to get the ChIP and input counts for each locus.
The second step computes the covariate matrix. If any of m.prefix, m.suffix, gc.prefix, gc.suffix is NULL, the input count matrix is used directly as the covariate matrix for MBASIC. Otherwise, the bkng_mean function is used to normalize the input count data according to the mappability and GC scores to produce the covariate matrix.
The final step calls the MBASIC function for model fitting.
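The rule in the second step can be illustrated with a small base R sketch. It is not the package's code: choose_covariate and normalize_input are hypothetical names introduced here, and the normalization passed in below is a placeholder that only halves the counts, not what bkng_mean computes.

```r
# Hypothetical helper illustrating the step-two rule; not part of MBASIC.
choose_covariate <- function(input_counts,
                             m.prefix = NULL, m.suffix = NULL,
                             gc.prefix = NULL, gc.suffix = NULL,
                             normalize_input = function(x) x) {
  settings <- list(m.prefix, m.suffix, gc.prefix, gc.suffix)
  # If any of the four settings is NULL, use the input counts unchanged.
  if (any(vapply(settings, is.null, logical(1)))) {
    return(input_counts)
  }
  # Otherwise apply the (assumed, user-supplied) normalization.
  normalize_input(input_counts)
}

input_counts <- matrix(c(4, 8, 6, 2), nrow = 2)

# Only m.prefix is given, so the raw input counts are returned.
raw <- choose_covariate(input_counts, m.prefix = "map_")

# All four settings are given, so the placeholder normalization is applied.
scaled <- choose_covariate(input_counts,
                           m.prefix = "map_", m.suffix = ".txt",
                           gc.prefix = "gc_", gc.suffix = ".txt",
                           normalize_input = function(x) x / 2)
```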
Because the first two steps are time-consuming, we recommend specifying a file location for 'datafile'. When this function executes, it first checks whether datafile exists. If it exists, it is loaded and the function jumps to the final step. If it does not exist, the function executes the first two steps and then saves the ChIP data matrix and the covariate matrix to this file, so that rerunning the function does not repeat the first two steps.
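The load-or-compute behaviour described for datafile follows a common caching pattern, sketched below in base R. The helper load_or_build, the object names Y and Gamma, and the use of save() and load() are assumptions made for illustration; the on-disk format that MBASIC.pipeline actually uses is not stated here.

```r
# Hypothetical wrapper: load cached matrices if the file exists,
# otherwise build them once and save them for later runs.
load_or_build <- function(datafile, build) {
  if (file.exists(datafile)) {
    env <- new.env()
    load(datafile, envir = env)
    return(list(Y = env$Y, Gamma = env$Gamma, from_cache = TRUE))
  }
  mats <- build()          # stands in for the two slow steps
  Y <- mats$Y              # ChIP data matrix (assumed name)
  Gamma <- mats$Gamma      # covariate matrix (assumed name)
  save(Y, Gamma, file = datafile)
  list(Y = Y, Gamma = Gamma, from_cache = FALSE)
}

datafile <- tempfile(fileext = ".Rda")
n_builds <- 0
build <- function() {
  n_builds <<- n_builds + 1   # count how often the slow steps run
  list(Y = matrix(1:6, nrow = 3), Gamma = matrix(6:1, nrow = 3))
}

first <- load_or_build(datafile, build)    # builds and saves
second <- load_or_build(datafile, build)   # loads from the saved file
```

In this sketch the second call skips build() entirely, which mirrors the reason given above for supplying datafile.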
If method="mbasic", the value returned by the function MBASIC (if 'J' is a scalar) or MBASIC.full (if 'J' is a vector). If method="madbayes", the value returned by the function MBASIC.MADBayes.full.
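The mapping from 'method' and 'J' to the fitting function can be written out as a short base R sketch. The helper fit_function_name is hypothetical and only returns the function's name as a string; treating "scalar" as "length one" is an assumption.

```r
# Hypothetical helper: names the fitting function whose return value is
# produced, following the rule stated above. It does not call MBASIC.
fit_function_name <- function(J, method = c("mbasic", "madbayes")) {
  method <- match.arg(method)
  if (method == "madbayes") {
    return("MBASIC.MADBayes.full")
  }
  # Assumption: a scalar J is a vector of length one.
  if (length(J) == 1) "MBASIC" else "MBASIC.full"
}

single <- fit_function_name(J = 3)
several <- fit_function_name(J = 2:5)
madbayes <- fit_function_name(J = 3, method = "madbayes")
```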
Chandler Zuo zuo@stat.wisc.edu
## Not run:
## This is the example in our vignette
target <- generateSyntheticData(dir = "syntheticData")
tbl <- ChIPInputMatch(dir = paste("syntheticData/", c("chip", "input"), sep = ""),
                      suffix = ".bed", depth = 5)
conds <- paste(tbl$cell, tbl$factor, sep = ".")
MBASIC.fit <- MBASIC.pipeline(chipfile = tbl$chipfile, inputfile = tbl$inputfile,
                              input.suffix = ".bed", target = target, format = "BED",
                              fragLen = 150, pairedEnd = FALSE, unique = TRUE,
                              fac = conds, struct = NULL, S = 2, J = 3,
                              family = "lognormal", maxitr = 10, statemap = NULL)
## End(Not run)