MBASIC.pipeline: The pipeline for fitting an MBASIC model for sequencing data.

Description

The pipeline for fitting an MBASIC model for sequencing data.

Usage

MBASIC.pipeline(chipfile, inputfile, input.suffix, target, chipformat,
  inputformat, fragLen, pairedEnd, unique, m.prefix = NULL, m.suffix = NULL,
  gc.prefix = NULL, gc.suffix = NULL, datafile = NULL, ncores = 10, J,
  method = "mbasic", ...)

Arguments

chipfile

A string vector for the ChIP files.

inputfile

A string vector for the matching input files. The length must be the same as 'chipfile'.

input.suffix

A string for the suffix of the input files. If NULL, each entry of inputfile is treated as the full name of an input file. Otherwise, all files whose names start with the corresponding inputfile entry and end with this suffix are merged.
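
As a rough illustration of the prefix-plus-suffix rule described above, the base-R sketch below lists the files in a directory whose names start with a given prefix and end with a given suffix. The helper name match_input_files and the file names are hypothetical, and the package's own matching logic may differ.

```r
# Hypothetical helper: find files named <prefix>...<suffix> in a directory.
match_input_files <- function(prefix, suffix, dir = ".") {
  files <- list.files(dir)
  # startsWith/endsWith do literal (non-regex) matching.
  files[startsWith(files, prefix) & endsWith(files, suffix)]
}

# Demonstration with temporary, made-up file names.
demo_dir <- tempfile("inputs_")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("input_rep1.bed", "input_rep2.bed",
                                  "input_rep1.bam", "other.bed")))
matched <- match_input_files("input", ".bed", dir = demo_dir)
print(matched)
```

Under this reading, only the two .bed files that begin with "input" would be candidates for merging.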

target

A GenomicRanges object for the target intervals where the reads are mapped.

chipformat

A string specifying the type of the ChIP file. Currently two file types are allowed: 'BAM' or 'BED'. Default: 'BAM'.

inputformat

A string specifying the type of the input files. Currently two file types are allowed: 'BAM' or 'BED'. Default: 'BAM'.

fragLen

Either a single value or a 2-column matrix of the fragment lengths for the ChIP and input files. Default: 150.

pairedEnd

Either a boolean value or a 2-column boolean matrix for whether each file is a paired-end data set. Currently this function only allows 'BAM' files for paired-end data. Default: FALSE.
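
The matrix forms of fragLen and pairedEnd can be built in base R. The sketch below assumes one row per chipfile/inputfile pair, with the first column for the ChIP file and the second for the input file. That layout is an assumption read off the argument descriptions, and the values are made up.

```r
n_files <- 3  # hypothetical number of ChIP/input file pairs

# One row per file pair; columns assumed to be (ChIP, input).
fragLen <- cbind(chip = c(150, 200, 180), input = rep(150, n_files))

# Logical matrix of the same shape; here only the second pair is paired-end.
pairedEnd <- matrix(FALSE, nrow = n_files, ncol = 2,
                    dimnames = list(NULL, c("chip", "input")))
pairedEnd[2, ] <- TRUE

print(fragLen)
print(pairedEnd)
```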

unique

A boolean value for whether only reads with distinct genomic coordinates or strands are mapped. Default: TRUE.

m.prefix

A string for the prefix of the mappability files.

m.suffix

A string for the suffix of the mappability files. See details for more information. Default: NULL.

gc.prefix

A string for the prefix of the GC files.

gc.suffix

A string for the suffix of the GC files. See details for more information. Default: NULL.
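
The rule stated in Details, that the raw input counts serve as the covariate matrix whenever any of the four mappability and GC arguments is NULL, can be written as a small check. The helper name uses_raw_input and the example prefixes and suffixes are hypothetical; this is a sketch of the documented rule, not the package's code.

```r
# Hypothetical helper: TRUE when the input counts would be used directly,
# i.e. when at least one of the four arguments is NULL.
uses_raw_input <- function(m.prefix = NULL, m.suffix = NULL,
                           gc.prefix = NULL, gc.suffix = NULL) {
  args <- list(m.prefix, m.suffix, gc.prefix, gc.suffix)
  any(vapply(args, is.null, logical(1)))
}

uses_raw_input()                                # all four left at NULL
uses_raw_input("map_", ".txt", "gc_", ".txt")   # all four supplied
```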

datafile

The file location to save or load the data matrix. See details.

J

The number of clusters to be fitted.

method

The fitting algorithm, either "mbasic" (default), which runs an Expectation-Maximization algorithm, or "madbayes", which runs a K-means-like algorithm.

...

Parameters passed to MBASIC (if method is "mbasic") or to MBASIC.MADBayes.full (if method is "madbayes").

Details

This function executes three steps:
The first step uses the generateReadMatrices function to get the ChIP and input counts for each locus.
The second step computes the covariate matrix. If any of m.prefix, m.suffix, gc.prefix, or gc.suffix is NULL, the input count matrix is used directly as the covariate matrix for MBASIC. Otherwise, bkng_mean is used to normalize the input count data according to the mappability and GC scores, and the normalized counts form the covariate matrix.
The final step calls the MBASIC function for model fitting.
Because the first two steps are time-consuming, we recommend specifying a file location for 'datafile'. When this function executes, it first checks whether datafile exists. If it does, the file is loaded and the function skips to the final step. If it does not, the function runs the first two steps and then saves the ChIP data matrix and the covariate matrix to this file, so that later runs do not need to repeat them.
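
The datafile behavior described above follows a common load-or-build caching pattern. The base-R sketch below illustrates that pattern only: build_matrices is a hypothetical stand-in for the two slow steps, and the object names Y and Gamma and the use of save/load are assumptions, not the package's actual storage format.

```r
# Hypothetical stand-in for the two slow steps (read counting and
# covariate computation); it just returns small made-up matrices.
build_matrices <- function() {
  list(Y = matrix(1:6, nrow = 3), Gamma = matrix(6:1, nrow = 3))
}

load_or_build <- function(datafile = NULL) {
  if (!is.null(datafile) && file.exists(datafile)) {
    # Cached result found: load it and skip the slow steps.
    env <- new.env()
    load(datafile, envir = env)
    return(list(Y = env$Y, Gamma = env$Gamma, cached = TRUE))
  }
  dat <- build_matrices()
  if (!is.null(datafile)) {
    # Save both matrices so a later run can skip the slow steps.
    Y <- dat$Y
    Gamma <- dat$Gamma
    save(Y, Gamma, file = datafile)
  }
  list(Y = dat$Y, Gamma = dat$Gamma, cached = FALSE)
}

datafile <- tempfile(fileext = ".Rda")
first <- load_or_build(datafile)    # file absent: builds and saves
second <- load_or_build(datafile)   # file present: loads from disk
```

The practical consequence is that a stale datafile is reused as-is, so it may be worth deleting the file when the ChIP or input files change.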

Value

If method="mbasic", the value returned by the function MBASIC (if 'J' is a scalar) or MBASIC.full (if 'J' is a vector). If method="madbayes", the value returned by the function MBASIC.MADBayes.full.
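
The dispatch implied by this section can be summarized in a few lines. The helper fit_function_name below is hypothetical and only returns the name of the function whose return value would be passed back. It is a reading of the text above, not code from the package.

```r
# Hypothetical helper: name of the fitting function implied by J and method.
fit_function_name <- function(J, method = c("mbasic", "madbayes")) {
  method <- match.arg(method)
  if (method == "madbayes") {
    return("MBASIC.MADBayes.full")
  }
  # For method = "mbasic": scalar J versus a vector of candidate J values.
  if (length(J) == 1) "MBASIC" else "MBASIC.full"
}

fit_function_name(3)
fit_function_name(2:5)
fit_function_name(3, method = "madbayes")
```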

Author(s)

Chandler Zuo zuo@stat.wisc.edu

Examples

## Not run: 
## This is the example in our vignette
target <- generateSyntheticData(dir = "syntheticData")
tbl <- ChIPInputMatch(dir = paste("syntheticData/", c("chip", "input"), sep = ""),
                      suffix = ".bed", depth = 5)
conds <- paste(tbl$cell, tbl$factor, sep = ".")
MBASIC.fit <- MBASIC.pipeline(chipfile = tbl$chipfile, inputfile = tbl$inputfile,
                              input.suffix = ".bed", target = target,
                              chipformat = "BED", inputformat = "BED",
                              fragLen = 150, pairedEnd = FALSE, unique = TRUE,
                              fac = conds, struct = NULL, S = 2, J = 3,
                              family = "lognormal", maxitr = 10, statemap = NULL)

## End(Not run)

chandlerzuo/mbasic documentation built on May 13, 2019, 3:24 p.m.