sizeFac: Estimate Size Factors of Samples


Description

This function estimates sample size factors for normalization in downstream analysis. Size factors are first estimated for sample pairs by comparing each sample to one reference sample (i.e. the sample corresponding to the first column of the count matrix). The size factors are then combined across all samples so that the median size factor is 1. In detail, the binding type of each sample pair is first estimated using the same strategy as the function chipType. When the binding type is bimodal, the size factor is calculated based on a decision about which kernel density mode to use for scaling.
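The combination step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the pairwise factors are made-up values, and dividing by the median is one assumed way to make the median size factor equal 1.

```r
# Hypothetical pairwise size factors, each sample relative to the reference
# sample (first column), so the reference itself has factor 1.
pairwise <- c(1.0, 1.6, 0.8, 2.0, 1.2)

# Assumed combination rule: rescale so that the median size factor is 1.
sizefac <- pairwise / median(pairwise)
```

Rescaling by a common constant leaves the ratios between samples unchanged, so only the overall scale of the factors moves.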

Usage

sizeFac(count, cutoff = 50L, fold = 10, h = 0.1, plot = FALSE,
  sanity = FALSE, cond = NULL)

Arguments

count

A matrix of read counts or a SummarizedExperiment, where columns are samples and rows are peaks or high coverage bins. This object can be generated by function regionReads.

cutoff

A numeric cutoff on the count matrix. If positive, only peaks/bins with counts larger than cutoff in at least one sample are used to estimate the size factors. We recommend a larger cutoff, since background signal can dramatically distort the kernel density estimate, especially for deeply sequenced ChIP-seq samples. (Default: 50)
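The filtering rule can be illustrated on a toy matrix. This is a minimal sketch of the rule as worded above, with made-up counts and sample names; how the package applies the cutoff internally is not shown here.

```r
# Toy count matrix: rows are peaks/bins, columns are samples (made-up values).
count <- matrix(c(120, 80,
                    3,  7,
                   10, 65,
                   50, 50),
                ncol = 2, byrow = TRUE,
                dimnames = list(paste0("peak", 1:4), c("control", "treatment")))
cutoff <- 50L

# Keep rows whose count is larger than the cutoff in at least one sample.
keep <- rowSums(count > cutoff) >= 1
filtered <- count[keep, , drop = FALSE]
```

Note that the comparison is strict, so a row whose counts all equal the cutoff is dropped.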

fold

A numeric threshold that helps determine the binding type. In detail, if the top 2 estimated modes on the smoothed kernel density differ in height by less than the fold change given by fold, the binding type is determined as bimodal; otherwise, it is unimodal. This number should be larger than 1. (Default: 10)
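As a rough sketch of this decision rule, the helper below classifies a binding type from a set of mode heights. The function classify_modes is hypothetical and not part of the package; it assumes the rule compares the ratio of the two highest mode heights against fold, and it reuses the labels listed in the Value section.

```r
# Hypothetical helper, not part of the package: classify a binding type from
# the heights of the modes found on a smoothed density.
classify_modes <- function(heights, fold = 10) {
  top <- sort(heights, decreasing = TRUE)
  # With fewer than two modes there is nothing to compare.
  if (length(top) < 2) return("unimodel")
  # Assumed reading of the rule: the two highest modes are comparable when
  # the taller one is less than `fold` times the height of the second one.
  if (top[1] / top[2] < fold) "bimodel" else "unimodel"
}

type_close <- classify_modes(c(0.9, 0.4))   # ratio 2.25, below fold
type_far   <- classify_modes(c(0.9, 0.05))  # ratio 18, above fold
```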

h

Initial smoothing factor when estimating kernel density for bump hunting. (Default: 0.1)

plot

A logical indicating whether the M-A plot and smoothed kernel density should be visualized. (Default: FALSE)

sanity

A logical indicating whether to check sanity across replicates in the same condition. A negative sanity check report indicates either a bad experiment (e.g. binding type is not consistent across replicates) or poorly chosen function parameters (e.g. cutoff and fold are not pre-estimated well). However, a negative report does not necessarily mean a bad estimation of size factors, as the strategy of hunting kernel density modes is robust in finding the right scaling regardless of whether the binding type is unimodal or bimodal. (Default: FALSE)

cond

NULL or a two-level factor specifying biological conditions for the samples in count (e.g. control & treatment). This parameter is different from the meta information in count when count is a SummarizedExperiment: it only includes condition information and should be a factor object. This parameter is only applicable when sanity is TRUE. (Default: NULL)
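A condition factor of the expected shape might be built as follows. The layout (four samples, two replicates per condition) is hypothetical, and the sketch assumes that the entries of cond follow the column order of count.

```r
# Hypothetical layout: four samples, two replicates per condition, assumed to
# be listed in the same order as the columns of `count`.
cond <- factor(c("control", "control", "treatment", "treatment"))

# cond must be a factor with exactly two levels.
stopifnot(is.factor(cond), nlevels(cond) == 2)

# With the package loaded and a matching four-column `count`, the call would
# then look like:
# sizeFac(count, sanity = TRUE, cond = cond)
```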

Value

A list with the following components:

sizefac

A numeric vector indicating estimated size factors of samples

type

A character vector with values either "bimodel" or "unimodel", indicating the binding types determined by comparing to the sample "control"

Examples

## load sample data
data(complex)
names(complex)

## estimate size factors from the sample count matrix
sizeFac(count = complex$counts)

tengmx/ComplexDiff documentation built on May 31, 2019, 8:34 a.m.