diffRegions: Differential Binding Estimation for Protein Complexes


Description

This is the main function of the package. It takes the outputs of other functions in the package, combines signal smoothing, bump hunting and differential testing, and reports differential binding regions with estimated significance. The input must be genome-wide bin-level read counts rather than peak-level counts. Each read should be assigned to only one bin when it overlaps several, as done by the function regionReads.

Usage

diffRegions(count, bins = NULL, meta = NULL, design, sizefac, rccut = 15,
  fccut = 0.4, gap = 2, diffmeth = c("DESeq2", "limma", "ttest"))

Arguments

count

A matrix of read counts or a RangedSummarizedExperiment, where columns are samples and rows are genome-wide bins. This object can be generated by function regionReads.

bins

If count is a read count matrix, bins should be provided as a GRanges object recording bins of corresponding rows in count. If count is a RangedSummarizedExperiment, this parameter will be ignored.

meta

If count is a read count matrix, this should be a DataFrame object recording sample annotations. Rows of meta correspond to the columns of count. The design parameter treats the column names of meta as variables. If count is a RangedSummarizedExperiment, this parameter will be ignored.

design

A formula object expressing how the read counts for each bin depend on the variables in meta, e.g. '~ group + condition'. If count is a RangedSummarizedExperiment, the bins and meta objects are extracted from it with rowRanges() and colData(). By default, the last variable in the design formula is used to build the differential binding contrast. At most two variables are allowed if diffmeth is set to 'ttest' (see Details below).

sizefac

A numeric vector giving the estimated size of each sample, used for normalization. This vector can be generated by the function sizeFac.

rccut

A numeric cutoff applied to the count matrix after normalization by sizefac. If positive, only bins with normalized counts larger than rccut in at least one sample are selected for fold change estimation. Unlike in other functions of this package, a moderate cutoff works best here: too large a value produces more false negatives, and too small a value increases the running time. (Default: 15)
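
As a rough illustration of this filter, the base R sketch below normalizes a toy count matrix and keeps the bins that pass the cutoff in at least one sample. The matrix values, the size factors and the division-based normalization are assumptions made for illustration; the package's internal code may differ.

```r
## Toy count matrix: 5 bins (rows) x 2 samples (columns); values are made up
count <- matrix(c(2, 40, 10, 0, 16,
                  4, 60, 12, 2, 10),
                ncol = 2,
                dimnames = list(paste0("bin", 1:5), c("ctr", "tre")))
sizefac <- c(1, 2)   # hypothetical size factors, one per sample
rccut <- 15

## Assumption: normalization divides each column by its size factor
norm <- sweep(count, 2, sizefac, "/")

## Keep bins whose normalized count exceeds rccut in at least one sample
keep <- rowSums(norm > rccut) >= 1
norm[keep, , drop = FALSE]
```

In this toy case only bin2 and bin5 are kept, so fold changes would be estimated on two bins instead of five.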

fccut

A numeric cutoff on the smoothed log2 fold changes of bins, used for bump hunting of differential binding regions. Neighboring bins with a fold change larger than this value are merged together, with gaps allowed as set by gap. (Default: 0.4)

gap

An integer specifying the gap allowed when merging bins, in units of bins. (Default: 2)
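
The interplay of fccut and gap can be sketched in base R as follows. This is an illustration only, not the package's implementation: the fold change values are made up, and it assumes that the magnitude of the fold change is compared with fccut and that gap counts the sub-threshold bins tolerated between two bins that pass the cutoff.

```r
## Toy smoothed log2 fold changes for 12 consecutive bins (made-up values)
lfc <- c(0.1, 0.6, 0.7, 0.2, 0.1, 0.5, 0.0, 0.1, 0.0, 0.1, 0.9, 0.8)
fccut <- 0.4
gap <- 2

## Bins passing the cutoff (assumption: the magnitude is compared)
idx <- which(abs(lfc) > fccut)

## Start a new region when more than `gap` sub-threshold bins separate two hits
region <- cumsum(c(TRUE, diff(idx) - 1 > gap))

## First and last bin index of each merged region
regions <- data.frame(start = as.vector(tapply(idx, region, min)),
                      end   = as.vector(tapply(idx, region, max)))
regions
```

Here bins 2, 3 and 6 merge into one region because only two sub-threshold bins separate bins 3 and 6, while bins 11 and 12 form a second region because four sub-threshold bins lie between bins 6 and 11.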

diffmeth

Method for statistical testing of differential binding: one of 'DESeq2', 'limma' or 'ttest'. (Default: 'DESeq2')

Details

Three methods are provided for estimating the significance of differential binding. DESeq2 allows pseudo-estimation for comparisons without replicates; otherwise, all methods can be used for comparisons with at least two replicates. If DESeq2 or limma is selected, the design formula can be specified as suggested by that package. For ttest, design can contain either one or two components, corresponding to Student's t-test or a paired t-test on log-scaled data, respectively. For consistency with other packages, the last component in the design formula is the contrast on which the final differential estimates are reported.
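
The two ttest cases can be illustrated with base R's t.test on a single bin. This is a minimal sketch under stated assumptions, not the package's code: the counts are made up, a pseudocount of 1 is assumed before taking log2, and the design formulas named in the comments (~ cond and ~ pair + cond) use hypothetical variable names.

```r
## Toy normalized counts for one bin: 3 control and 3 treated samples (made up)
ctr <- c(20, 25, 22)
tre <- c(48, 55, 60)

## Assumption: log2 scale with a pseudocount of 1
lctr <- log2(ctr + 1)
ltre <- log2(tre + 1)

## One design component (e.g. ~ cond): unpaired Student's t-test
unpaired <- t.test(ltre, lctr, var.equal = TRUE)

## Two design components (e.g. ~ pair + cond): paired t-test,
## pairing sample i of ctr with sample i of tre
paired <- t.test(ltre, lctr, paired = TRUE)

c(unpaired = unpaired$p.value, paired = paired$p.value)
```

The unpaired test has 4 degrees of freedom and the paired test has 2, which is one reason the two designs can give different p-values on the same data.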

Value

A GRanges object containing potential regions with differential binding, with their statistical significance reported as metadata columns.

Examples

## load sample data
data(complex)
names(complex)

## test sample data
sizefac <- sizeFac(count=complex$counts,plot=TRUE)$sizefac
library(SummarizedExperiment)
se <- SummarizedExperiment(assays=list(counts=complex$counts),
                           rowRanges=complex$bins,
                           colData=DataFrame(cond=c("ctr","tre")))
dr <- diffRegions(count=se,design=~cond,sizefac=sizefac)

## return values
dr
hist(width(dr),nclass=30,xlab="region width",
     main="Width of potential differential regions")
hist(-log10(dr$pvalue),nclass=30,xlab="-log10 pvalue",
     main="Estimated significance")

tengmx/ComplexDiff documentation built on May 31, 2019, 8:34 a.m.