dba.overlap: Compute binding site overlaps (occupancy analysis)

Description

Computes binding overlaps and co-occupancy statistics

Usage

dba.overlap(DBA, mask, mode=DBA_OLAP_PEAKS, 
            contrast, method=DBA$config$AnalysisMethod, th=DBA$config$th, 
            bUsePval=DBA$config$bUsePval, 
            report, byAttribute, bCorOnly=TRUE, CorMethod="pearson", 
            DataType=DBA$config$DataType)

Arguments

DBA

DBA object

mask

mask or vector of peakset numbers indicating a subset of peaksets to use (see dba.mask). When generating overlapping/unique peaksets, either two, three, or four peaksets may be specified. If the mode type is DBA_OLAP_ALL, and a contrast is specified, a value of TRUE (mask=TRUE) indicates that all samples should be included (otherwise only those present in one of the contrast groups will be included).

mode

indicates which results should be returned (see MODES below). One of:

  • DBA_OLAP_PEAKS

  • DBA_OLAP_ALL

  • DBA_OLAP_RATE

contrast

contrast number to use. Specify this only when contrast data is to be used with mode=DBA_OLAP_ALL. Use dba.show(DBA, bContrast=TRUE) to list the contrast numbers.

method

if contrast is specified and mode=DBA_OLAP_ALL, use data from method used for analysis:

  • DBA_DESEQ2

  • DBA_DESEQ2_BLOCK

  • DBA_EDGER

  • DBA_EDGER_BLOCK

th

if contrast is specified and mode=DBA_OLAP_ALL, significance threshold; all sites with FDR (or p-values, see bUsePval) less than or equal to this value will be included. A value of 1 will include all binding sites, but only the samples included in the contrast.

bUsePval

if contrast is specified and mode=DBA_OLAP_ALL, logical indicating whether to use FDR (FALSE) or p-value (TRUE) for thresholding.
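The way th and bUsePval work together can be pictured with a small base R sketch. The data frame, its column names (site, p.value, FDR), and the helper select_sites are hypothetical stand-ins for illustration, not DiffBind's internal structures.

```r
# Hypothetical per-site results; column names are assumptions for illustration.
sites <- data.frame(
  site    = c("s1", "s2", "s3", "s4"),
  p.value = c(0.001, 0.020, 0.040, 0.300),
  FDR     = c(0.004, 0.040, 0.053, 0.300),
  stringsAsFactors = FALSE
)

# Keep sites whose FDR (default) or p-value is less than or equal to th.
select_sites <- function(results, th = 0.05, bUsePval = FALSE) {
  stat <- if (bUsePval) results$p.value else results$FDR
  results$site[stat <= th]
}

by_fdr  <- select_sites(sites, th = 0.05)                   # thresholds on FDR
by_pval <- select_sites(sites, th = 0.05, bUsePval = TRUE)  # thresholds on p-value
all_in  <- select_sites(sites, th = 1)                      # th = 1 keeps every site
```

In this toy data, site s3 passes the p-value threshold but not the FDR threshold, so the choice of bUsePval changes which sites are included.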

report

if contrast is specified and mode=DBA_OLAP_ALL, a report (obtained from dba.report) specifying the data to be used. If counts are included in the report (and a contrast is specified), the count data from the report will be used to compute correlations, rather than the scores in the global binding affinity matrix. If report is present, the method, th, and bUsePval parameters are ignored.

byAttribute

when computing co-occupancy statistics (DBA_OLAP_ALL), limit comparisons to peaksets with the same value for a specific attribute, one of:

  • DBA_ID

  • DBA_TISSUE

  • DBA_FACTOR

  • DBA_CONDITION

  • DBA_TREATMENT

  • DBA_REPLICATE

  • DBA_CONSENSUS

  • DBA_CALLER
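As a rough illustration of what restricting comparisons by attribute means, the base R sketch below enumerates all peakset pairs and keeps only those that share an attribute value. The sample sheet and its column names are hypothetical; this is not how DiffBind stores its metadata.

```r
# Hypothetical sample sheet: peakset numbers and one attribute (tissue).
samples <- data.frame(
  peakset = 1:4,
  tissue  = c("MCF7", "MCF7", "T47D", "T47D"),
  stringsAsFactors = FALSE
)

# All unordered pairs of peaksets, one pair per row.
pairs <- t(combn(samples$peakset, 2))
colnames(pairs) <- c("A", "B")

# Keep only pairs whose peaksets share the same attribute value.
same <- samples$tissue[pairs[, "A"]] == samples$tissue[pairs[, "B"]]
kept <- pairs[same, , drop = FALSE]
kept
```

Of the six possible pairs, only the two within-tissue pairs remain, which is the kind of reduction byAttribute is described as providing.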

bCorOnly

when computing co-occupancy statistics (DBA_OLAP_ALL), logical indicating that only correlations, and not overlaps, should be computed. This is much faster if only correlations are desired (e.g. to plot the correlations using dba.plotHeatmap).

CorMethod

when computing co-occupancy statistics (DBA_OLAP_ALL), method to use when computing correlations.

DataType

if mode=DBA_OLAP_PEAKS, the class of object in which the peaksets should be returned:

  • DBA_DATA_GRANGES

  • DBA_DATA_RANGEDDATA

  • DBA_DATA_FRAME

Can be set as default behavior by setting DBA$config$DataType.

Details

MODE: Generate overlapping/unique peaksets:

dba.overlap(DBA, mask, mode=DBA_OLAP_PEAKS, minVal)

MODE: Compute correlation and co-occupancy statistics (e.g. for dba.plotHeatmap):

dba.overlap(DBA, mask, mode=DBA_OLAP_ALL, byAttribute, minVal, attributes, bCorOnly, CorMethod)

MODE: Compute correlation and co-occupancy statistics using significantly differentially bound sites (e.g. for dba.plotHeatmap):

dba.overlap(DBA, mask, mode=DBA_OLAP_ALL, byAttribute, minVal, contrast, method, th, bUsePval, attributes, bCorOnly, CorMethod)

Note that the scores from the global binding affinity matrix will be used for correlations unless a report containing count data is specified.

MODE: Compute overlap rates at different stringency thresholds:

dba.overlap(DBA, mask, mode=DBA_OLAP_RATE, minVal)

Value

Value depends on the mode specified in the mode parameter.

If mode=DBA_OLAP_PEAKS, Value is an overlap record: a list of three peaksets for an A-B overlap, seven peaksets for an A-B-C overlap, and fifteen peaksets for an A-B-C-D overlap:

inAll

peaks in all peaksets

onlyA

peaks unique to peakset A

onlyB

peaks unique to peakset B

onlyC

peaks unique to peakset C

onlyD

peaks unique to peakset D

notA

peaks in all peaksets except peakset A

notB

peaks in all peaksets except peakset B

notC

peaks in all peaksets except peakset C

notD

peaks in all peaksets except peakset D

AandB

peaks in peaksets A and B but not in peaksets C or D

AandC

peaks in peaksets A and C but not in peaksets B or D

AandD

peaks in peaksets A and D but not in peaksets B or C

BandC

peaks in peaksets B and C but not in peaksets A or D

BandD

peaks in peaksets B and D but not in peaksets A or C

CandD

peaks in peaksets C and D but not in peaksets A or B
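The element counts above (three, seven, fifteen) follow from the number of non-empty membership combinations of n peaksets, 2^n - 1. A minimal base R sketch of the two-peakset case follows; it uses site identifiers as stand-ins for merged genomic intervals, which is an assumption of the sketch (dba.overlap itself works on intervals, and the merging step is not shown).

```r
# Site identifiers stand in for merged genomic intervals (an assumption of this sketch).
A <- c("site1", "site2", "site3", "site5")
B <- c("site2", "site3", "site4")

overlap_record <- list(
  inAll = intersect(A, B),  # peaks in both peaksets
  onlyA = setdiff(A, B),    # peaks unique to peakset A
  onlyB = setdiff(B, A)     # peaks unique to peakset B
)

# Number of non-empty membership combinations for n peaksets: 2^n - 1.
n_regions <- function(n) 2^n - 1
region_counts <- sapply(2:4, n_regions)  # 3, 7, 15
```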

If mode=DBA_OLAP_ALL, Value is a correlation record: a matrix with a row for each pair of peaksets and the following columns:

A

peakset number of first peakset in overlap

B

peakset number of second peakset in overlap

onlyA

number of sites unique to peakset A

onlyB

number of sites unique to peakset B

inAll

number of peaks in both peakset A and B (merged)

R2

correlation value A vs B

Overlap

percentage overlap (number of overlapping sites divided by number of peaks unique to smaller peakset)
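To make the columns of one row of a correlation record concrete, here is a base R sketch on toy scores. The score matrix is invented, and the Overlap calculation reflects one assumed reading of the description above (shared sites divided by the size of the smaller peakset); DiffBind's actual computation may differ.

```r
# Toy data: rows are merged sites; a score of 0 means no peak was called there.
scores <- cbind(
  A = c(10, 8, 0, 5, 7),
  B = c( 9, 0, 4, 6, 8)
)
inA <- scores[, "A"] > 0
inB <- scores[, "B"] > 0

onlyA <- sum(inA & !inB)  # sites unique to peakset A
onlyB <- sum(!inA & inB)  # sites unique to peakset B
inAll <- sum(inA & inB)   # sites present in both peaksets

# Correlation of the two score vectors; "pearson" mirrors the CorMethod default.
correlation <- cor(scores[, "A"], scores[, "B"], method = "pearson")

# Assumed reading of Overlap: shared sites over the size of the smaller peakset.
overlap <- inAll / min(sum(inA), sum(inB))
```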

If mode=DBA_OLAP_RATE, Value is a vector whose length is the number of peaksets, containing the number of overlapping peaks at the corresponding minOverlaps threshold (i.e., Value[1] is the total number of unique sites, Value[2] is the number of unique sites appearing in at least two peaksets, Value[3] the number of sites overlapping in at least three peaksets, etc.).
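The shape of the overlap rate vector can be reproduced from a binary occupancy matrix in base R. The matrix below is toy data invented for this sketch; it only illustrates the "at least k peaksets" counting described above.

```r
# Toy occupancy matrix: rows are merged sites, columns are peaksets (1 = peak called).
occupancy <- matrix(
  c(1, 1, 1,
    1, 0, 0,
    0, 1, 1,
    1, 1, 0,
    0, 0, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("A", "B", "C"))
)

# Number of peaksets in which each site appears.
n_sets <- rowSums(occupancy)

# rate[k] = number of sites appearing in at least k peaksets.
rate <- vapply(seq_len(ncol(occupancy)),
               function(k) sum(n_sets >= k),
               integer(1))
rate  # 5 3 1
```

The vector is non-increasing by construction, which is why a plot of it against the number of peaksets always slopes downward.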

Author(s)

Rory Stark

See Also

dba.plotVenn, dba.plotHeatmap

Examples

library(DiffBind)
data(tamoxifen_peaks)
# default mode: DBA_OLAP_PEAKS -- get overlapping/non-overlapping peaksets
mcf7 <- dba.overlap(tamoxifen,tamoxifen$masks$MCF7&tamoxifen$masks$Responsive)
names(mcf7)
mcf7$inAll

# mode:  DBA_OLAP_ALL -- get correlation record
mcf7 <- dba(tamoxifen,tamoxifen$masks$MCF7)
mcf7.corRec <- dba.overlap(mcf7,mode=DBA_OLAP_ALL,bCorOnly=FALSE)
mcf7.corRec

# mode: DBA_OLAP_RATE -- get overlap rate vector
data(tamoxifen_peaks)
rate <- dba.overlap(tamoxifen, mode=DBA_OLAP_RATE)
rate
plot(rate,type='b',xlab="# peaksets",ylab="# common peaks",
     main="Tamoxifen dataset overlap rate")

DiffBind documentation built on March 24, 2021, 6 p.m.