dualFilter | R Documentation |
Apply two filters to counts generated using sliding windows
dualFilter(
x,
bg = NULL,
ref,
q = 0.5,
logCPM = TRUE,
keep.totals = TRUE,
bin.size = NULL,
prior.count = 2,
BPPARAM = bpparam()
)
x |
RangedSummarizedExperiment containing sample counts |
bg |
RangedSummarizedExperiment containing background/input counts, or alternate method for selecting samples from within x, such as a logical, numeric or character vector |
ref |
GRanges object containing ranges where signal is expected |
q |
The upper percentile of the reference ranges expected to be returned when tuning the filtering criteria |
logCPM |
logical(1) Add a logCPM assay to the returned data |
keep.totals |
logical(1) Keep the original library sizes or replace using only the retained windows |
bin.size |
Bin sizes when calling filterWindowsControl. If not specified will default to the largest of 2000bp or 10x the window size |
prior.count |
Passed to filterWindowsControl and filterWindowsProportion |
BPPARAM |
Settings for running in parallel |
This function will take sliding (or tiling) windows for it's input as a RangedSummarizedExperiment object. The dual strategy of applying filterWindowsControl and filterWindowsProportion will then be applied. A set of reference ranges for which signal is expected is used to refine the filtering criteria.
Cutoff values are found for both signal relative to input and overall signal,
such that the 100*q%
of the (sliding) windows which overlap a reference
range will be returned, along with any others which match the
dual filtering criteria.
In general, higher values of q
will return more windows as those with weak
signal and a marginal overlap with a reference range will be returned.
Lower values will ensure that fewer windows, generally with the strongest
signal, are retained.
Cutoff values for both criteria are added to the metadata
element of the returned object.
If setting bg = NULL
the filterWindowsControl step will be
ignored and only the filterWindowsProportion will be used.
This should only be performed if no Input sample is available.
Please note that the any .bam
files referred to in the supplied objects
must be accessible to this function. It will not run on a separate
machine or file structure to that which the original sliding windows were
prepared. Please see the example/vignette for runnable code.
A RangedSummarizedExperiment which is a
filtered subset of the original object. If requested the assay "logCPM" will
be added (TRUE
by default)
## Taken from the differential_binding vignette
library(tidyverse)
library(Rsamtools)
library(csaw)
library(BiocParallel)
library(rtracklayer)
## For this function we need a set of counts using sliding windows and the
## original BamFiles from which they were taken
## First we'll set up the bam file list
bfl <- system.file(
"extdata", "bam", c("ex1.bam", "ex2.bam", "input.bam"), package = "extraChIPs"
) %>%
BamFileList() %>%
setNames(c("ex1", "ex2", "input"))
## Then define the readParam settings for csaw::readParam()
rp <- readParam(
pe = "none",
dedup = TRUE,
restrict = "chr10"
)
## Now we can form our sliding window object with the counts.
wincounts <- windowCounts(
bam.files = bfl,
spacing = 60,
width = 180,
ext = 200,
filter = 1,
param = rp
)
## As this is a subset of reads, add the initial library sizes for accuracy
## Note that this step is not normally required
wincounts$totals <- c(964076L, 989543L, 1172179L)
## We should also update the metadata for our counts
wincounts$sample <- colnames(wincounts)
wincounts$treat <- as.factor(c("ctrl", "treat", NA))
colData(wincounts)
## The function dualFilter requires a set of peaks which will guide the
## filtering step. This indicate where genuine signal is likely to be found
## and will perform the filtering based on a) signal above the input, and
## b) The overall signal level, using the guide set of peaks to inform the
## cutoff values for inclusion
peaks <- import.bed(
system.file("extdata", "peaks.bed.gz", package = "extraChIPs")
)
filtcounts <- dualFilter(
x = wincounts, bg = "input", ref = peaks,
q = 0.8 # Better to use q = 0.5 on real data
)
filtcounts
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.