Description Usage Arguments Details Value Examples
Calculate the PCR bottleneck coefficient as described in the ENCODE guidelines.
1 | PBC(IP)
|
The |
path to the |
The PCR bottleneck coefficient (PBC) is a measure of library complexity, i.e. how skewed the distribution of read counts per location is towards 1 read per location.
Defined in the ENCODE guidelines (https://genome.ucsc.edu/ENCODE/qualityMetrics.html) as:
PBC = N1/Nd
with
N1
: Number of genomic locations to which EXACTLY one unique mapping read maps.
Nd
: Number of genomic locations to which AT LEAST one unique mapping read maps, i.e.
the number of non-redundant, unique mapping reads.
PBC is further described on the ENCODE Software Tools page. Provisionally, 0-0.5 is severe bottlenecking, 0.5-0.8 is moderate bottlenecking, 0.8-0.9 is mild bottlenecking, while 0.9-1.0 is no bottlenecking. Very low values can indicate a technical problem, such as PCR bias, or a biological finding, such as a very rare genomic feature. Nuclease-based assays (DNase, MNase) detecting features with base-pair resolution (transcription factor footprints, positioned nucleosomes) are expected to recover the same read multiple times, resulting in a lower PBC score for these assays. Note that the most complex library, random DNA, would approach 1.0, thus the very highest values can indicate technical problems with libraries. It is the practice for some labs outside of ENCODE to remove redundant reads; after this has been done, the value for this metric is 1.0, and this metric is not meaningful. 82% of TF ChIP, 89% of His ChIP, 77% of DNase, 98% of FAIRE, and 97% of control ENCODE datasets have no or mild bottlenecking.
The PBC coefficient.
1 | pbc <- PBC("IP.bam")
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.