blacklist: Make a blacklist for genomic regions

View source: R/blacklist.R

blacklist    R Documentation

Make a blacklist for genomic regions

Description

Produce a blacklist of genomic regions with a high ratio of duplicate to unique reads. This blacklist can be used to exclude reads from analysis in Aneufinder, bam2GRanges and bed2GRanges. The function returns a pre-blacklist, which must be filtered manually with a sensible cutoff. See the Examples section for details.
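As a rough illustration of the quantity described above, the following base R sketch computes a per-bin ratio from toy counts. The numbers are made up, and the definition used here (counts with duplicates kept divided by counts after duplicate removal) is an assumption for illustration; the exact computation inside blacklist may differ.

```r
# Toy per-bin read counts (made-up numbers, not real data)
counts.dup   <- c(120, 95, 400, 80)  # reads per bin with duplicates kept
counts.dedup <- c(100, 95, 100, 80)  # reads per bin after duplicate removal

# Assumed definition of the ratio: all reads / deduplicated reads
ratio <- counts.dup / counts.dedup

# Bins dominated by duplicate reads stand out with a high ratio
print(ratio)
```

Under this assumed definition, a bin without any duplicate reads has a ratio of exactly 1, and the third toy bin is the only one that stands out.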

Usage

blacklist(files, assembly, bins, min.mapq = 10, pairedEndReads = FALSE)

Arguments

files

A character vector of either BAM or BED files.

assembly

The genome assembly. See getChromInfoFromUCSC for available assemblies. Only necessary when importing BED files; BAM files are handled automatically. Alternatively, a data.frame with columns 'chromosome' and 'length'.

bins

A list containing one GRanges-class object with bins generated by fixedWidthBins.

min.mapq

Minimum mapping quality when importing from BAM files. Set min.mapq=NA to keep all reads.

pairedEndReads

Set to TRUE if you have paired-end reads in your BAM files (not implemented for BED files).
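The data.frame form of the assembly argument can be sketched as follows. The column names 'chromosome' and 'length' come from the argument description above; the chromosome names and lengths are hypothetical placeholders, not values from any real assembly.

```r
# Hypothetical chromosome table for illustration only;
# replace names and lengths with those of your own assembly.
assembly.df <- data.frame(
  chromosome = c("chr1", "chr2", "chrX"),
  length     = c(1000000, 800000, 500000),
  stringsAsFactors = FALSE
)

# The two columns named in the argument description
print(colnames(assembly.df))
```

A table like this would be passed as assembly when importing BED files, which carry no chromosome length information of their own.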

Value

A GRanges-class object with the same coordinates as bins and the metadata columns ratio, duplicated counts and deduplicated counts.

Examples

## Get an example BAM file with single-cell-sequencing reads
bamfile <- system.file("extdata", "BB150803_IV_074.bam", package="AneuFinderData")
## Prepare the blacklist
bins <- fixedWidthBins(assembly='mm10', binsizes=1e6, chromosome.format='NCBI')
pre.blacklist <- blacklist(bamfile, bins=bins)
## Plot a histogram to decide on a sensible cutoff
qplot(pre.blacklist$ratio, binwidth=0.1)
## Make the blacklist with cutoff = 1.9
blacklist <- pre.blacklist[pre.blacklist$ratio > 1.9]
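The cutoff step can also be tried without any sequencing data. The sketch below uses a plain numeric vector of made-up values as a stand-in for pre.blacklist$ratio, tabulates it in steps of 0.1 with base R's hist instead of qplot, and applies the same cutoff of 1.9. The vector and the name toy.ratio are assumptions for illustration only.

```r
# Made-up ratios standing in for pre.blacklist$ratio
toy.ratio <- c(1.00, 1.10, 1.05, 1.20, 1.15, 3.40, 1.00, 2.80)

# Tabulate the ratios in steps of 0.1 without drawing a plot
h <- hist(toy.ratio, breaks = seq(1, 3.5, by = 0.1), plot = FALSE)

# Apply the cutoff: values above it would end up in the blacklist
cutoff <- 1.9
is.blacklisted <- toy.ratio > cutoff
print(sum(is.blacklisted))
```

A cutoff that falls in the gap between the bulk of the distribution and the isolated high values, as 1.9 does for these toy numbers, separates the two groups cleanly.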

ataudt/aneufinder documentation built on April 18, 2023, 4:20 a.m.