filterDuplicates: Filter PCR-duplicates from mapped files using internal UMIs

Description

This function uses the read mapping start position and the UMI to determine whether a read is a PCR duplicate. All PCR duplicates are then removed, so that a single entry is kept for each set of duplicates. For paired-end reads (MAPCap/RAMPAGE), only one end (R1) is kept after filtering, unless 'keepPairs' is set to TRUE.
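The duplicate rule described above can be illustrated on a toy table in base R. This is only a minimal sketch of the idea, not the icetea implementation: the data frame, its column names, and the use of chromosome and strand as part of the duplicate key are assumptions made for the illustration.

```r
# Toy alignment table. The column names are hypothetical and chosen for
# illustration; they do not reflect how icetea stores reads internally.
reads <- data.frame(
  read_id = c("r1", "r2", "r3", "r4", "r5"),
  chrom   = c("chr1", "chr1", "chr1", "chr1", "chr2"),
  strand  = c("+", "+", "+", "+", "-"),
  start   = c(100L, 100L, 100L, 250L, 100L),
  umi     = c("ACGT", "ACGT", "TTAG", "ACGT", "ACGT"),
  stringsAsFactors = FALSE
)

# Assumption: reads with the same chromosome, strand, mapping start
# position and UMI are considered PCR duplicates of each other.
key_cols <- c("chrom", "strand", "start", "umi")

# duplicated() flags every row whose key was already seen in an earlier row.
is_dup <- duplicated(reads[, key_cols])

# Keep the first read of each duplicate group and drop the rest.
filtered <- reads[!is_dup, ]
```

Here only "r2" shares both its start position and its UMI with an earlier read ("r1"), so it is the only read dropped. "r3" maps to the same position but carries a different UMI, and "r4" carries the same UMI but maps elsewhere, so both are kept.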

Usage

filterDuplicates(CSobject, outdir, ncores = 1, keepPairs = FALSE)

## S4 method for signature 'CapSet'
filterDuplicates(CSobject, outdir, ncores = 1, keepPairs = FALSE)

Arguments

CSobject

an object of class CapSet

outdir

character. Output directory for the filtered BAM files.

ncores

integer. Number of cores to use.

keepPairs

logical. Whether to keep read pairs in paired-end data (note: the two mates are treated as independent reads during duplicate removal). Also use keepPairs = TRUE for single-end data.

Value

modified CapSet object with filtering information. Filtered BAM files are saved in 'outdir'.

Examples

# before running this
# 1. Create a CapSet object
# 2. de-multiplex the fastqs
# 3. map them

# load a previously saved CapSet object
cs <- exampleCSobject()
# filter duplicate reads from mapped BAM files
dir.create("filtered_bam")
cs <- filterDuplicates(cs, outdir = "filtered_bam")

icetea documentation built on Nov. 8, 2020, 6:57 p.m.