checkRestrictionEnzymeSequence: Remove invalid 4C-seq reads from a SAM file

Description

Basic4Cseq offers filter functions for invalid 4C-seq reads. This function removes 4C-seq reads from a provided Sequence Alignment/Map (SAM) file that show mismatches in the restriction enzyme sequence.

Usage

checkRestrictionEnzymeSequence(firstCutter, inputFileName, outputFileName = "output.sam", keepOnlyUniqueReads = TRUE, writeStatistics = TRUE)

Arguments

firstCutter

First restriction enzyme sequence of the 4C-seq experiment

inputFileName

Name of the input SAM file that contains aligned reads for the 4C-seq experiment

outputFileName

Name of the output SAM file that is created to store the filtered 4C-seq reads

keepOnlyUniqueReads

If TRUE, delete non-unique reads. Information in the SAM flag field is used to determine whether a read is unique or not.

writeStatistics

If TRUE, write statistics (e.g. the number of unique reads) to a text file

Details

Valid 4C-seq reads start at a primary restriction site and continue with its downstream sequence, so any mismatch in the restriction enzyme sequence of a read indicates a potential misalignment. The mapping information for the restriction enzyme sequence bases of a read (if present) can therefore be used for filtering. checkRestrictionEnzymeSequence tests the first bases of a read (4 or 6 bp, depending on the length of the first restriction enzyme sequence) for mismatches. Reads with mismatches in the restriction enzyme sequence are deleted, and the filtered data are then written to a new SAM file. The function does not yet differentiate between blind and nonblind fragments, but it removes potential misalignments that may overlap with valid fragment ends and distort the true 4C-seq signal.
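
The filtering idea can be illustrated with a minimal sketch. It is not the package's implementation: the helper name hasIntactCutter and the toy reads are hypothetical, and the sketch compares raw read sequences to the cutter, whereas the function described above works on the mapping information in the SAM file. It also assumes forward-strand reads whose sequence still begins with the cutter.

```r
# Hypothetical helper, not part of Basic4Cseq: returns TRUE for each read
# whose first bases match the first cutter sequence exactly
# (case-insensitive, no mismatches tolerated).
hasIntactCutter <- function(readSequences, firstCutter) {
    cutterLength <- nchar(firstCutter)                     # typically 4 or 6 bp
    readStarts <- substr(readSequences, 1, cutterLength)   # leading bases of each read
    toupper(readStarts) == toupper(firstCutter)
}

# Toy reads (invented for illustration): the second has a mismatch in the
# cutter sequence, the fourth is shorter than the cutter.
reads <- c("AAGCTTGGATCCA", "AAGCTAGGATCCA", "aagcttTTTT", "AAG")
keep <- hasIntactCutter(reads, "aagctt")
filteredReads <- reads[keep]
```

Reads that fail the check are dropped, which mirrors the deletion step described above; handling of reverse-strand reads and of the SAM fields is deliberately left out of this sketch.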

Value

A SAM file containing the filtered valid 4C-seq reads

Note

The function can only be used if the restriction enzyme sequence is still present in the reads, i.e. it has not been trimmed or otherwise removed.

Author(s)

Carolin Walter

Examples

    if(interactive()) {
        file <- system.file("extdata", "fetalLiverCutter.sam", package="Basic4Cseq")
        checkRestrictionEnzymeSequence("aagctt", file)
    }
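
The same call can also be written with every argument from the Usage section spelled out. This is a sketch rather than an official example: the output file name fetalLiverCutter_filtered.sam and the choice of a temporary directory are assumptions made for illustration, and writeStatistics is set to FALSE here only so that no additional text file is written.

```r
# Assumed output location for illustration; any writable path would do.
outputFile <- file.path(tempdir(), "fetalLiverCutter_filtered.sam")

if (interactive()) {
    library(Basic4Cseq)
    inputFile <- system.file("extdata", "fetalLiverCutter.sam", package = "Basic4Cseq")
    checkRestrictionEnzymeSequence(
        firstCutter = "aagctt",          # first restriction enzyme sequence, as in the example above
        inputFileName = inputFile,
        outputFileName = outputFile,     # filtered reads are written here
        keepOnlyUniqueReads = TRUE,      # drop non-unique reads (default)
        writeStatistics = FALSE          # skip the statistics text file
    )
}
```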

Basic4Cseq documentation built on Nov. 8, 2020, 6:53 p.m.