prepare4CseqData: Alignment and filtering of raw 4C-seq data


Description

This function is an optional wrapper for the alignment and preliminary filtering of 4C-seq data. prepare4CseqData reads a 4C-seq fastq file from disk and aligns the reads with BWA; the function checkRestrictionEnzymeSequence can optionally be used to filter the reads. Samtools and bedtools are then used to intersect the filtered reads with a given 4C-seq fragment library for visualization purposes (e.g. with the Integrative Genomics Viewer, IGV).

Usage

prepare4CseqData(fastqFileName, firstCutter, fragmentLibrary, referenceGenome,
    pathToBWA = "", pathToSam = "", pathToBED = "",
    controlCutterSequence = FALSE, bwaThreads = 1, minFragEndLength = 0)

Arguments

fastqFileName

The name of the fastq file that contains the 4C-seq reads

firstCutter

First cutting enzyme sequence for the 4C-seq experiment, e.g. "AAGCTT"

fragmentLibrary

Name of the fragment library to use for the current 4C-seq experiment; has to correspond to the chosen cutters and chosen genome

referenceGenome

Name (plus path) of the reference genome to use

pathToBWA

Path to BWA

pathToSam

Path to samtools

pathToBED

Path to bedtools

controlCutterSequence

If TRUE, the function checkRestrictionEnzymeSequence is used to filter out invalid 4C-seq reads

bwaThreads

Number of BWA threads

minFragEndLength

Minimum fragment end length to use for BED export
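
Because the function hands several file names and a cutter sequence to external tools, it can help to check them before the call. The following is a minimal sketch only: checkPrepareInputs is a hypothetical helper, not part of Basic4Cseq, and it assumes that fastqFileName, fragmentLibrary and referenceGenome all refer to files on disk and that the cutter sequence consists only of the letters A, C, G and T.

```r
# Hypothetical helper (not part of Basic4Cseq): basic sanity checks on the
# inputs before calling prepare4CseqData. Returns a character vector of
# problems; an empty vector means that nothing suspicious was found.
checkPrepareInputs <- function(fastqFileName, firstCutter,
                               fragmentLibrary, referenceGenome) {
    problems <- character(0)

    # Assumption: the cutter is a single plain DNA string such as "AAGCTT"
    if (!is.character(firstCutter) || length(firstCutter) != 1 ||
        !grepl("^[ACGT]+$", firstCutter)) {
        problems <- c(problems,
                      "firstCutter is not a single string of A, C, G and T")
    }

    # Assumption: all three arguments name files that already exist
    files <- c(fastqFileName = fastqFileName,
               fragmentLibrary = fragmentLibrary,
               referenceGenome = referenceGenome)
    notFound <- names(files)[!file.exists(files)]
    if (length(notFound) > 0) {
        problems <- c(problems,
                      paste("file not found for argument:", notFound))
    }

    problems
}
```

A call such as checkPrepareInputs("veryShortExample.fastq", "CATG", "veryShortLib.csv", "veryShortReference.fasta") would then list any missing files before the alignment is started.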

Value

Computes and writes a sorted .bam file for the data, provided that BWA, samtools and bedtools are available
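
Since the output depends on the three external tools being available, a quick check beforehand may save a failed run. This is a hedged sketch using base R's Sys.which: findExternalTools is a hypothetical helper, and it assumes the executables are named bwa, samtools and bedtools and are reachable via the PATH. If they are installed elsewhere, the pathToBWA, pathToSam and pathToBED arguments are presumably the way to point the function at them.

```r
# Hypothetical helper (not part of Basic4Cseq): report which command-line
# tools can be found on the PATH. Returns a named logical vector.
findExternalTools <- function(tools = c("bwa", "samtools", "bedtools")) {
    paths <- Sys.which(tools)   # "" for every tool that is not found
    found <- nzchar(paths)
    names(found) <- tools
    found
}

toolStatus <- findExternalTools()
if (!all(toolStatus)) {
    message("Not found on PATH: ",
            paste(names(toolStatus)[!toolStatus], collapse = ", "))
}
```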

Author(s)

Carolin Walter

References

Li, H. and Durbin, R. (2009) Fast and accurate short read alignment with Burrows-Wheeler Transform, Bioinformatics, 25, 1754-60.

Thorvaldsdottir, H., Robinson, J. T. and Mesirov, J. P. (2012) Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration, Briefings in Bioinformatics.

See Also

checkRestrictionEnzymeSequence

Examples

    if(interactive()) {
        # BWA, samtools and bedtools must be installed
        # The example data files (from the package) are assumed to be in the current working directory
        prepare4CseqData("veryShortExample.fastq", "CATG", "veryShortLib.csv", referenceGenome = "veryShortReference.fasta")
    }

Basic4Cseq documentation built on Nov. 8, 2020, 6:53 p.m.