findChunks: Identifies 'chunks' of data within a set of aligned reads.



Description

This function identifies chunks of data within a set of aligned reads by looking for gaps in the alignments, that is, regions where no reads align. If we assume that a locus cannot span a gap of sufficient length, then the data can be analysed separately in the chunks defined by these gaps, which reduces the complexity of the segmentation problem.
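The gap-based idea described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `chunk_by_gap` is a hypothetical helper, it assumes all reads lie on a single chromosome, and it assumes that an uncovered stretch of at least `gap` bases starts a new chunk.

```r
# Hypothetical helper, not part of segmentSeq: label each read on a single
# chromosome with the index of the chunk it falls in.
chunk_by_gap <- function(start, end, gap) {
  ord <- order(start, end)
  s <- start[ord]
  e <- end[ord]
  # Furthest position covered by any read seen so far.
  covered <- cummax(e)
  # Number of uncovered bases between the reads so far and the next read.
  uncovered <- s[-1] - covered[-length(covered)] - 1
  # Assumption: a new chunk begins wherever the uncovered stretch is at
  # least `gap` bases long.
  chunk_sorted <- cumsum(c(TRUE, uncovered >= gap))
  # Return the labels in the original read order.
  chunk <- integer(length(start))
  chunk[ord] <- chunk_sorted
  chunk
}

starts <- c(1, 15, 200, 230, 600)
ends   <- c(20, 40, 220, 250, 620)
chunk_by_gap(starts, ends, gap = 100)
# 1 1 2 2 3
```

Using the running maximum of the read ends, rather than the end of the previous read alone, keeps a long read that overlaps several shorter ones from being split across chunks.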

Usage

findChunks(alignments, gap, checkDuplication = TRUE, justChunks = FALSE)

Arguments

alignments

A GRanges object defining a set of aligned reads.

gap

The minimum length of a gap across which it is assumed that no locus can exist.

checkDuplication

Should we check whether or not reads are duplicated within a chunk? Defaults to TRUE.

justChunks

If TRUE, returns a vector of the chunks rather than the GRanges object with chunks attached. Defaults to FALSE.

Details

This function is called by the readGeneric and readBAM functions but may usefully be called again if filtering of an alignmentData object has altered the data present, or to change the computational effort required for subsequent analysis. The lower the ‘gap’ parameter used to define the chunks, the faster (though potentially less accurate) any subsequent analyses will be.
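To illustrate why a lower ‘gap’ leads to faster analyses, the toy example below counts how many chunks result from different gap values. The counting function `count_chunks` and the coordinates are hypothetical, and the rule that an uncovered stretch of at least `gap` bases splits the data is an assumption. A smaller gap yields more, smaller chunks, each of which is a smaller problem to analyse.

```r
# Hypothetical helper, not part of segmentSeq: count the chunks produced
# on one chromosome for a given gap.
count_chunks <- function(start, end, gap) {
  ord <- order(start, end)
  s <- start[ord]
  covered <- cummax(end[ord])
  # Uncovered bases between the reads so far and the next read.
  uncovered <- s[-1] - covered[-length(covered)] - 1
  sum(uncovered >= gap) + 1
}

starts <- c(1, 50, 200, 1000)
ends   <- c(30, 80, 230, 1030)
gaps   <- c(10, 100, 500, 1000)

n_chunks <- vapply(gaps, function(g) count_chunks(starts, ends, g), numeric(1))
n_chunks
# 4 3 2 1
```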

Value

A modified GRanges object, now containing the column ‘chunk’ and, if ‘checkDuplication’ is TRUE, the column ‘chunkDup’. These identify the chunk to which each alignment belongs and whether the alignment of the tag is duplicated within the chunk, respectively. If ‘justChunks’ is TRUE, a vector of the chunks is returned instead.
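As a rough illustration of what a within-chunk duplication flag might look like, the base R sketch below marks every alignment whose tag occurs more than once in the same chunk. The data are invented, and whether segmentSeq flags all copies or only the later ones is not stated here, so the rule used is an assumption.

```r
# Toy data: tag sequence and chunk label for five alignments.
tag   <- c("ACGT", "ACGT", "TTGA", "ACGT", "TTGA")
chunk <- c(1L, 1L, 1L, 2L, 3L)

# Combine chunk and tag into one key, then flag keys seen more than once.
# Assumption: every copy of a repeated tag within a chunk is flagged.
key <- paste(chunk, tag, sep = ":")
chunk_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
chunk_dup
# TRUE TRUE FALSE FALSE FALSE
```

The tag "ACGT" also appears in chunk 2, but it is not flagged there because it occurs only once within that chunk.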

Author(s)

Thomas J. Hardcastle

Examples

# Load the package that provides readGeneric and findChunks.

library(segmentSeq)

# Define the files containing sample information.

datadir <- system.file("extdata", package = "segmentSeq")
libfiles <- c("SL9.txt", "SL10.txt", "SL26.txt", "SL32.txt")

# Establish the library names and replicate structure.

libnames <- c("SL9", "SL10", "SL26", "SL32")
replicates <- c(1,1,2,2)

# Read the files to produce an 'alignmentData' object.

alignData <- readGeneric(file = libfiles, dir = datadir, replicates =
replicates, libnames = libnames, gap = 100)

# Filter the data on number of matches of each tag to the genome

alignData <- alignData[values(alignData@alignments)$matches < 5,]

# Redefine the chunking structure of the data. findChunks returns a
# GRanges object, so store it back in the alignments slot.

alignData@alignments <- findChunks(alignData@alignments, gap = 100)

segmentSeq documentation built on Nov. 8, 2020, 5:18 p.m.