blatScores: blatScores assesses whether the improper read pairs at a...

View source: R/blat-utils.R

blatScores    R Documentation

blatScores assesses whether the improper read pairs at a rearrangement junction provide strong support for the rearrangement

Description

In the following, we refer to a 'record' as one row in the table of blat output – i.e., one (of possibly many) alignments for a read. This function adds an indicator for whether the blat alignment is consistent with the original alignment (is_overlap), an indicator of whether it passes quality control (passQC) (see details), the id of the rearrangement (rearrangement), and the sample id (id).

Usage

blatScores(blat, tags, id, min.tags = 5, prop.pass = 0.8)

Arguments

blat

a data.frame of results from command-line blat

tags

a data.frame containing read names and the original alignment locations

id

sample id

min.tags

the minimum number of tags that pass BLAT QC for each rearrangement

prop.pass

a length-one numeric vector indicating the fraction of reads at a rearrangement that must pass the read-level QC.

Details

Read-level QC:

For each record, we evaluate

  1. whether the matching score is at least 90

  2. whether the size of the target alignment (Tend - Tstart) is more than 80 bp and less than 120 bp

  3. whether the blat alignment overlaps with any of the original alignment

The records are then grouped by Qname. For each Qname, we compute the number of records that satisfy criteria 1 and 2 (a sufficiently high score and a size within the specified range). In addition, we compute the number of records that satisfy all three criteria, that is, records that also overlap with the original alignment. The Qname passes QC if both counts equal 1. In other words, a read passes QC only if it has exactly one record with a high BLAT score within the specified size range, and that record overlaps with the original alignment.
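The read-level rule above can be sketched in base R. This is only an illustration on toy data, not the package's implementation: the column names (Qname, match, Tstart, Tend, is_overlap), the helper read_level_qc, and its threshold arguments are assumptions, and the score criterion is taken as "at least 90".

```r
# Toy BLAT records; column names are assumed for illustration only.
records <- data.frame(
  Qname      = c("r1", "r2", "r2", "r3", "r4"),
  match      = c(98, 95, 93, 85, 97),
  Tstart     = c(100, 200, 5000, 300, 400),
  Tend       = c(200, 300, 5100, 400, 500),
  is_overlap = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one logical pass/fail value per Qname.
read_level_qc <- function(records, min.score = 90,
                          min.size = 80, max.size = 120) {
  size <- records$Tend - records$Tstart
  # Criteria 1 and 2: high score and size within range
  good <- records$match >= min.score & size > min.size & size < max.size
  # Criterion 3 in addition: overlap with the original alignment
  good_overlap <- good & records$is_overlap
  # Count qualifying records per read
  n_good <- tapply(good, records$Qname, sum)
  n_good_overlap <- tapply(good_overlap, records$Qname, sum)
  # A read passes only if both counts are exactly 1
  n_good == 1 & n_good_overlap == 1
}

passes <- read_level_qc(records)
print(passes)
```

In this toy input only r1 passes: r2 has two high-scoring records in range, r3 has a score below the threshold, and the single good record for r4 does not overlap the original alignment.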

Rearrangement-level QC:

Given a pass / fail designation for each read by the above analysis, we group the reads by the rearrangement id. A rearrangement passes QC if more than 80% of its reads pass the read-level QC (the default value of prop.pass).
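The grouping step can be sketched as follows. The data frame read_qc, the helper rearrangement_qc, and the way min.tags and prop.pass are combined (both conditions required) are assumptions made for illustration; the package source may apply the thresholds differently.

```r
# Toy read-level results; one row per read (assumed layout).
read_qc <- data.frame(
  rearrangement = c(rep("rear1", 6), rep("rear2", 6), rep("rear3", 3)),
  passQC = c(rep(TRUE, 6),
             TRUE, TRUE, TRUE, FALSE, FALSE, FALSE,
             rep(TRUE, 3)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise read-level QC per rearrangement.
rearrangement_qc <- function(read_qc, min.tags = 5, prop.pass = 0.8) {
  # Number and fraction of reads passing QC for each rearrangement
  n_pass <- tapply(read_qc$passQC, read_qc$rearrangement, sum)
  prop   <- tapply(read_qc$passQC, read_qc$rearrangement, mean)
  data.frame(
    rearrangement = names(n_pass),
    n_pass        = as.vector(n_pass),
    prop_pass     = as.vector(prop),
    # Assumed rule: enough passing reads AND a high enough passing fraction
    passes        = as.vector(n_pass >= min.tags & prop > prop.pass),
    stringsAsFactors = FALSE
  )
}

res <- rearrangement_qc(read_qc)
print(res)
```

Here rear1 passes (6 of 6 reads pass), rear2 fails on the passing fraction (3 of 6), and rear3 fails under the assumed min.tags rule because only 3 reads pass.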

Examples

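# Load the example read tags
data(tags)
# Locate and read the example BLAT output shipped with the svbams package
extdata <- system.file("extdata", package="svbams")
blat.file <- file.path(extdata, "blat_alignment.txt")
blat_aln <- readBlat(blat.file)
# Annotate the BLAT records using the tags
blat <- annotateBlatRecords(blat_aln, tags)
# id has no default in the usage above; "sample1" is a placeholder sample id
blatScores(blat, tags, id = "sample1")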

cancer-genomics/trellis documentation built on Feb. 2, 2023, 7:04 p.m.