preprocessData: Create a list of relevant information for calling...

View source: R/deletion-utils.R

preprocessData    R Documentation

Create a list of relevant information for calling deletions/amplifications

Description

Collects the preprocessed bin-level log2 ratios, the segmentation, the proper read pairs surrounding deletions, the improper read pairs supporting deletions, the path to the BAM file, and the reference genome build of the BAM file into a single list that can be used as input to the sv_deletions and sv_amplicons2 functions.

Usage

preprocessData(bam.file = NULL, genome, bins, segments, read_pairs)

Arguments

bam.file

length-one character vector providing path to BAM file

genome

length-one character vector providing genome build (hg18 or hg19)

bins

a GRanges object with log_ratio in the mcols

segments

a GRanges object with seg.mean in the mcols

read_pairs

a list of length 2 containing GAlignmentPairs objects. One element should be named proper_del and contain the proper read pairs that surround putative deletions, as returned by the properReadPairs function. The other element should be named improper and contain the improper read pairs supporting putative deletions, as returned by the getImproperAlignmentPairs function.
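
The layout expected for read_pairs can be checked before calling preprocessData. The following is a minimal base R sketch; check_read_pairs is a hypothetical helper written for illustration, not a trellis function, and it checks only the length and element names described above. The placeholder strings stand in for real GAlignmentPairs objects.

```r
## Hypothetical helper, not part of trellis: checks only the list layout
## (length and element names), not the class or contents of each element.
check_read_pairs <- function(read_pairs) {
  expected <- c("proper_del", "improper")
  if (!is.list(read_pairs) || length(read_pairs) != 2L) {
    stop("read_pairs must be a list of length 2")
  }
  missing_names <- setdiff(expected, names(read_pairs))
  if (length(missing_names) > 0L) {
    stop("read_pairs is missing element(s): ",
         paste(missing_names, collapse = ", "))
  }
  invisible(TRUE)
}

## Placeholder elements stand in for GAlignmentPairs objects.
ok <- list(proper_del = "placeholder", improper = "placeholder")
check_read_pairs(ok)

## A misnamed element is reported by name.
bad <- list(proper = "placeholder", improper = "placeholder")
result <- tryCatch(check_read_pairs(bad),
                   error = function(e) conditionMessage(e))
result
```

In practice the two elements would be the objects returned by properReadPairs and getImproperAlignmentPairs, as in the Examples section.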

Value

a list object collecting the supplied BAM file path, genome build, bins, segments, and read pairs

Examples

library(svbams)
library(svfilters.hg19)
data(bins1kb)
extdata <- system.file("extdata", package="svbams")
bamfile <- file.path(extdata, "cgov44t_revised.bam")
## Extract all improper read pairs
what <- c("flag", "mrnm", "mpos", "mapq")
iparams <- improperAlignmentParams(what=what)
improper_rp <- getImproperAlignmentPairs(bamfile,
                                         param=iparams,
                                         build="hg19")

ddir <- system.file("extdata", package="svbams",
                    mustWork=TRUE)
## load normalized read depth (see trellis vignette)
lr <- readRDS(file.path(ddir, "preprocessed_coverage.rds"))/1000
seqlevels(bins1kb, pruning.mode="coarse") <- paste0("chr", c(1:22, "X"))
bins1kb$log_ratio <- lr
bins <- keepSeqlevels(bins1kb, c("chr5", "chr8", "chr15"),
                      pruning.mode="coarse")
## Load segmentation data
path <- system.file("extdata", package="svbams")
segs <- readRDS(file.path(path, "cgov44t_segments.rds"))
seqlevels(segs, pruning.mode="coarse") <- seqlevels(bins)

## candidate deletions
dp <- DeletionParam(remove_hemizygous=FALSE)
dp
del.gr <- IRanges::reduce(segs[segs$seg.mean < hemizygousThr(dp)],
                          min.gapwidth=2000)

## sample properly and improperly paired read pairs near candidate deletions
proper_rp <- properReadPairs(bamfile, gr=del.gr, dp)
improper_rp <- keepSeqlevels(improper_rp, seqlevels(segs),
                             pruning.mode="coarse")
read_pairs <- list(proper_del=proper_rp, improper=improper_rp)
## Collect data from preprocessing in a single list object
pdata <- preprocessData(bam.file=bamfile,
                        genome="hg19",
                        bins=bins1kb,
                        segments=segs,
                        read_pairs=read_pairs)

cancer-genomics/trellis documentation built on Aug. 20, 2024, 5:48 p.m.