In jrboyd/seqsetvis: Set Based Visualizations for Next-Gen Sequencing Data

ssvFetchBamPE

R Documentation

ssvFetchBam for paired-end ChIP-seq files. Only concordant reads are considered. This has been minimally tested, so please verify results.

Description

Iterates over a character vector (ideally named) and calls ssvFetchBamPE.single on each element. Appends a grouping variable to each resulting data.table and uses rbindlist to combine the results efficiently.

Usage

ssvFetchBamPE(
  file_paths,
  qgr,
  unique_names = NULL,
  win_size = 50,
  win_method = c("sample", "summary")[1],
  summary_FUN = stats::weighted.mean,
  fragLens = "not_used",
  anchor = c("left", "left_unstranded", "center", "center_unstranded")[3],
  names_variable = "sample",
  return_data.table = FALSE,
  max_dupes = Inf,
  n_cores = getOption("mc.cores", 1),
  n_region_splits = 1,
  min_isize = 1,
  max_isize = Inf,
  return_unprocessed = FALSE,
  return_fragSizes = FALSE,
  force_skip_centerFix = FALSE,
  ...
)

Arguments

`file_paths`	character vector of file_paths to load from. Alternatively, file_paths can be a data.frame or data.table whose first column is a character vector of paths and whose additional columns will be used as metadata.
`qgr`	Set of GRanges to query. For valid results the width of each interval should be identical and evenly divisible by `win_size`.
`unique_names`	names to use in final data.table to designate source BAM file. Default is NULL.
`win_size`	The window size that evenly divides widths in `qgr`.
`win_method`	character. one of c("sample", "summary"). Determines if `viewGRangesWinSample_dt` or `viewGRangesWinSummary_dt` is used to represent each region in qgr.
`summary_FUN`	function. only relevant if win_method is "summary". passed to `viewGRangesWinSummary_dt`.
`fragLens`	never used by ssvFetchBamPE. Ignore.
`anchor`	character, one of c("left", "left_unstranded", "center", "center_unstranded"). Default is "center".
`names_variable`	The column name where unique_names are stored.
`return_data.table`	logical. If TRUE the internal data.table is returned instead of GRanges. Default is FALSE.
`max_dupes`	numeric >= 1. Duplicate reads, identified by stranded start position, in excess of this number are removed. Default is Inf.
`n_cores`	integer number of cores to use. Uses mc.cores option if not supplied.
`n_region_splits`	integer number of splits to apply to qgr. The query GRanges will be split into this many roughly equal parts for increased parallelization. Default is 1, no split.
`min_isize`	integer. Read pairs must have an isize greater than or equal to this value. Default is 1.
`max_isize`	integer. Read pairs must have an isize less than or equal to this value. Default is Inf.
`return_unprocessed`	boolean. if TRUE returns read alignment in data.table. Default is FALSE.
`return_fragSizes`	boolean. if TRUE returns fragment sizes for all reads per region.
`force_skip_centerFix`	boolean, if TRUE all query ranges will be used "as is". This is already the case by default if win_method == "summary" but may have applications where win_method == "sample".
`...`	passed to Rsamtools::ScanBamParam().
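
The width requirement on `qgr` can be checked before fetching. The following is a minimal base R sketch, not part of seqsetvis: `check_widths` is a hypothetical helper that takes a plain vector of interval widths (for a GRanges, these could be obtained with `width(qgr)` after loading GenomicRanges).

```r
# Hypothetical helper (not a seqsetvis function): tests the condition
# described for `qgr`, i.e. all widths identical and divisible by win_size.
check_widths <- function(widths, win_size) {
  same_width <- length(unique(widths)) == 1   # every interval has one shared width
  divisible <- all(widths %% win_size == 0)   # no remainder when split into windows
  same_width && divisible
}

# Illustrative widths only; these are not taken from any real peak set.
check_widths(c(500, 500, 500), win_size = 50)
check_widths(c(500, 480), win_size = 50)
```

The first call satisfies both conditions; the second fails because the widths differ and 480 is not a multiple of 50.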

Details

In contrast to ssvFetchBam, extension of reads to estimated fragment size is not an issue as each read pair represents a fragment of exact size.
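
Because each read pair carries its own fragment size, the `min_isize` and `max_isize` arguments amount to a simple bounds filter. The sketch below illustrates that filter in base R under stated assumptions: the `isize` values are invented, sizes are assumed to be positive, and `keep_by_isize` is a hypothetical helper rather than the package's internal code.

```r
# Invented fragment sizes (isize) for five read pairs; illustration only.
isize <- c(35, 120, 180, 450, 2000)

# Hypothetical helper mirroring the documented inclusive bounds:
# isize >= min_isize and isize <= max_isize.
keep_by_isize <- function(isize, min_isize = 1, max_isize = Inf) {
  isize >= min_isize & isize <= max_isize
}

# Keep only fragments between 50 and 500 bp.
kept <- isize[keep_by_isize(isize, min_isize = 50, max_isize = 500)]
kept
```

With the default bounds (1 and Inf) every positive fragment size passes, which matches the documented defaults of no size filtering.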

ssvFetchBamPE iteratively calls ssvFetchBamPE.single. See ssvFetchBamPE.single for more info.

If qgr contains the range chr1:1-100 and win_size is 10, values from chr1 positions 5, 15, 25, ..., 85, and 95 will be retrieved from each BAM file.
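
The sampled positions in this example can be reproduced with simple arithmetic. This is a sketch of the arithmetic only, assuming an even `win_size`; `window_positions` is a hypothetical helper and is not claimed to be how seqsetvis computes positions internally.

```r
# Hypothetical helper: one sampled position per window of size win_size,
# chosen so that chr1:1-100 with win_size = 10 yields 5, 15, ..., 95.
window_positions <- function(start, end, win_size) {
  first <- start + win_size %/% 2 - 1   # position sampled in the first window
  seq(from = first, to = end, by = win_size)
}

pos <- window_positions(start = 1, end = 100, win_size = 10)
pos
```

A 100 bp range with `win_size = 10` gives 10 positions, one per window, which is why the range width should be evenly divisible by `win_size`.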

Value

A tidy formatted GRanges (or data.table if specified) containing fetched values.

Examples

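if(Sys.info()['sysname'] != "Windows"){
    library(seqsetvis)
    library(GenomicRanges)
    # paired-end BAM file bundled with seqsetvis
    bam_f = system.file("extdata/Bcell_PE.mm10.bam",
        package = "seqsetvis", mustWork = TRUE)
    # named character vector of file paths
    bam_files = c("a" = bam_f, "b" = bam_f)
    data("Bcell_peaks")
    qgr = Bcell_peaks
    # default output is a GRanges
    bw_gr = ssvFetchBamPE(bam_files, qgr, win_size = 10)
    # same query with the file paths supplied as a list
    bw_gr2 = ssvFetchBamPE(as.list(bam_files), qgr, win_size = 10)

    # return a data.table instead of a GRanges
    bw_dt = ssvFetchBamPE(bam_files, qgr, win_size = 10,
        return_data.table = TRUE)
}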

jrboyd/seqsetvis documentation built on Jan. 16, 2025, 10:25 a.m.