ssvFetchBamPE: ssvFetchBam for paired-end ChIP-seq files. Only concordant...

Description Usage Arguments Details Value Examples

View source: R/functions_fetch_bamPE.R

Description

Iterates a character vector (ideally named) and calls ssvFetchBamPE.single on each. Appends grouping variable to each resulting data.table and uses rbindlist to efficiently combine results

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
ssvFetchBamPE(
  file_paths,
  qgr,
  unique_names = NULL,
  win_size = 50,
  win_method = c("sample", "summary")[1],
  summary_FUN = stats::weighted.mean,
  anchor = c("left", "left_unstranded", "center", "center_unstranded")[3],
  names_variable = "sample",
  return_data.table = FALSE,
  max_dupes = Inf,
  n_cores = getOption("mc.cores", 1),
  n_region_splits = 1,
  min_isize = 1,
  max_isize = Inf,
  return_unprocessed = FALSE,
  force_skip_centerFix = FALSE,
  ...
)

Arguments

file_paths

character vector of file_paths to load from. Alternatively, file_paths can be a data.frame or data.table whose first column is a character vector of paths and additial columns will be used as metadata.

qgr

Set of GRanges to query. For valid results the width of each interval should be identical and evenly divisible by win_size.

unique_names

names to use in final data.table to designate source bigwig. Default is 'sample'

win_size

The window size that evenly divides widths in qgr.

win_method

character. one of c("sample", "summary"). Determines if viewGRangesWinSample_dt or viewGRangesWinSummary_dt is used to represent each region in qgr.

summary_FUN

function. only relevant if win_method is "summary". passed to viewGRangesWinSummary_dt.

anchor

character, one of c("center", "center_unstranded", "left", "left_unstranded")

names_variable

The column name where unique_names are stored.

return_data.table

logical. If TRUE the internal data.table is returned instead of GRanges. Default is FALSE.

max_dupes

numeric >= 1. duplicate reads by strandd start position over this number are removed, Default is Inf.

n_cores

integer number of cores to use.

n_region_splits

integer number of splits to apply to qgr. The query GRanges will be split into this many roughly equal parts for increased parallelization. Default is 1, no split.

min_isize

integer. Read pairs must have an isize greater than or equal to this value. Default is 1.

max_isize

integer. Read pairs must have an isize less than or equal to this value. Default is Inf.

return_unprocessed

boolean. if TRUE returns read alignment in data.table. Default is FALSE.

force_skip_centerFix

boolean, if TRUE all query ranges will be used "as is". This is already the case by default if win_method == "summary" but may have applications where win_method == "sample".

...

passed to Rsamtools::ScanBamParam() Uses mc.cores option if not supplied.

Details

#' In contrast to ssvFetchBam, extension of reads to estimated fragment size is not an issue as each read pair represents a fragment of exact size.

ssvFetchBamPE iteratively calls fetchWindowedBam.single. See ssvFetchBamPE.single for more info.

if qgr contains the range chr1:1-100 and win_size is 10, values from positions chr1 5,15,25...85, and 95 will be retrieved from bw_file

Value

A tidy formatted GRanges (or data.table if specified) containing fetched values.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
if(Sys.info()['sysname'] != "Windows"){
library(GenomicRanges)
bam_f = system.file("extdata/Bcell_PE.mm10.bam",
    package = "seqsetvis", mustWork = TRUE)
bam_files = c("a" = bam_f, "b" = bam_f)
data("Bcell_peaks")
qgr = Bcell_peaks
bw_gr = ssvFetchBamPE(bam_files, qgr, win_size = 10)
bw_gr2 = ssvFetchBamPE(as.list(bam_files), qgr, win_size = 10)

bw_dt = ssvFetchBamPE(bam_files, qgr, win_size = 10,
    return_data.table = TRUE)
}

seqsetvis documentation built on Nov. 8, 2020, 5:57 p.m.