bedtools_merge: bedtools_merge

View source: R/merge.R

bedtools_merge    R Documentation

bedtools_merge

Description

Collapse overlapping and adjacent ranges into a single range, i.e., reduce the ranges. Then, group the original ranges by reduced range and aggregate. By default, the scores are summed.

Usage

bedtools_merge(cmd = "--help")
R_bedtools_merge(i, s = FALSE, S = c("any", "+", "-"), d = 0L, c = NULL,
                 o = "sum", delim = ",")
do_bedtools_merge(i, s = FALSE, S = c("any", "+", "-"), d = 0L, c = NULL,
                  o = "sum", delim = ",")

Arguments

cmd

String of bedtools command line arguments, as they would be entered at the shell. There are a few incompatibilities between the docopt parser and the bedtools style. See argument parsing.

i

Path to a BAM/BED/GFF/VCF/etc file, a BED stream, a file object, or a ranged data structure, such as a GRanges. Use "stdin" for input from another process (presumably while running via Rscript). For streaming from a subprocess, prefix the command string with “<”, e.g., "<grep foo file.bed". Any streamed data is assumed to be in BED format. These are the ranges that are merged.

s

Require same strandedness. That is, only merge features that are on the same strand. By default, features are merged without respect to strand. Note that this is the exact opposite of Bioconductor behavior.

S

Force merge for one specific strand only. Follow with + or - to force merge from only the forward or reverse strand, respectively. By default, merging is done without respect to strand.

d

Maximum distance between features allowed for features to be merged. Default is 0. That is, overlapping and/or book-ended features are merged.

c

Specify columns (by integer index) from the input file to operate upon (see o option, below). Multiple columns can be specified in a comma-delimited list.

o

Specify the operations (by name) that should be applied to the columns indicated in c. Multiple operations can be specified in a comma-delimited list. Recycling is used to align c and o. See bedtools_groupby for the available operations. Defaults to the “sum” operation.

delim

Delimiter character used to collapse strings.
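To illustrate the d argument, here is a minimal sketch that uses reduce from IRanges directly. It assumes that a bedtools distance of d corresponds to min.gapwidth = d + 1 in reduce; that mapping is an assumption made for illustration, not something this page states, and the toy ranges are hypothetical.

```r
library(IRanges)

# Toy ranges (hypothetical data): the first two are book-ended,
# the third starts after a gap of four positions (21 to 24).
ir <- IRanges(start = c(1, 11, 25), end = c(10, 20, 30))

# Analogue of d = 0: only overlapping or book-ended ranges are merged.
merged_d0 <- reduce(ir, min.gapwidth = 1L)

# Analogue of d = 4 (assumed mapping: min.gapwidth = d + 1):
# the four-position gap is now bridged as well.
merged_d4 <- reduce(ir, min.gapwidth = 5L)

merged_d0
merged_d4
```

With the first setting the book-ended pair collapses into one range and the third range stays separate; with the second, all three collapse into a single range.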

Details

As with all commands, there are three interfaces to the merge command:

bedtools_merge

Parses the bedtools command line and compiles it to the equivalent R code.

R_bedtools_merge

Accepts R arguments corresponding to the command line arguments and compiles the equivalent R code.

do_bedtools_merge

Evaluates the result of R_bedtools_merge. Recommended only for demonstration and testing. It is best to integrate the compiled code into an R script, after studying it.
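The three interfaces can be compared side by side. The sketch below assumes that HelloRanges is installed and that a.bed is available in the package's unit-test data directory, as in the Examples section; the choice of d = 10 is arbitrary.

```r
library(HelloRanges)

# Work in the unit-test data directory used by the Examples section.
setwd(system.file("unitTests", "data", "merge", package = "HelloRanges"))

# 1. Compile from a bedtools-style command line.
code_cli <- bedtools_merge("-i a.bed -d 10")

# 2. Compile from the corresponding R arguments.
code_r <- R_bedtools_merge(i = "a.bed", d = 10L)

# Both results are language objects; print one to study the generated code.
print(code_r)

# 3. Compile and evaluate in one step (for demonstration and testing).
merged <- do_bedtools_merge(i = "a.bed", d = 10L)
```

Printing the compiled code before evaluating it follows the recommendation above to study the code and then integrate it into a script.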

The workhorse for reduction is reduce. Passing with.revmap=TRUE to reduce causes it to record, for each reduced range, a list of integers indexing the original ranges that were merged into it. That list can be passed directly to aggregate to aggregate the original ranges.

Since the grouping information is preserved in the result, this function serves as a proxy for bedtools cluster.
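The reduce-then-aggregate pattern described above can be sketched by hand with GenomicRanges. This is an illustration, not the code that bedtools_merge generates: the input ranges are hypothetical, and extractList followed by sum stands in for the aggregate call.

```r
library(GenomicRanges)

# Toy input (hypothetical data): two overlapping ranges and one isolated range.
gr <- GRanges("chr1",
              IRanges(start = c(1, 5, 20), end = c(10, 15, 30)),
              score = c(1, 2, 3))

# Reduce without regard to strand, keeping the mapping back to the input.
ans <- reduce(gr, ignore.strand = TRUE, with.revmap = TRUE)
revmap <- mcols(ans)$revmap  # IntegerList: one element per reduced range

# Group the original scores by reduced range and sum each group.
mcols(ans)$score.sum <- sum(extractList(gr$score, revmap))

# Cluster-style labels: for each input range, the index of its reduced range.
cluster_id <- integer(length(gr))
cluster_id[unlist(revmap)] <- rep(seq_along(revmap), elementNROWS(revmap))

ans
cluster_id
```

The cluster_id vector shows how the preserved grouping doubles as a cluster assignment: input ranges that share a label were merged into the same reduced range.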

Value

A language object containing the compiled R code, evaluating to a DataFrame with a “grouping” column corresponding to as(hits, "List"), and a column for each summary.

Author(s)

Michael Lawrence

References

http://bedtools.readthedocs.io/en/latest/content/tools/merge.html

See Also

bedtools_groupby for more details on bedtools-style aggregation, reduce for merging, aggregate-methods for aggregating.

Examples

## Not run: 
setwd(system.file("unitTests", "data", "merge", package="HelloRanges"))

## End(Not run)
## default behavior, sum the score
bedtools_merge("-i a.bed")
## count the seqnames
bedtools_merge("-i a.bed -c 1 -o count")
## collapse the names using "|" as the delimiter
bedtools_merge("-i a.names.bed -delim \"|\" -c 4 -o collapse")
## collapse the names and sum the scores
bedtools_merge("-i a.full.bed -c 4,5 -o collapse,sum")
## count and sum the scores
bedtools_merge("-i a.full.bed -c 5 -o count,sum")
## only merge the positive strand features
bedtools_merge("-i a.full.bed -S +")

lawremi/HelloRanges documentation built on Oct. 29, 2023, 4:08 p.m.