bedtools_merge: bedtools_merge

View source: R/merge.R

bedtools_mergeR Documentation



Collapse overlapping and adjacent ranges into a single range, i.e., reduce the ranges. Then, group the original ranges by reduced range and aggregate. By default, the scores are summed.


bedtools_merge(cmd = "--help")
R_bedtools_merge(i, s = FALSE, S = c("any", "+", "-"), d = 0L, c = NULL,
                 o = "sum", delim = ",")
do_bedtools_merge(i, s = FALSE, S = c("any", "+", "-"), d = 0L, c = NULL,
                  o = "sum", delim = ",")



String of bedtools command line arguments, as they would be entered at the shell. There are a few incompatibilities between the docopt parser and the bedtools style. See argument parsing.


Path to a BAM/BED/GFF/VCF/etc file, a BED stream, a file object, or a ranged data structure, such as a GRanges. Use "stdin" for input from another process (presumably while running via Rscript). For streaming from a subprocess, prefix the command string with “<”, e.g., "<grep foo file.bed". Any streamed data is assumed to be in BED format. These are the ranges that are merged.


Require same strandedness. That is, find the jaccard feature in b that overlaps a on the same strand. By default, overlaps are reported without respect to strand. Note that this is the exact opposite of Bioconductor behavior.


Force merge for one specific strand only. Follow with + or - to force merge from only the forward or reverse strand, respectively. By default, merging is done without respect to strand.


Maximum distance between features allowed for features to be merged. Default is 0. That is, overlapping and/or book-ended features are merged.


Specify columns (by integer index) from the input file to operate upon (see o option, below). Multiple columns can be specified in a comma-delimited list.


Specify the operations (by name) that should be applied to the columns indicated in c. Multiple operations can be specified in a comma-delimited list. Recycling is used to align c and o. See bedtools_groupby for the available operations. Defaults to the “sum” operation.


Delimiter character used to collapse strings.


As with all commands, there are three interfaces to the merge command:


Parses the bedtools command line and compiles it to the equivalent R code.


Accepts R arguments corresponding to the command line arguments and compiles the equivalent R code.


Evaluates the result of R_bedtools_merge. Recommended only for demonstration and testing. It is best to integrate the compiled code into an R script, after studying it.

The workhorse for reduction is reduce. Passing with.revmap=TRUE to reduce causes it to return a list of integers, which can be passed directly to aggregate to aggregate the original ranges.

Since the grouping information is preserved in the result, this function serves as a proxy for bedtools cluster.


A language object containing the compiled R code, evaluating to a DataFrame with a “grouping” column corresponding to as(hits, "List"), and a column for each summary.


Michael Lawrence


See Also

bedtools_groupby for more details on bedtools-style aggregation, reduce for merging, aggregate-methods for aggregating.


## Not run: 
setwd(system.file("unitTests", "data", "merge", package="HelloRanges"))

## End(Not run)
## default behavior, sum the score
bedtools_merge("-i a.bed")
## count the seqnames
bedtools_merge("-i a.bed -c 1 -o count")
## collapse the names using "|" as the delimiter
bedtools_merge("-i a.names.bed -delim \"|\" -c 4  -o collapse")
## collapse the names and sum the scores
bedtools_merge("-i a.full.bed -c 4,5  -o collapse,sum")
## count and sum the scores
bedtools_merge("-i a.full.bed -c 5  -o count,sum")
## only merge the positive strand features
bedtools_merge("-i a.full.bed -S +")

lawremi/HelloRanges documentation built on April 20, 2022, 5:40 p.m.