Query sequence from a FASTA file given a set of ranges, including compound regions like transcripts and junction reads. This assumes the sequence is DNA.
String of bedtools command line arguments, as they would be entered at the shell. There are a few incompatibilities between the docopt parser and the bedtools style. See argument parsing.
Path to a BAM/BED/GFF/VCF/etc file, a BED stream, a file object, or
a ranged data structure, such as a GRanges. Use
Column index(es) for grouping the input. Columns may be comma-separated. By default, the grouping is by range.
Specify columns (by integer index) from the input file to operate on.
Specify the operations (by name) that should be applied to the columns indicated in c.
Delimiter character used to collapse strings.
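To make the role of the delimiter concrete, here is a minimal base R sketch of a collapse-style aggregation. The table `feat` and its column names are invented for illustration, and this is not the code that HelloRanges itself generates.

```r
# Hypothetical toy table; the names are assumptions made for this sketch.
feat <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  name  = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Group the names by chromosome, then join each group into a single
# string using ";" as the delimiter.
collapsed <- vapply(split(feat$name, feat$chrom), paste, character(1),
                    collapse = ";")
print(collapsed)
```

Changing the `collapse` value is the base R counterpart of choosing a different delimiter.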
As with all commands, there are three interfaces to the groupby command:
bedtools_groupby: Parses the bedtools command line and compiles it to the equivalent R code.
R_bedtools_groupby: Accepts R arguments corresponding to the command line arguments and compiles the equivalent R code.
do_bedtools_groupby: Evaluates the result of R_bedtools_groupby. Recommended only for demonstration and testing. It is best to integrate the compiled code into an R script, after studying it.
The workhorse for aggregation in R is aggregate, and we have extended its interface to make it more convenient. See aggregate for details.
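As a rough illustration of the kind of grouping that aggregate performs, the sketch below uses the base R formula interface on a plain data.frame. The table `bed` and its columns are assumptions made for this example. The extended interface mentioned above is not shown here.

```r
# Toy stand-in for a BED-like table; the column names are made up.
bed <- data.frame(
  chrom = c("chr1", "chr1", "chr2", "chr2", "chr2"),
  start = c(0L, 100L, 50L, 200L, 400L),
  end   = c(50L, 150L, 120L, 260L, 450L),
  score = c(10, 20, 5, 15, 40)
)

# Loose analogue of `-g 1 -c 4 -o mean` on this table:
# group by chrom (column 1) and average score (column 4).
mean_by_chrom <- aggregate(score ~ chrom, data = bed, FUN = mean)
print(mean_by_chrom)
```

The result has one row per group, with one column for the grouping variable and one for the summarized variable.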
The following operations are supported (with R translation):
For the sake of simplicity, and because the use cases are not clear, we do not support aggregation of every column. Here are some of the restrictions:
No support for the last column of GFF (the ragged list of attributes).
No support for the INFO, FORMAT and GENO fields of VCF.
No support for the FLAG field of BAM (bedtools does not support this either).
A language object containing the compiled R code, generally evaluating to a DataFrame, with a column for each grouping variable and each summarized variable. As a special case, if there are no grouping variables specified, then the grouping is by range, and an aggregated GRanges is returned.
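For readers unfamiliar with language objects, the following base R sketch shows how unevaluated code can be stored, inspected, and evaluated later. The call here is written by hand and the `scores` table is hypothetical. The code that HelloRanges actually compiles will look different.

```r
# Hypothetical input table for the sketch.
scores <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  score = c(1, 3, 10)
)

# quote() captures the call without running it, giving a language object.
code <- quote(aggregate(score ~ chrom, data = scores, FUN = sum))
print(code)          # inspect the code before deciding to run it

# eval() runs the stored call in the current environment.
result <- eval(code)
print(result)
```

Printing the object before evaluating it is what makes it practical to study the code and paste it into a script.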
We admit that using column subscripts for c makes code hard to read. All the more reason to just write R code.
aggregate-methods for general aggregation.
## Not run:
setwd(system.file("unitTests", "data", "groupby", package="HelloRanges"))
## End(Not run)

## aggregation by range
bedtools_groupby("-i values3.header.bed -c 5")

## average variant qualities by chromosome and reference base
## Not run:
indexTabix(bgzip("a_vcfSVtest.vcf", overwrite=TRUE), "vcf")
## End(Not run)
bedtools_groupby("-i a_vcfSVtest.vcf.bgz -g 1,4 -c 6 -o mean")