bedtools_getfasta: bedtools_getfasta
In lawremi/HelloRanges: Introduce *Ranges to bedtools users

bedtools_getfasta

Description

Query sequence from a FASTA file given a set of ranges, including compound regions like transcripts and junction reads. This assumes the sequence is DNA.

Usage

    bedtools_getfasta(cmd = "--help")
    R_bedtools_getfasta(fi, bed, s = FALSE, split = FALSE)
    do_bedtools_getfasta(fi, bed, s = FALSE, split = FALSE)

Arguments

`cmd`	String of bedtools command line arguments, as they would be entered at the shell. There are a few incompatibilities between the docopt parser and the bedtools style. See argument parsing.
`fi`	Path to a FASTA file, or an XStringSet object.
`bed`	Path to a BAM/BED/GFF/VCF/etc file, a BED stream, a file object, or a ranged data structure, such as a GRanges, as the query. Use `"stdin"` for input from another process (presumably while running via `Rscript`). For streaming from a subprocess, prefix the command string with “<”, e.g., `"<grep foo file.bed"`. Any streamed data is assumed to be in BED format.
`s`	Force strandedness. If the feature occupies the antisense strand, the sequence will be reverse complemented.
`split`	Given BED12 or BAM input, extract and concatenate the sequences from the blocks (e.g., exons).
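
To make the `s` and `split` semantics concrete, here is a minimal sketch that uses Biostrings directly. It illustrates the intended result only and is not the code that HelloRanges generates; the reference sequence and coordinates are made up for the example.

```r
library(Biostrings)
library(IRanges)

# Hypothetical 8-base reference sequence
ref <- DNAString("AACCGTTT")

# `s`: a feature on the antisense strand is reported as the
# reverse complement of the sequence under its range
sense <- subseq(ref, start = 2, end = 5)
antisense <- reverseComplement(sense)

# `split`: extract each block (e.g., exon) and concatenate the pieces in order
blocks <- IRanges(start = c(1, 6), end = c(3, 8))
spliced <- unlist(extractAt(ref, blocks))

as.character(sense)      # "ACCG"
as.character(antisense)  # "CGGT"
as.character(spliced)    # "AACTTT"
```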

Details

As with all commands, there are three interfaces to the getfasta command:

bedtools_getfasta: Parses the bedtools command line and compiles it to the equivalent R code.
R_bedtools_getfasta: Accepts R arguments corresponding to the command line arguments and compiles the equivalent R code.
do_bedtools_getfasta: Evaluates the result of R_bedtools_getfasta. Recommended only for demonstration and testing. It is best to integrate the compiled code into an R script, after studying it.
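
The relationship between the compiling and evaluating interfaces can be sketched as follows. This assumes HelloRanges is installed and uses the test files shipped with the package; the variable names are arbitrary.

```r
library(HelloRanges)

# Test data shipped with HelloRanges
dir <- system.file("unitTests", "data", "getfasta", package = "HelloRanges")
fi <- file.path(dir, "t.fa")
bed <- file.path(dir, "blocks.bed")

# Compile the query to R code; the result is a language object
code <- R_bedtools_getfasta(fi, bed)
print(code)

# Evaluate the compiled code explicitly ...
ans <- eval(code)

# ... or compile and evaluate in one step (demonstration and testing only)
ans2 <- do_bedtools_getfasta(fi, bed)
```

Printing `code` before evaluating it is the intended workflow: the compiled expression can be copied into a script and adapted.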

We recommend retrieving reference sequence through a BSgenome package, either custom or provided by Bioconductor, and calling getSeq to query specific regions of the BSgenome object. If the sequence must come from a file, consider converting it to 2bit or FA (razip) format, both of which support indexed access through import and its which argument.
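
As a rough illustration of indexed access, the sketch below writes a small sequence set to a 2bit file and reads back only the requested range. It assumes rtracklayer, Biostrings, and GenomicRanges are installed; the sequences, names, and coordinates are invented for the example.

```r
library(rtracklayer)
library(Biostrings)
library(GenomicRanges)

# Hypothetical sequences standing in for a reference genome
seqs <- DNAStringSet(c(chr1 = "ACGTACGTAC", chr2 = "TTTTGGGGCC"))

# Convert to 2bit so that regions can be read without loading the whole file
path <- tempfile(fileext = ".2bit")
export(seqs, TwoBitFile(path))

# Read back only the region of interest via the `which` argument
gr <- GRanges("chr1", IRanges(start = 2, end = 5))
region <- import(TwoBitFile(path), which = gr)
```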

If a plain FASTA file is the only option, the entire file must be read with readDNAStringSet, after which regions are extracted using x[gr], where x is the resulting DNAStringSet and gr is a GRanges or GRangesList.
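
A minimal sketch of this whole-file approach, assuming Biostrings and GenomicRanges are installed. The FASTA content, sequence names, and coordinates are made up so that the example is self-contained.

```r
library(Biostrings)
library(GenomicRanges)

# Write a tiny FASTA file (hypothetical data)
fa <- tempfile(fileext = ".fa")
writeXStringSet(DNAStringSet(c(chr1 = "ACGTACGTAC", chr2 = "TTTTGGGGCC")), fa)

# Read the whole file into memory
x <- readDNAStringSet(fa)

# Query ranges; seqnames must match the FASTA record names
gr <- GRanges(c("chr1", "chr2"), IRanges(start = c(2, 5), end = c(5, 8)))

# Subsetting by a GRanges extracts one subsequence per range
res <- x[gr]
```

Because the full file is loaded, memory use grows with the size of the FASTA file, not with the number of ranges queried.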

Value

A language object containing the compiled R code, evaluating to a DNAStringSet object.

Author(s)

Michael Lawrence

References

http://bedtools.readthedocs.io/en/latest/content/tools/getfasta.html

Examples

    ## Not run:
    setwd(system.file("unitTests", "data", "getfasta", package="HelloRanges"))
    ## End(Not run)

    ## simple query
    bedtools_getfasta("--fi t.fa -bed blocks.bed")
    ## get spliced transcript/read sequence
    bedtools_getfasta("--fi t.fa -bed blocks.bed -split")

lawremi/HelloRanges documentation built on Oct. 29, 2023, 4:08 p.m.
