bedtools_nuc: bedtools_nuc

View source: R/nuc.R

bedtools_nuc R Documentation

bedtools_nuc

Description

Summarize DNA sequences over the specified ranges.

Usage

    bedtools_nuc(cmd = "--help")
    R_bedtools_nuc(fi, bed, s = FALSE, pattern = NULL, fullHeader = FALSE)
    do_bedtools_nuc(fi, bed, s = FALSE, pattern = NULL, fullHeader = FALSE)

Arguments

cmd

String of bedtools command line arguments, as they would be entered at the shell. There are a few incompatibilities between the docopt parser and the bedtools style. See argument parsing.

fi

Path to a FASTA file, or an XStringSet.

bed

Path to a BAM/BED/GFF/VCF/etc. file, a BED stream, a file object, or a ranged data structure, such as a GRanges, to use as the query. Use "stdin" for input from another process (presumably while running via Rscript). To stream from a subprocess, prefix the command string with "<", e.g., "<grep foo file.bed". Any streamed data is assumed to be in BED format.

s

Force strandedness. If the feature occupies the antisense strand, the sequence will be reverse complemented.

pattern

Optional sequence pattern to count in each subsequence.

fullHeader

Use the full FASTA header as the names. By default, use just the first word.
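
To make the s argument concrete, the following is a minimal base-R sketch of what reverse complementing a sequence means. The helper revcomp is hypothetical and written only for illustration; it is not part of HelloRanges, and the compiled code is not claimed to work this way.

```r
# Hypothetical helper (not part of HelloRanges): reverse complement of
# plain character DNA sequences, using base R only.
revcomp <- function(x) {
  # Swap each base for its complement: A<->T, C<->G.
  comp <- chartr("ACGT", "TGCA", x)
  # Reverse the order of the characters in each sequence.
  vapply(strsplit(comp, "", fixed = TRUE),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

# With s = TRUE, a feature on the antisense strand would be summarized
# from the reverse complement of the extracted sequence.
revcomp("AACG")
```

Reverse complementing leaves the combined AT and GC percentages unchanged, but it swaps the individual base counts (A with T, C with G) and can change how often a pattern occurs.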

Details

As with all commands, there are three interfaces to the nuc command:

bedtools_nuc

Parses the bedtools command line and compiles it to the equivalent R code.

R_bedtools_nuc

Accepts R arguments corresponding to the command line arguments and compiles the equivalent R code.

do_bedtools_nuc

Evaluates the result of R_bedtools_nuc. Recommended only for demonstration and testing. It is best to integrate the compiled code into an R script, after studying it.

Computes AT/GC percentage and counts each type of base. Relies on Biostrings utilities like letterFrequency and alphabetFrequency. The counting of pattern occurrences uses vcountPattern.
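
As a rough illustration of the Biostrings utilities named above, the sketch below applies them directly to two toy sequences. The sequences and variable names are made up for this example, and the code is not the output of the HelloRanges compiler; it only shows the kind of summaries these functions return.

```r
library(Biostrings)

# Toy sequences standing in for the subsequences extracted from the FASTA file.
seqs <- DNAStringSet(c(r1 = "ATATGC", r2 = "GGCCAT"))

# Fraction of A-or-T and G-or-C letters per sequence
# (one row per sequence, columns "A|T" and "G|C").
at_gc <- letterFrequency(seqs, letters = c("AT", "GC"), as.prob = TRUE)

# Counts of A, C, G and T, plus an "other" column for anything else.
base_counts <- alphabetFrequency(seqs, baseOnly = TRUE)

# Number of occurrences of the pattern in each sequence.
n_pattern <- vcountPattern("ATA", seqs)

at_gc
base_counts
n_pattern
```

Each result has one row or element per input sequence, which is why the statistics can be combined into a single table alongside the query ranges.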

Value

A language object containing the compiled R code. Evaluating it yields a DataFrame of summary statistics, including the AT and GC percentages and the count of each type of base. If pattern is specified, the result also includes the number of occurrences of the pattern.
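
A minimal sketch of working with the returned language object follows. It assumes the HelloRanges package is installed and reuses the file names from the Examples section; evaluation is guarded because those files must be present in the working directory.

```r
library(HelloRanges)

# Compile the bedtools command line to R code. This returns the code
# itself and does not evaluate it.
code <- bedtools_nuc("--fi test.fasta -bed a.bed -pattern ATA")
print(code)

# Evaluate the compiled code only if the input files are available.
# The file names are the ones used in the Examples section.
if (file.exists("test.fasta") && file.exists("a.bed")) {
  ans <- eval(code)
  print(ans)
}
```

Printing the code before evaluating it matches the advice in Details: study the compiled code and integrate it into a script, and keep do_bedtools_nuc for demonstration and testing.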

Author(s)

Michael Lawrence

References

http://bedtools.readthedocs.io/en/latest/content/tools/nuc.html

See Also

letterFrequency for summarizing sequences, matchPattern for pattern matching.

Examples

## Not run: 
setwd(system.file("unitTests", "data", "nuc", package="HelloRanges"))

## End(Not run)
    ## default behavior, note the two dashes in '--fi'
    bedtools_nuc("--fi test.fasta -bed a.bed")
    ## with pattern counting
    bedtools_nuc("--fi test.fasta -bed a.bed -pattern ATA")

lawremi/HelloRanges documentation built on Oct. 29, 2023, 4:08 p.m.