get_sequence: Get sequence from GRanges

View source: R/generics.R

Description

A light wrapper around Biostrings::getSeq that returns a named DNAStringSet for the input genomic coordinates.

Usage

get_sequence(regions, genome, score_column, ...)

Arguments

regions

A GRanges or GRangesList object. A data.frame is also accepted as long as it can be coerced to a GRanges object, as is a string in the format "chr:start-end". NOTE: use 1-based closed intervals, not BED-format 0-based half-open intervals.

genome

An object of any type accepted by Biostrings::getSeq (see 'showMethods(Biostrings::getSeq)'). Commonly a BSgenome object or a fasta file. Used to look up the sequences for 'regions'.

score_column

Optional name of a column (in mcols() of 'regions') containing a score that is added to the fasta header of each entry. Used when running runAme() in partitioning mode. (default: 'NULL')

...

additional arguments passed to Biostrings::getSeq.
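
Because 'regions' expects 1-based closed intervals, coordinates taken from a BED file need their start shifted by one before they are written as a "chr:start-end" string. The following is a minimal sketch in base R; `bed_to_region_string` is a hypothetical helper for illustration and is not part of memes.

```r
# Hypothetical helper (not part of memes): convert 0-based, half-open BED
# coordinates into a 1-based, closed "chr:start-end" string.
bed_to_region_string <- function(chrom, bed_start, bed_end) {
  # Only the start shifts by one; the BED end already equals the
  # 1-based closed end. Integers avoid scientific notation in the output.
  sprintf("%s:%d-%d", chrom, as.integer(bed_start) + 1L, as.integer(bed_end))
}

# The BED interval chr2L 99 200 covers the same bases as chr2L:100-200
bed_to_region_string("chr2L", 99, 200)
```

The resulting string can then be passed as 'regions', in the same way as the character-string example in the Examples section.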

Value

'Biostrings::DNAStringSet' object with names corresponding to genomic coordinates. If input is a list object, output will be a 'Biostrings::BStringSetList' with list names corresponding to input list names.
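
As an illustration of the list behaviour described above, the sketch below passes a named GRangesList and inspects the names of the result. The group names and coordinates are made up for the example, and it assumes the GenomicRanges, memes, and BSgenome.Dmelanogaster.UCSC.dm6 packages are installed.

```r
library(GenomicRanges)

drosophila.genome <- BSgenome.Dmelanogaster.UCSC.dm6::BSgenome.Dmelanogaster.UCSC.dm6

# Illustrative regions only; the names "groupA" and "groupB" are arbitrary
region_list <- GRangesList(
  groupA = GRanges("chr2L", IRanges(start = 100, end = 200)),
  groupB = GRanges("chr2L", IRanges(start = 500, end = 600))
)

sequence_list <- memes::get_sequence(region_list, drosophila.genome)

# List names are expected to match the input list names
names(sequence_list)
```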

Examples

# using character string as coordinates
# using BSgenome object for genome
drosophila.genome <- BSgenome.Dmelanogaster.UCSC.dm6::BSgenome.Dmelanogaster.UCSC.dm6
get_sequence("chr2L:100-200", drosophila.genome)

# using GRanges object for coordinates
data(example_peaks, package = "memes")
get_sequence(example_peaks, drosophila.genome)
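
The 'score_column' argument can be exercised in the same way. This is a sketch under stated assumptions: the `peak_score` column name and its values are invented for illustration, and the exact layout of the resulting fasta header is not shown here because it is determined by memes.

```r
library(GenomicRanges)

drosophila.genome <- BSgenome.Dmelanogaster.UCSC.dm6::BSgenome.Dmelanogaster.UCSC.dm6

# Two illustrative regions with a made-up metadata column, `peak_score`
scored_regions <- GRanges(
  seqnames = "chr2L",
  ranges = IRanges(start = c(100, 500), end = c(200, 600)),
  peak_score = c(10.5, 3.2)
)

# Ask get_sequence() to add the score to each entry's fasta header
scored_sequences <- memes::get_sequence(
  scored_regions,
  drosophila.genome,
  score_column = "peak_score"
)

# Inspect the headers to see how the score was attached
names(scored_sequences)
```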

snystrom/memes documentation built on Oct. 12, 2024, 2:42 a.m.