homopolymerFinder: Find homopolymers


Description

Find homopolymer runs in a set of sequences.

Usage

homopolymerFinder(seq)

Arguments

seq

A DNAStringSet object.

Details

This function will identify homopolymers in a given set of sequences, where a homopolymer is defined as a consecutive run of the same nucleotide. It is useful for investigating the homopolymer frequency in unknown sequences such as UMIs. If the sequence is known, it is often more informative to use homopolymerMatcher instead.

Gapped sequences are supported - gaps will be ignored when considering homopolymer runs and computing coordinates. However, ambiguous bases are not given special treatment, and will be handled like any other IUPAC character.
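The run-finding idea described above can be sketched in base R with rle(). This is only an illustration of the concept, not the sarlacc implementation: the helper name find_runs, its min_len argument (assumed here to default to 2, so that single bases are not reported) and the data.frame return type are all hypothetical. Gaps are assumed to be encoded as "-" and are dropped before coordinates are computed, mirroring the behaviour described in this section.

```r
# Hypothetical helper (not part of sarlacc): locate homopolymer runs
# in a single character string using run-length encoding.
find_runs <- function(s, min_len = 2L) {
  chars <- strsplit(s, "")[[1]]
  chars <- chars[chars != "-"]      # assumption: "-" marks a gap; drop it
  r <- rle(chars)                   # consecutive runs of identical characters
  ends <- cumsum(r$lengths)         # 1-based end position of each run
  starts <- ends - r$lengths + 1L   # 1-based start position of each run
  keep <- r$lengths >= min_len      # assumption: report runs of min_len or more
  data.frame(start = starts[keep], end = ends[keep],
             width = r$lengths[keep], base = r$values[keep],
             stringsAsFactors = FALSE)
}

# The gap between the G's is ignored, so "GG-G" counts as one run of 3.
runs <- find_runs("AAACGG-GT")
print(runs)
```

Here the A run spans positions 1 to 3 and the G run spans positions 5 to 7 in gap-free coordinates; the single C and T are skipped because of the assumed minimum length.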

Value

An IRangesList object where each entry corresponds to a sequence in seq. Each IRanges specifies the coordinates and lengths of the homopolymer runs in the corresponding sequence, along with the base being repeated in each run.

Author(s)

Aaron Lun, with contributions from Cheuk-Ting Law

See Also

homopolymerMatcher

Examples

library(Biostrings)  # provides DNAStringSet
library(sarlacc)
seq <- DNAStringSet(c("AAAAAGGGGGCCCCCCTTTTT", "AAAAGGGGGCCCCTTTTT"))
homopolymerFinder(seq)

MarioniLab/sarlacc documentation built on May 13, 2019, 12:51 p.m.