homopolymerFinder: Find homopolymers
In MarioniLab/sarlacc: Pipeline for Oxford Nanopore RNA-Seq Data Analysis


Find homopolymer runs in a set of sequences.

homopolymerFinder(seq)

seq

A DNAStringSet object.

This function will identify homopolymers in a given set of sequences, where a homopolymer is defined as a consecutive run of the same nucleotide. It is useful for investigating the homopolymer frequency in unknown sequences such as UMIs. If the sequence is known, it is often more informative to use homopolymerMatcher instead.

Gapped sequences are supported: gaps are ignored when identifying homopolymer runs and computing coordinates. However, ambiguous bases are not given special treatment and are handled like any other IUPAC character.
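
The behaviour described above can be illustrated with a minimal base R sketch. This is not the sarlacc implementation: find_runs is a hypothetical helper that works on a single character string rather than a DNAStringSet, and the minimum run length of 2 (min.len) is an assumption made for illustration. It drops gap characters first, so the reported coordinates refer to the ungapped sequence, and it treats an ambiguous base such as N like any other character.

```r
# Hypothetical helper, not part of sarlacc: finds runs of identical
# characters in one string, ignoring "-" gap characters.
find_runs <- function(x, min.len = 2L) {
    chars <- strsplit(x, "")[[1]]
    chars <- chars[chars != "-"]        # drop gaps before counting positions
    runs <- rle(chars)                  # run-length encoding of the bases
    ends <- cumsum(runs$lengths)
    starts <- ends - runs$lengths + 1L
    keep <- runs$lengths >= min.len     # assumed threshold: runs of 2 or more
    data.frame(start = starts[keep], end = ends[keep],
               width = runs$lengths[keep], base = runs$values[keep],
               stringsAsFactors = FALSE)
}

# The gap between the two Cs is ignored, so "C-C" counts as a run of 2.
res <- find_runs("AAC-CGTTTN")
print(res)
```

For this input the sketch reports three runs: A at positions 1 to 2, C at positions 3 to 4, and T at positions 6 to 8, all in ungapped coordinates.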

An IRangesList object where each entry corresponds to a sequence in seq. Each IRanges specifies the coordinates and lengths of the homopolymer runs in the corresponding sequence, along with the base being repeated in each run.

Aaron Lun, with contributions from Cheuk-Ting Law

homopolymerMatcher