View source: R/circle_cutter.R

circle_cutter    R Documentation

Cut a replicated circular sequence to generate a linear sequence

Description

This function cuts a circular sequence at a character string motif to generate a single linear sequence. It is primarily intended for "read-through" sequences generated, for example, from assembly or long-read sequencing of circular genomes. Such sequences typically contain replicated regions that must be removed to produce a single, non-replicated linear sequence.

Usage

circle_cutter(query_seq, motif)

Arguments

query_seq

Character, the query sequence.

motif

Character, the sequence used to identify cut points in the circular chromosome and to remove replicated regions. See Details.

Details

The argument motif is used to identify the start and end positions at which to cut the circular chromosome and remove replicated regions. For example, if motif = 'AATTGGCC' and the sequence in question was:

AATTGGCC ACTATCTGCTAGCTAGCATAGCATCGATCAGCATGACGCGCAA AATTGGCC

The function will cut the sequence like so (marked with '|'):

| AATTGGCC ACTATCTGCTAGCTAGCATAGCATCGATCAGCATGACGCGCAA | AATTGGCC
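The cut illustrated above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper name cut_at_motif_sketch is hypothetical, and it assumes the cut runs from the first occurrence of the motif up to the base immediately before the second occurrence.

```r
# Hypothetical helper, not part of genomalicious: cut a read-through
# sequence between the first two occurrences of a motif.
cut_at_motif_sketch <- function(query_seq, motif) {
  # Locate every non-overlapping, literal occurrence of the motif
  hits <- gregexpr(motif, query_seq, fixed = TRUE)[[1]]

  # gregexpr() returns -1 when there is no match
  if (hits[1] == -1 || length(hits) < 2) {
    stop("The motif must occur at least twice in the query sequence.")
  }

  # Assumption: keep from the start of the first hit up to the base
  # immediately before the second hit
  substr(query_seq, hits[1], hits[2] - 1)
}

x <- "AATTGGCCACTATCTGCTAGCTAGCATAGCATCGATCAGCATGACGCGCAAAATTGGCC"
linear_seq <- cut_at_motif_sketch(x, "AATTGGCC")
print(linear_seq)
```

Under these assumptions, the result keeps the first copy of the motif and the unique region that follows it, and drops the trailing replicated copy.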

Value

Returns the cut linear sequence as a character string.

Examples

x <- 'AATTGGCCACTATCTGCTAGCTAGCATAGCATCGATCAGCATGACGCGCAAAATTGGCC'

# Find a character motif that is repeated in the sequence
motif_hits <- circularity_test(x, word_size = 8)
motif_seq <- substr(x, motif_hits[1,1], motif_hits[1,2])

circle_cutter(query_seq = x, motif = motif_seq)


j-a-thia/genomalicious documentation built on Oct. 19, 2024, 7:51 p.m.