getRMSfragments: Retrieving RMS fragments from genomes

View source: R/fragments.R

getRMSfragments    R Documentation

Retrieving RMS fragments from genomes

Description

Retrieves a set of fragments from a genome, given restriction enzyme cutting motifs.

Usage

getRMSfragments(
  genome,
  genome.id,
  min.length = 30,
  max.length = 500,
  left = "G|AATTC",
  right = "T|TAA",
  verbose = TRUE
)

Arguments

genome

A table (fasta object) with genome data.

genome.id

Unique identifier for the genome; it is added to the FASTA headers (text).

min.length

Minimum fragment length (integer).

max.length

Maximum fragment length (integer).

left

The first (long) restriction enzyme cut motif (text).

right

The second (short) restriction enzyme cut motif (text).

verbose

Turn on/off output text during processing (logical).

Details

This function finds and retrieves all RMS fragments from a genome. The genome argument is a tibble with sequence data in FASTA format, see readFasta. In addition, a genome.id is required, which is a text unique to each genome to be analyzed. This genome.id is added to the FASTA headers of the output: every header starts with the token <genome.id>_RMSx, where x is an integer (1, 2, ...), and this first token is followed by a blank. This ensures that all first tokens are unique and that the genome of origin is indicated for every fragment.
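The header convention can be illustrated with a small base-R sketch. It is not the package's code: the fragment count and the text placed after the blank are placeholders invented for the illustration, and only the <genome.id>_RMSx pattern comes from the description above.

```r
# Illustration only: mimic headers whose first token is <genome.id>_RMSx.
genome.id <- "genome_1"
n.fragments <- 3                      # hypothetical number of fragments

# First tokens: genome.id, the literal "_RMS", then a running integer
first.tokens <- paste0(genome.id, "_RMS", seq_len(n.fragments))

# The first token is followed by a blank; the rest is placeholder text here
headers <- paste(first.tokens, "placeholder description")

# Recover the first token, then the genome of origin, from each header
tokens <- sub(" .*$", "", headers)
origin <- sub("_RMS[0-9]+$", "", tokens)
```

Because the integer x differs for every fragment within a genome and genome.id differs between genomes, the first tokens remain unique when fragments from several genomes are pooled.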

The default restriction enzymes are EcoRI and MseI, with cutting motifs "G|AATTC" and "T|TAA", respectively. The vertical bar indicates the cut site in the motif.
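As a rough illustration of the motif notation, the base-R sketch below splits a motif at the vertical bar and locates the cut sites in a toy sequence. The helper parse_motif, the toy sequence, and the way the fragment is delimited are assumptions made for this example; the actual internals of getRMSfragments may differ, for instance in how overlapping sites or the reverse strand are handled.

```r
# Hypothetical helper: split a motif like "G|AATTC" into the recognition
# site and the number of bases to the left of the cut.
parse_motif <- function(motif) {
  bar.pos <- as.integer(regexpr("|", motif, fixed = TRUE))
  list(site = sub("|", "", motif, fixed = TRUE),
       offset = bar.pos - 1L)
}

left <- parse_motif("G|AATTC")    # site "GAATTC", cut after 1 base
right <- parse_motif("T|TAA")     # site "TTAA", cut after 1 base

# Toy sequence (made up) with one site for each enzyme
toy <- "CCGAATTCAAAATTAAGG"

# Start positions of the recognition sites in the toy sequence
left.start <- as.integer(gregexpr(left$site, toy, fixed = TRUE)[[1]])
right.start <- as.integer(gregexpr(right$site, toy, fixed = TRUE)[[1]])

# Position of the last base before each cut
left.cut <- left.start + left$offset - 1L
right.cut <- right.start + right$offset - 1L

# Sequence between the left cut and the right cut
fragment <- substr(toy, left.cut + 1L, right.cut)
```

In this toy case the fragment runs from the base just after the EcoRI cut to the base just before the MseI cut. A real genome would yield many such candidates, which min.length and max.length would then filter by length.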

Value

A tibble with all fragment sequences (5'-3') in FASTA format.

Author(s)

Lars Snipen.

See Also

RMSobject.

Examples

# A small genome in this package
xpth <- file.path(path.package("microrms"),"extdata")
genome.file <- file.path(xpth,"GCF_000009605.1_ASM960v1_genomic.fna")

# Read genome, find fragments
gnm <- readFasta(genome.file)
frg <- getRMSfragments(gnm, "genome_1")

# Write to file with writeFasta(frg, out.file = <filename>)


larssnip/microRMS documentation built on July 19, 2023, 1:06 a.m.