Introduction to baseq

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(baseq)

Introduction

baseq is a basic sequence processing tool for biological data. It provides simple and efficient functions for common tasks in molecular biology, such as cleaning sequences, translating DNA/RNA to protein, and calculating GC content.

Sequence Cleaning

clean_seq() removes non-standard characters from a DNA or RNA sequence. It detects whether the input is DNA or RNA, so the same call works for both.

dna_seq <- "ATGCnNryMK"
clean_seq(dna_seq)

rna_seq <- "AUGGCuuNnRYMK"
clean_seq(rna_seq)
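
To make the idea of cleaning concrete, here is a minimal base R sketch. It is not baseq's implementation: the helper name clean_sketch(), the rule "RNA if the sequence contains U and no T", and the choice to keep only A, C, G and T (or U) are all assumptions made for illustration. The actual output of clean_seq() may differ, for example in how it treats N or lower-case input.

```r
# Hypothetical helper, not part of baseq: keep only the four standard bases.
clean_sketch <- function(seq) {
  s <- toupper(seq)
  # Assumed detection rule: treat the input as RNA if it has U and no T.
  is_rna <- grepl("U", s, fixed = TRUE) && !grepl("T", s, fixed = TRUE)
  # Drop every character outside the chosen alphabet.
  pattern <- if (is_rna) "[^ACGU]" else "[^ACGT]"
  gsub(pattern, "", s)
}

clean_sketch("ATGCnNryMK")     # "ATGC"
clean_sketch("AUGGCuuNnRYMK")  # "AUGGCUU"
```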

Translation

baseq can translate DNA and RNA sequences into protein sequences in all six reading frames. The result of dna_to_protein() is indexed by frame name, such as "Frame F1".

dna_seq <- "ATCGAGCTAGCTAGCTAGCTAGCT"
proteins <- dna_to_protein(dna_seq)
proteins[["Frame F1"]]
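
The six reading frames are the three possible codon offsets on the given strand plus the same three offsets on its reverse complement. The base R sketch below only splits a sequence into codons for each frame and does not translate them. The helper names and the frame labels F1 to F3 and R1 to R3 are hypothetical; they are not taken from baseq, whose internals may work differently.

```r
# Hypothetical helpers, not part of baseq.

# Reverse complement of a DNA string (standard bases only).
reverse_complement <- function(dna) {
  comp <- chartr("ACGT", "TGCA", toupper(dna))
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

# Split a sequence into complete codons, skipping `offset` leading bases.
split_codons <- function(dna, offset) {
  s <- substring(dna, offset + 1)
  n <- nchar(s) %/% 3
  if (n == 0) return(character(0))
  starts <- seq(1, by = 3, length.out = n)
  substring(s, starts, starts + 2)
}

# Codons for three forward and three reverse-complement frames.
six_frames <- function(dna) {
  fwd <- toupper(dna)
  rev_strand <- reverse_complement(fwd)
  frames <- list()
  for (i in 0:2) {
    frames[[paste0("F", i + 1)]] <- split_codons(fwd, i)
    frames[[paste0("R", i + 1)]] <- split_codons(rev_strand, i)
  }
  frames
}

frames <- six_frames("ATCGAGCTAGCTAGCTAGCTAGCT")
frames$F1  # "ATC" "GAG" "CTA" "GCT" "AGC" "TAG" "CTA" "GCT"
```

Translating each codon vector into amino acids would additionally need a codon table, which is omitted here.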

GC Content

gc_content() calculates the GC content of a DNA sequence, that is, the share of its bases that are G or C.

dna_seq <- "ATGCATGC"
gc_content(dna_seq)
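
For reference, the same quantity can be computed by hand in base R. The helper gc_fraction() below is hypothetical and returns a fraction between 0 and 1; whether gc_content() reports a fraction or a percentage is not stated in this vignette.

```r
# Hypothetical helper, not part of baseq: fraction of bases that are G or C.
gc_fraction <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

gc_fraction("ATGCATGC")  # 0.5 (4 of 8 bases are G or C)
```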

Reading and Writing Files

baseq provides universal functions to read and write FASTA and FASTQ files.

# Read a FASTA file into a data frame
# df <- read_seq("path/to/file.fasta")

# Write a data frame to a FASTA file
# write_seq(df, "output.fasta")
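
To illustrate what reading a FASTA file into a data frame involves, here is a small base R sketch. It is not how read_seq() is implemented, and the function name read_fasta_sketch() and the column names header and sequence are assumptions; the data frame returned by read_seq() may be laid out differently.

```r
# Hypothetical helper, not part of baseq: parse a FASTA file with base R.
read_fasta_sketch <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]          # drop empty lines
  is_header <- startsWith(lines, ">")    # header lines start with ">"
  record <- cumsum(is_header)            # record index for every line
  headers <- sub("^>", "", lines[is_header])
  # Join the sequence lines that belong to each record.
  seqs <- vapply(seq_along(headers), function(i) {
    paste(lines[!is_header & record == i], collapse = "")
  }, character(1))
  data.frame(header = headers, sequence = seqs, stringsAsFactors = FALSE)
}

# Try it on a small temporary file.
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">seq1 example", "ATGC", "ATGC", ">seq2", "GGCC"), fasta_path)
df_sketch <- read_fasta_sketch(fasta_path)
df_sketch$sequence  # "ATGCATGC" "GGCC"
```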

For more details, see the documentation for individual functions.




baseq documentation built on March 12, 2026, 1:07 a.m.