debar | R Documentation |
debar is an R package designed for the identification and removal of insertion and deletion errors from COI-5P DNA barcode data.
debar is built around the DNAseq object, which takes a COI-5P DNA barcode sequence and, optionally, its associated name and PHRED quality information as input. The package uses a nucleotide profile hidden Markov model (PHMM) to identify the COI-5P region of an input sequence and to identify and correct indel errors within that region. By default, indel corrections are applied conservatively: 7 base pairs on either side of each correction are censored, which masks most cases where the exact base pair responsible for the indel was not located precisely. Numerous filtering and double-check steps are applied, and the package includes functions for input and output in either fasta or fastq format.
The denoise pipeline is heavily parameterized so that users can tailor the denoising to their own data structure and goals.
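The censorship step described above can be illustrated with a small base R sketch. This is not debar's own code: `censor_positions` is a hypothetical helper, and the use of "N" as the masking character and the masking of the corrected position itself are assumptions made for illustration.

```r
# Hypothetical helper (not part of debar): mask `width` bases on either
# side of each correction position, plus the position itself, with "N".
censor_positions <- function(seq, positions, width = 7) {
  bases <- strsplit(seq, "")[[1]]
  for (pos in positions) {
    lo <- max(1, pos - width)               # clamp at the start of the read
    hi <- min(length(bases), pos + width)   # clamp at the end of the read
    bases[lo:hi] <- "N"
  }
  paste(bases, collapse = "")
}

# Toy 40 bp sequence with an assumed indel correction at position 20.
read <- paste(rep("ACGT", 10), collapse = "")
masked <- censor_positions(read, positions = 20)
print(masked)
```

With a width of 7, a single correction masks a 15 bp window, so a wider setting trades retained sequence for a lower chance of leaving a misplaced correction unmasked.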
denoise_file
Run the denoise pipeline for each sequence in a specified input file.
denoise
Run the denoise pipeline for a specified sequence.
read_fasta
Read data from a fasta file to a data frame.
read_fastq
Read data from a fastq file to a data frame.
write_fasta
Write a denoised sequence to the specified fasta file.
write_fastq
Write a denoised sequence and associated quality information to the specified fastq file.
DNAseq
Build a DNAseq class object.
frame
Match a sequence against the COI-5P PHMM using the
Viterbi algorithm to establish the reading frame.
Optionally reject the sequence based on the quality of its match to the PHMM.
adjust
Use the PHMM path output corresponding to the sequence to
adjust the DNA sequence and remove indels.
Optional censorship of sequence around the corrections.
aa_check
Translate the adjusted sequence to amino acids and check it for stop codons.
outseq
Construct the output data for the given sequence.
Optionally include or exclude sequence data from
outside of the COI-5P region (the part of the sequence that was not denoised).
example_nt_string
An example COI-5P sequence with no errors.
example_nt_string_errors
An example COI-5P sequence with two indel errors.
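As a rough sketch of how the functions and example data listed above fit together, the individual pipeline steps can be chained on the bundled error-containing sequence. The call order follows the function descriptions; the `name` argument and the use of default settings for every step are assumptions here, so check each function's help page before relying on them. The block is guarded so that it does nothing if debar is not installed.

```r
# Step-by-step denoising sketch; runs only if debar is available.
if (requireNamespace("debar", quietly = TRUE)) {
  library(debar)

  # Build the DNAseq object from the bundled example with two indel errors.
  # The `name` argument is assumed; the label itself is arbitrary.
  ex <- DNAseq(example_nt_string_errors, name = "example_read")

  ex <- frame(ex)     # align to the COI-5P PHMM and set the reading frame
  ex <- adjust(ex)    # correct indels using the PHMM path
  ex <- aa_check(ex)  # translate and check for stop codons
  ex <- outseq(ex)    # assemble the output data

  print(class(ex))
}
```

The denoise function is described as running this pipeline for a single sequence, so the explicit steps are mainly useful when intermediate results need to be inspected.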
Cameron M. Nugent