levenR
levenR provides a few functions for simple Levenshtein alignment and distance calculation, with support for multiple threads, ends-free alignment, and reduced homopolymer gap costs.
To install directly from GitHub, use the devtools package and run:
devtools::install_github("sherrillmix/levenR")
An example of calculating the Levenshtein distance between several strings to make a distance matrix:
library(levenR)
seqs<-c('AAATA','AATA','AAAT','ACCTA')
leven(seqs)
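For comparison without levenR installed, base R's `adist` (from the utils package) also computes pairwise Levenshtein distances. The following is a minimal sketch for cross-checking only. It is not levenR's implementation, and it assumes unit costs for insertions, deletions and substitutions, which may or may not match the defaults of `leven`.

```r
# Base R cross-check (not part of levenR): utils::adist computes
# generalized Levenshtein distances between all pairs of strings.
seqs <- c('AAATA', 'AATA', 'AAAT', 'ACCTA')
d <- adist(seqs)
# Label rows and columns with the sequences for readability
dimnames(d) <- list(seqs, seqs)
d
```

The result is a symmetric matrix with zeros on the diagonal, one row and column per input string.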
An example of calculating the Levenshtein distance between several strings against a longer reference sequence:
library(levenR)
seqs<-c('AAATA','AATA','AAAT','ACCTA')
ref<-'CCAAATACCGACC'
leven(seqs,ref,substring2=TRUE)
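To illustrate what matching a short read against a longer reference involves, here is a small pure-R sketch of an ends-free edit distance: the read may start and end anywhere inside the reference at no cost. The helper `substringDist` is hypothetical and written only for illustration. It is not levenR's code, and it is an assumption that `substring2=TRUE` behaves along these lines.

```r
# Hypothetical helper (not levenR's API): edit distance of `read` against the
# best-matching substring of `ref`, with free gaps at both ends of `ref`.
substringDist <- function(read, ref) {
  r <- strsplit(read, '')[[1]]
  s <- strsplit(ref, '')[[1]]
  # prev[j + 1]: best cost for the read prefix so far, ending at ref position j
  prev <- rep(0, length(s) + 1)  # zero first row: the match may start anywhere
  for (i in seq_along(r)) {
    cur <- numeric(length(s) + 1)
    cur[1] <- i  # read prefix aligned against nothing
    for (j in seq_along(s)) {
      cur[j + 1] <- min(
        prev[j] + (r[i] != s[j]),  # match or substitution
        prev[j + 1] + 1,           # extra base in the read
        cur[j] + 1                 # extra base in the reference
      )
    }
    prev <- cur
  }
  min(prev)  # the match may end anywhere in ref
}

seqs <- c('AAATA', 'AATA', 'AAAT', 'ACCTA')
ref <- 'CCAAATACCGACC'
sapply(seqs, substringDist, ref = ref)
```

The only differences from the standard Levenshtein recurrence are the zero-initialized first row and taking the minimum over the last row instead of its final cell.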
An example of calculating the Levenshtein distance between several strings against two longer reference sequences and determining the best match for each read:
library(levenR)
seqs<-c('AAATA','AATA','AAAT','ACCTA')
refs<-c('CCATAATACCGACC','GGAAATACCTA')
dist<-leven(seqs,refs,substring2=TRUE)
apply(dist,1,which.min)
An example of calculating the Levenshtein distance between several strings to make a distance matrix while ignoring indels in long homopolymers (an error type common in 454 and IonTorrent sequencing):
library(levenR)
seqs<-c('AAAAATA','AAATTTTTA','AAAAATTTA')
leven(seqs,homoLimit=3)
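As a rough way to see why homopolymer handling matters, the sketch below caps every homopolymer run at three bases before comparing the strings with base R's `adist`. This is a deliberately simplified stand-in technique, not how levenR implements `homoLimit`, and the cap of 3 is chosen only to mirror the example above.

```r
# Simplified illustration (not levenR's method): cap homopolymer runs at
# 3 bases, then compare with ordinary Levenshtein distance via utils::adist.
seqs <- c('AAAAATA', 'AAATTTTTA', 'AAAAATTTA')
# (.)\1{2,} matches any run of 3 or more identical characters;
# the replacement keeps exactly three copies of that character.
capped <- gsub('(.)\\1{2,}', '\\1\\1\\1', seqs, perl = TRUE)
capped
adist(seqs)    # distances on the raw strings
adist(capped)  # distances after capping long runs
```

After capping, strings that differ only in the length of a long run become identical, so their distance drops to zero.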
An example of calculating the Levenshtein distance between several strings using multiple threads:
library(levenR)
seqs<-replicate(50,paste(sample(letters,100,TRUE),collapse=''))
system.time(leven(seqs))
system.time(leven(seqs,nThreads=4))
An example of aligning strings against a longer reference:
library(levenR)
seqs<-c('AAATA','AATA','AAAT','ACCTA')
ref<-'CCAAATACCGACC'
levenAlign(seqs,ref,substring2=TRUE)