stringSim: Function to compute similarity scores between strings

stringSimR Documentation

Function to compute similarity scores between strings

Description

The function can be used to compute similarity scores between strings.

Usage

stringSim(x, y, global = TRUE, match = 1, mismatch = -1, gap = -1, minSim = 0)

Arguments

x

character vector, first string

y

character vector, second string

global

logical; global or local alignment

match

numeric, score for a match between symbols

mismatch

numeric, score for a mismatch between symbols

gap

numeric, penalty for inserting a gap

minSim

numeric, used as required minimum score in case of local alignments

Details

The function computes optimal alignment scores for global (Needleman-Wunsch) and local (Smith-Waterman) alignments with constant gap penalties.

Scoring and trace-back matrix are computed and saved in form of attributes "ScoringMatrix" and "TraceBackMatrix". The characters in the trace-back matrix reflect insertion of a gap in string y (d: deletion), match (m), mismatch (mm), and insertion of a gap in string x (i). In addition stop indicates that the minimum similarity score has been reached.

Value

stringSim returns an object of S3 class "stringSim" inherited from class "dist"; cf. dist.

Note

The function is mainly for teaching purposes.

For distances between strings and string alignments see also Bioconductor package Biostrings.

Author(s)

Matthias Kohl Matthias.Kohl@stamats.de

References

R. Merkl and S. Waack (2009). Bioinformatik Interaktiv. Wiley.

See Also

dist, stringDist

Examples

x <- "GACGGATTATG"
y <- "GATCGGAATAG"

## optimal global alignment score
d <- stringSim(x, y)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")

## optimal local alignment score
d <- stringSim(x, y, global = FALSE)
d
attr(d, "ScoringMatrix")
attr(d, "TraceBackMatrix")

MKmisc documentation built on Nov. 20, 2022, 1:05 a.m.