# SequenceMatcher: Character string sequence matching

In fuzzywuzzyR: Fuzzy String Matching

## Description

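Character string sequence matching: compares two character strings and reports their similarity ratio, their matching blocks, and the edit operations that turn one into the other.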

## Usage

```
# init <- SequenceMatcher$new(string1 = NULL, string2 = NULL)
```

## Arguments

| Argument | Description |
|---|---|
| `string1` | a character string. |
| `string2` | a character string. |

## Format

An object of class `R6ClassGenerator` of length 24.

## Details

The ratio method returns a measure of the sequences' similarity as a float in the range [0, 1]. Where T is the total number of elements in both sequences, and M is the number of matches, this is 2.0*M / T. Note that this is 1.0 if the sequences are identical, and 0.0 if they have nothing in common. This is expensive to compute if get_matching_blocks() or get_opcodes() hasn't already been called, in which case you may want to try quick_ratio() or real_quick_ratio() first to get an upper bound.
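As a small worked instance of the formula, the sketch below evaluates 2.0*M / T in plain R. The helper name `ratio_from_matches` is hypothetical and not part of fuzzywuzzyR, and the match count M is supplied by hand rather than obtained from the matcher.

```r
# Hypothetical helper (not part of fuzzywuzzyR): evaluates 2.0 * M / T.
ratio_from_matches <- function(m, len_a, len_b) {
  total <- len_a + len_b        # T: number of elements in both sequences
  if (total == 0) return(1)     # assumption: two empty sequences count as identical
  2 * m / total
}

a <- "abcd"
b <- "bcde"

# The only common block is "bcd", so M = 3 and T = 4 + 4 = 8.
ratio_from_matches(3, nchar(a), nchar(b))   # 0.75
```

With M equal to the full length of both strings the helper gives 1, and with M = 0 it gives 0, matching the two boundary cases described above.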

The quick_ratio method returns an upper bound on ratio() relatively quickly.

The real_quick_ratio method returns an upper bound on ratio() very quickly.
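To illustrate why cheap upper bounds on ratio() exist, here is a plain-R sketch of two such bounds. It follows the approach taken by Python's difflib, which this class is assumed to mirror; the function names are hypothetical and the package's own methods may be implemented differently. The first bound uses only the string lengths (M can never exceed the shorter length); the second counts characters shared by both strings regardless of order.

```r
# Hypothetical helpers, not part of fuzzywuzzyR.

# Bound from lengths alone: M <= min(length of a, length of b).
real_quick_ratio_sketch <- function(a, b) {
  total <- nchar(a) + nchar(b)
  if (total == 0) return(1)     # assumption: two empty strings count as identical
  2 * min(nchar(a), nchar(b)) / total
}

# Bound from character counts: M <= number of characters the two
# strings have in common when order is ignored.
quick_ratio_sketch <- function(a, b) {
  total <- nchar(a) + nchar(b)
  if (total == 0) return(1)
  ca <- table(strsplit(a, "")[[1]])   # per-character counts in a
  cb <- table(strsplit(b, "")[[1]])   # per-character counts in b
  shared <- intersect(names(ca), names(cb))
  m_bound <- sum(pmin(as.vector(ca[shared]), as.vector(cb[shared])))
  2 * m_bound / total
}

real_quick_ratio_sketch("abcd", "bcde")   # 1
quick_ratio_sketch("abcd", "bcde")        # 0.75
```

Both values are at least as large as the true ratio for this pair (0.75), and the length-only bound is the looser of the two.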

The get_matching_blocks method returns a list of triples describing matching subsequences, where a and b denote string1 and string2. Each triple is of the form [i, j, n], and means that a[i:i+n] == b[j:j+n]; the slice notation is zero-based and excludes the end index. The triples are monotonically increasing in i and j. The last triple is a dummy, and has the value [length of a, length of b, 0]. It is the only triple with n == 0. If [i, j, n] and [i', j', n'] are adjacent triples in the list, and the second is not the last triple in the list, then i+n != i' or j+n != j'; in other words, adjacent triples always describe non-adjacent equal blocks.
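The sketch below shows how a triple [i, j, n] maps onto R's one-based `substring()`. The triples are written by hand for the pair "abxcd" / "abcd" (they are the blocks Python's difflib documents for this pair); the exact structure returned by get_matching_blocks() in R is not assumed here, and `slice0` is a hypothetical helper.

```r
a <- "abxcd"
b <- "abcd"

# Triples [i, j, n], hard-coded for illustration (zero-based offsets).
# The last row is the dummy triple [length of a, length of b, 0].
blocks <- data.frame(i = c(0, 3, 5), j = c(0, 2, 4), n = c(2, 2, 0))

# Hypothetical helper: s[start:start+len] in zero-based, end-exclusive
# notation corresponds to substring(s, start + 1, start + len) in R.
slice0 <- function(s, start, len) substring(s, start + 1, start + len)

block_a <- slice0(a, blocks$i, blocks$n)
block_b <- slice0(b, blocks$j, blocks$n)
stopifnot(identical(block_a, block_b))   # every triple marks an equal block
block_a                                  # "ab" "cd" ""

# M is the total matched length, which gives the ratio 2 * M / T.
m <- sum(blocks$n)
2 * m / (nchar(a) + nchar(b))            # 8 / 9
```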

The get_opcodes method returns a list of 5-tuples describing how to turn a into b. Each tuple is of the form [tag, i1, i2, j1, j2]. The first tuple has i1 == j1 == 0, and remaining tuples have i1 equal to the i2 from the preceding tuple, and, likewise, j1 equal to the previous j2. The tag values are strings, with these meanings:

| Value | Meaning |
|---|---|
| 'replace' | a[i1:i2] should be replaced by b[j1:j2]. |
| 'delete' | a[i1:i2] should be deleted. Note that j1 == j2 in this case. |
| 'insert' | b[j1:j2] should be inserted at a[i1:i1]. Note that i1 == i2 in this case. |
| 'equal' | a[i1:i2] == b[j1:j2] (the sub-sequences are equal). |
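To make the tag semantics concrete, the following sketch replays a hand-written opcode list to rebuild b from a. The opcodes are hard-coded for the pair "qabxcd" / "abycdf" (the example used in Python's difflib documentation); how get_opcodes() structures its return value in R is not assumed, and `apply_opcodes` is a hypothetical helper.

```r
a <- "qabxcd"
b <- "abycdf"

# Opcodes [tag, i1, i2, j1, j2], hard-coded for illustration (zero-based).
ops <- data.frame(
  tag = c("delete", "equal", "replace", "equal", "insert"),
  i1  = c(0, 1, 3, 4, 6),
  i2  = c(1, 3, 4, 6, 6),
  j1  = c(0, 0, 2, 3, 5),
  j2  = c(0, 2, 3, 5, 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rebuild b by walking the opcodes in order.
apply_opcodes <- function(a, b, ops) {
  pieces <- character(nrow(ops))
  for (k in seq_len(nrow(ops))) {
    op <- ops[k, ]
    pieces[k] <- switch(op$tag,
      equal   = substr(a, op$i1 + 1, op$i2),   # keep a[i1:i2]
      replace = ,                              # falls through to insert
      insert  = substr(b, op$j1 + 1, op$j2),   # take b[j1:j2]
      delete  = "",                            # drop a[i1:i2]
      stop("unknown tag: ", op$tag)
    )
  }
  paste(pieces, collapse = "")
}

apply_opcodes(a, b, ops)   # "abycdf"
```

Each row starts where the previous one ended (i1 equals the preceding i2, j1 the preceding j2), which is why concatenating the pieces in order reproduces b.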

## Methods

`SequenceMatcher$new(string1 = NULL, string2 = NULL)`
`--------------`
`ratio()`
`--------------`
`quick_ratio()`
`--------------`
`real_quick_ratio()`
`--------------`
`get_matching_blocks()`
`--------------`
`get_opcodes()`

## References

https://www.npmjs.com/package/difflib, http://stackoverflow.com/questions/10383044/fuzzy-string-comparison

## Examples

```
# Run only when the required Python dependencies are available
if (check_availability()) {

  library(fuzzywuzzyR)

  s1 = ' It was a dark and stormy night. I was all alone sitting on a red chair.'

  s2 = ' It was a murky and stormy night. I was all alone sitting on a crimson chair.'

  init = SequenceMatcher$new(string1 = s1, string2 = s2)

  # similarity measure and its two cheaper upper bounds
  init$ratio()

  init$quick_ratio()

  init$real_quick_ratio()

  # matching subsequences and the edit operations turning s1 into s2
  init$get_matching_blocks()

  init$get_opcodes()
}
```

fuzzywuzzyR documentation built on May 2, 2019, 8:53 a.m.