Character string sequence matching
# init <- SequenceMatcher$new(string1 = NULL, string2 = NULL)
The ratio method returns a measure of the sequences' similarity as a float in the range [0, 1]. Where T is the total number of elements in both sequences and M is the number of matches, this is 2.0*M / T. Note that this is 1.0 if the sequences are identical, and 0.0 if they have nothing in common. This is expensive to compute if get_matching_blocks() or get_opcodes() hasn't already been called, in which case you may want to try quick_ratio() or real_quick_ratio() first to get an upper bound.
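The formula 2.0*M / T can be checked by hand. The sketch below is an illustration rather than the package's own code: it assumes the R class forwards to Python's standard difflib.SequenceMatcher (the example on this page guards on reticulate::py_available, and the slice notation used here is Python's). The helper name ratio_from_blocks is hypothetical.

```python
from difflib import SequenceMatcher


def ratio_from_blocks(a: str, b: str) -> float:
    """Recompute the similarity as 2.0*M / T from the matching blocks."""
    sm = SequenceMatcher(None, a, b)
    # M: total number of matched elements across all matching blocks
    matches = sum(block.size for block in sm.get_matching_blocks())
    # T: total number of elements in both sequences
    total = len(a) + len(b)
    # Two empty sequences are treated as identical
    return 2.0 * matches / total if total else 1.0


a, b = "abcd", "bcde"
manual = ratio_from_blocks(a, b)                 # "bcd" matches: 2*3 / 8
library = SequenceMatcher(None, a, b).ratio()
print(manual, library)
```

Here M is 3 (the shared run "bcd") and T is 8, so both values come out as 0.75.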
The quick_ratio method returns an upper bound on ratio() relatively quickly.
The real_quick_ratio method returns an upper bound on ratio() very quickly.
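Because both quick variants are upper bounds, they can serve as cheap filters before the expensive ratio() call. The following is a minimal sketch of that idea, again written against Python's difflib.SequenceMatcher on the assumption that it is what the R class wraps. The function name similar_enough and the threshold values are illustrative only.

```python
from difflib import SequenceMatcher


def similar_enough(a: str, b: str, threshold: float) -> bool:
    """Check ratio() >= threshold, trying the cheap upper bounds first."""
    sm = SequenceMatcher(None, a, b)
    # `and` short-circuits: if an upper bound is already below the
    # threshold, the more expensive checks are never evaluated.
    return (sm.real_quick_ratio() >= threshold
            and sm.quick_ratio() >= threshold
            and sm.ratio() >= threshold)


# Same characters in reverse order: the bounds are loose here
sm = SequenceMatcher(None, "abcd", "dcba")
bounds = (sm.real_quick_ratio(), sm.quick_ratio(), sm.ratio())
print(bounds)
print(similar_enough("abcd", "bcde", 0.7), similar_enough("abcd", "bcde", 0.8))
```

For "abcd" against "dcba" the two quick variants report 1.0 (same lengths, same characters), while ratio() is lower because the order differs. The bounds are therefore useful for rejecting candidates, not for accepting them.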
The get_matching_blocks method returns a list of triples describing matching subsequences. Each triple is of the form [i, j, n], and means that a[i:i+n] == b[j:j+n]. The triples are monotonically increasing in i and j. The last triple is a dummy, and has the value [len(a), len(b), 0] (the lengths of the two sequences). It is the only triple with n == 0. If [i, j, n] and [i', j', n'] are adjacent triples in the list, and the second is not the last triple in the list, then i+n != i' or j+n != j'; in other words, adjacent triples always describe non-adjacent equal blocks.
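The properties of the triples can be verified directly. This sketch uses Python's difflib.SequenceMatcher (assumed to be the implementation behind the R class) and zero-based Python slices, matching the a[i:i+n] notation above.

```python
from difflib import SequenceMatcher

a, b = "abxcd", "abcd"
blocks = SequenceMatcher(None, a, b).get_matching_blocks()

for i, j, n in blocks:
    # Every triple satisfies a[i:i+n] == b[j:j+n]
    print((i, j, n), repr(a[i:i + n]), repr(b[j:j + n]))

# The final triple is the dummy [len(a), len(b), 0]
last = tuple(blocks[-1])
```

The two real blocks are "ab" and "cd"; the "x" in a falls between them, which is why the blocks are not adjacent.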
The get_opcodes method returns a list of 5-tuples describing how to turn a into b. Each tuple is of the form [tag, i1, i2, j1, j2]. The first tuple has i1 == j1 == 0, and each remaining tuple has i1 equal to the i2 of the preceding tuple and, likewise, j1 equal to the preceding j2. The tag values are strings, with these meanings:
'replace': a[i1:i2] should be replaced by b[j1:j2].
'delete': a[i1:i2] should be deleted. Note that j1 == j2 in this case.
'insert': b[j1:j2] should be inserted at a[i1:i1]. Note that i1 == i2 in this case.
'equal': a[i1:i2] == b[j1:j2] (the sub-sequences are equal).
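One way to see what the opcodes mean is to replay them and rebuild b from a. The sketch below does this with Python's difflib.SequenceMatcher, assumed here to be what the R class calls; apply_opcodes is a hypothetical helper, not part of the package.

```python
from difflib import SequenceMatcher


def apply_opcodes(a: str, b: str) -> str:
    """Rebuild b by replaying the opcodes that turn a into b."""
    out = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            out.append(a[i1:i2])      # keep the shared run from a
        elif tag in ("replace", "insert"):
            out.append(b[j1:j2])      # take the new text from b
        # 'delete': a[i1:i2] is dropped, so nothing is appended
    return "".join(out)


a, b = "qabxcd", "abycdf"
opcodes = SequenceMatcher(None, a, b).get_opcodes()
rebuilt = apply_opcodes(a, b)
for tag, i1, i2, j1, j2 in opcodes:
    print(f"{tag:7} a[{i1}:{i2}]={a[i1:i2]!r} b[{j1}:{j2}]={b[j1:j2]!r}")
```

Because each tuple starts where the previous one ended, the opcodes cover both sequences without gaps, and replaying them in order always reproduces b.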
SequenceMatcher$new(string1 = NULL, string2 = NULL)
--------------
ratio()
--------------
quick_ratio()
--------------
real_quick_ratio()
--------------
get_matching_blocks()
--------------
get_opcodes()
new()
SequenceMatcher$new(string1 = NULL, string2 = NULL)
string1
a character string.
string2
a character string.
ratio()
SequenceMatcher$ratio()
quick_ratio()
SequenceMatcher$quick_ratio()
real_quick_ratio()
SequenceMatcher$real_quick_ratio()
get_matching_blocks()
SequenceMatcher$get_matching_blocks()
get_opcodes()
SequenceMatcher$get_opcodes()
clone()
The objects of this class are cloneable with this method.
SequenceMatcher$clone(deep = FALSE)
deep
Whether to make a deep clone.
https://www.npmjs.com/package/difflib, http://stackoverflow.com/questions/10383044/fuzzy-string-comparison
try({
  # Run only when Python is reachable through reticulate
  if (reticulate::py_available(initialize = FALSE)) {
    # ... and the package's availability check passes
    if (check_availability()) {

      library(fuzzywuzzyR)

      s1 = ' It was a dark and stormy night. I was all alone sitting on a red chair.'
      s2 = ' It was a murky and stormy night. I was all alone sitting on a crimson chair.'

      init = SequenceMatcher$new(string1 = s1, string2 = s2)

      # similarity in [0, 1], followed by its two cheaper upper bounds
      init$ratio()
      init$quick_ratio()
      init$real_quick_ratio()

      # matching subsequences, and the edit steps that turn s1 into s2
      init$get_matching_blocks()
      init$get_opcodes()
    }
  }
}, silent=TRUE)