str_find | R Documentation |
This function finds the element indices of partially matching or similar strings in a character vector. It can be used to find exact or slightly mistyped elements in a string vector.
str_find(string, pattern, precision = 2, partial = 0, verbose = FALSE)
string |
Character vector with string elements. |
pattern |
String that should be matched against the elements of string. |
precision |
Maximum distance ("precision") between two string elements that is allowed for them to be treated as similar or equal. Smaller values mean less tolerance in matching. |
partial |
Activates similar matching (close distance strings) for parts (substrings) of the string. Default value is 0. See 'Details' for more information. |
verbose |
Logical; if |
Computation Details
Fuzzy string matching is based on regular expressions, in particular grep(pattern = "(<pattern>){~<precision>}", x = string). This means that precision indicates the number of chars inside pattern that may differ in string for it to still be considered "matching". The higher precision is, the more tolerant the search is (i.e. it yields more possible matches). Furthermore, the higher the value for partial is, the more matches may be found.
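The pattern construction described above can be sketched in a few lines. This is only an illustration, not the package's actual code: fuzzy_grep_sketch is a hypothetical helper name, and the sketch assumes that R's default regex engine accepts the {~n} approximate-matching syntax quoted above and that pattern contains no regex metacharacters.

```r
# Hypothetical helper (not part of the package): build the approximate-matching
# pattern "(<pattern>){~<precision>}" and pass it to grep().
fuzzy_grep_sketch <- function(string, pattern, precision = 2) {
  # Assumption: `pattern` contains no regex metacharacters, so it can be
  # pasted into the regular expression unescaped.
  approx_pattern <- sprintf("(%s){~%d}", pattern, as.integer(precision))
  # grep() returns the indices of the elements that match.
  grep(pattern = approx_pattern, x = string)
}

string <- c("System", "Apple", "Old")
# "Systen" differs from "System" in one character, so one allowed
# difference should be enough to find the first element.
fuzzy_grep_sketch(string, "Systen", precision = 1)
```

Note that this sketch is case sensitive, because grep() is called with its default ignore.case = FALSE.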
Partial Distance Matching
For partial = 1, a substring with the same number of chars as pattern is extracted from string, starting at position 0 in string and moving on until the end of string is reached. Each substring is matched against pattern, and results with a maximum distance of precision are considered "matching". If partial = 2, the range of the extracted substring is increased by 2, i.e. the extracted substring is two chars longer, and so on.
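The sliding-window idea can be illustrated for a single string as follows. This is a rough sketch under stated assumptions, not the function's implementation: partial_match_sketch is a hypothetical name, the edit distance is computed with utils::adist() as a stand-in for whatever distance measure the function uses internally, comparison is made case insensitive, and windows start at R's first character position (index 1).

```r
# Hypothetical helper (not part of the package): slide a window over one string
# and report whether any window lies within `precision` edits of `pattern`.
partial_match_sketch <- function(x, pattern, precision = 2, partial = 1) {
  # Window width: nchar(pattern) for partial = 1, two chars wider for partial = 2.
  width <- nchar(pattern) + (if (partial >= 2) 2 else 0)
  if (nchar(x) <= width) {
    # The string is no longer than one window, so compare it as a whole.
    windows <- x
  } else {
    starts <- seq_len(nchar(x) - width + 1)
    windows <- substring(x, starts, starts + width - 1)
  }
  # Edit distance between each (lower-cased) window and the pattern.
  d <- utils::adist(tolower(windows), tolower(pattern))
  any(d <= precision)
}

# The window "Pistols" differs from "postils" in two characters.
partial_match_sketch("We are Sex Pistols!", "postils")
# A short, unrelated string is not within two edits of the pattern.
partial_match_sketch("Old", "postils")
```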
A numeric vector with the index positions of the elements in string that partially match or are similar to pattern. Returns -1 if no match was found.
This function does not return the position of a matching string inside another string, but the index of the element in the string vector where a (partial) match with pattern was found. Thus, searching for "abc" in a string "this is abc" will not return 9 (the start position of the substring), but 1 (the element index, which is always 1 if string only has one element).
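The difference between an element index and a character position can be seen with base R alone. The snippet below does not call str_find; it uses grep() and regexpr() only to contrast the two kinds of result, and the variable names are illustrative.

```r
x <- "this is abc"

# Element index: which elements of the vector contain the pattern.
element_index <- grep("abc", x, fixed = TRUE)

# Character position: where the pattern starts inside the element.
char_position <- as.integer(regexpr("abc", x, fixed = TRUE))

element_index  # 1
char_position  # 9
```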
group_str
string <- c("Hello", "Helo", "Hole", "Apple", "Ape", "New", "Old", "System", "Systemic")
str_find(string, "hel") # partial match
str_find(string, "stem") # partial match
str_find(string, "R") # no match
str_find(string, "saste") # similarity to "System"
# finds two indices, because partial matching now
# also applies to "Systemic"
str_find(string, "sytsme", partial = 1)
# finds partial matching of similarity
str_find("We are Sex Pistols!", "postils")