The Soundex phonetic algorithms
soundex(word, maxCodeLen = 4L, clean = TRUE)
refinedSoundex(word, maxCodeLen = 10L, clean = TRUE)
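word | string or vector of strings to encode
maxCodeLen | maximum length of the resulting encodings, in characters
clean | if TRUE, return NA with a warning for inputs containing unknown alphabetical characters; if FALSE, attempt to encode them anyway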
The function soundex phonetically encodes the given string using the soundex algorithm. The function refinedSoundex uses Apache's refined soundex algorithm. Both implementations are loosely based on the Apache Commons Java editions.
The argument maxCodeLen limits the length of the returned encoding, in characters.
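To make the encoding step concrete, here is a minimal base-R sketch of the classic Soundex rules: keep the first letter, map consonants to digits, collapse repeated digits, drop vowels, then pad or truncate to the maximum length. The helper name soundex_sketch and its max_len argument are hypothetical and exist only for illustration. This is not the package's implementation, and the real soundex() may differ in edge cases such as padding and input validation.

```r
# Hypothetical helper, not part of the package: classic Soundex for one word.
soundex_sketch <- function(word, max_len = 4L) {
  # Keep ASCII letters only and split into upper-case characters.
  chars <- strsplit(toupper(gsub("[^A-Za-z]", "", word)), "")[[1]]
  if (length(chars) == 0L) return(NA_character_)

  # One digit per letter A..Z; "0" marks vowels, Y, H and W.
  codes <- strsplit("01230120022455012623010202", "")[[1]]
  names(codes) <- LETTERS
  digits <- unname(codes[chars])

  # H and W after the first letter are skipped entirely, so they do not
  # separate two consonants that share a digit.
  keep <- !(chars %in% c("H", "W"))
  keep[1] <- TRUE
  digits <- digits[keep]

  # Collapse runs of the same digit, then drop the first letter's own
  # digit and all vowel markers.
  digits <- digits[c(TRUE, digits[-1] != digits[-length(digits)])]
  tail_digits <- digits[-1]
  tail_digits <- tail_digits[tail_digits != "0"]

  # First letter plus digits, zero-padded and cut to max_len characters.
  code <- paste0(chars[1], paste(tail_digits, collapse = ""))
  substr(paste0(code, strrep("0", max_len)), 1L, max_len)
}

vapply(c("Robert", "Rupert"), soundex_sketch, character(1), USE.NAMES = FALSE)
# "R163" "R163"
```

Names that sound alike collapse to the same code, which is what makes the encoding useful for fuzzy matching of surnames.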
The soundex and refinedSoundex algorithms are only defined for inputs over the standard English alphabet, i.e., "A-Z". Non-alphabetical characters are removed from the string in a locale-dependent fashion. This strips spaces, hyphens, and numbers. Other letters, such as "Ü", may be permissible in the current locale but are unknown to soundex and refinedSoundex. For inputs outside their known range, the output is undefined: NA is returned and a warning is thrown. If clean is FALSE, soundex and refinedSoundex attempt to process the strings anyway. The default is TRUE.
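The effect of clean can be tried with a short session like the one below. It assumes the phonics package is installed, and the input strings are arbitrary illustrations. The comments restate the behaviour described above rather than verified output, and results for non-English letters may vary with the current locale.

```r
library(phonics)  # assumes the phonics package is installed

# Spaces, hyphens, and digits are stripped before encoding.
stripped <- soundex("Smith-Jones 3")

# "\u00fc" is the letter "ü", which lies outside A-Z. With the default
# clean = TRUE the documentation describes an NA result plus a warning;
# the warning is suppressed here only to keep the output tidy.
checked <- suppressWarnings(soundex("M\u00fcller"))

# With clean = FALSE the function attempts an encoding anyway; the
# result for such input is undefined.
unchecked <- suppressWarnings(soundex("M\u00fcller", clean = FALSE))

print(list(stripped = stripped, checked = checked, unchecked = unchecked))
```

Leaving clean at its default is the safer choice when the input may contain names with diacritics, since an NA is easier to detect downstream than an arbitrary code.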
A character vector of soundex encodings.
The soundex and refinedSoundex algorithms are only defined for inputs over the standard English alphabet, i.e., "A-Z". For inputs outside this range, the output is undefined.
Charles P. Bourne and Donald F. Ford, "A study of methods for systematically abbreviating English words and names," Journal of the ACM, vol. 8, no. 4 (1961), p. 538-552.
James P. Howard, II, "Phonetic Spelling Algorithm Implementations for R," Journal of Statistical Software, vol. 95, no. 8 (2020), p. 1-21, <10.18637/jss.v095.i08>.
Howard B. Newcombe, James M. Kennedy, "Record linkage: making maximum use of the discriminating power of identifying information," Communications of the ACM, vol. 5, no. 11 (1962), p. 563-566.
Other phonics: caverphone(), cologne(), lein(), metaphone(), mra_encode(), nysiis(), onca(), phonex(), phonics(), rogerroot(), statcan()