Stem a set of terms using one of the algorithms provided by the Snowball stemming library.
stem_snowball(x, algorithm = "en")
x: character vector of terms to stem.
algorithm: stemming algorithm; see ‘Details’ for the valid choices.
Apply a Snowball stemming algorithm to a vector of input terms, x,
returning the result in a character vector of the same length with the
same names.
The algorithm argument specifies the stemming algorithm. Valid choices
include the following:
"ar" ("arabic"),
"da" ("danish"),
"de" ("german"),
"en" ("english"),
"es" ("spanish"),
"fi" ("finnish"),
"fr" ("french"),
"hu" ("hungarian"),
"it" ("italian"),
"nl" ("dutch"),
"no" ("norwegian"),
"pt" ("portuguese"),
"ro" ("romanian"),
"ru" ("russian"),
"sv" ("swedish"),
"ta" ("tamil"),
"tr" ("turkish"),
and "porter".
Setting algorithm = NULL gives a stemmer that returns its input
unchanged.
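As a minimal sketch of how the algorithm argument might be used, the snippet below stems a small named vector with the default English algorithm and with algorithm = NULL. The input terms and their names are made up for illustration, and the call with the long name "english" assumes that the parenthesized names in the list above are accepted as alternatives to the short codes.

```r
library(corpus)

# hypothetical named input, used only for illustration
terms <- c(a = "running", b = "runs", c = "ran")

# default English stemmer, selected by its short code
stemmed <- stem_snowball(terms, algorithm = "en")

# assumption: the long name selects the same algorithm as the short code
stemmed_long <- stem_snowball(terms, algorithm = "english")

# algorithm = NULL is documented to return the input unchanged
unstemmed <- stem_snowball(terms, algorithm = NULL)

print(stemmed)
print(unstemmed)
```

Passing NULL can be handy when the stemming step should be switched off without changing the surrounding code.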
The function only stems single-word terms of kind "letter"; it leaves other inputs (multi-word terms, and terms of kind "number", "punct", and "symbol") unchanged.
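The rule above can be checked with a short sketch like the following. The inputs are hypothetical, chosen to cover one single-word letter term, one multi-word term, one number, and one punctuation mark; the comparison column simply flags entries that came back unchanged.

```r
library(corpus)

# hypothetical inputs: a letter term, a multi-word term, a number, punctuation
terms <- c("winning", "winning streak", "2019", "!")
stemmed <- stem_snowball(terms)

# TRUE where the stemmer left the entry as it was
data.frame(term = terms, stem = stemmed, unchanged = terms == stemmed)
```

According to the description above, only the first entry should be eligible for stemming.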
The Snowball stemming library
provides the underlying implementation. The wordStem function from
the SnowballC package provides a similar interface, but that function
applies the algorithm to all input terms, regardless of the kind of the term.
Returns a character vector with the same length and names as the input, x,
whose entries are the corresponding stems.
See also: new_stemmer, text_filter.
# apply english stemming algorithm; don't stem non-letter terms
stem_snowball(c("win", "winning", "winner", "#winning"))
# [1] "win" "win" "winner" "#winning"

# compare with SnowballC, which stems all kinds, not just letter
## Not run: SnowballC::wordStem(c("win", "winning", "winner", "#winning"), "en")