This function performs lemmatization on input text by reducing words to their base units.
inputText | A character string or vector of character strings.
method | Either 'direct' (which uses a predefined list of words and their lemmas) or 'treetagger' (which uses the TreeTagger software).
treetaggerDirectory | The file path to the location of your installation of the TreeTagger software.
progressBar | Show a progress bar. Defaults to TRUE.
This function is essentially a wrapper for the treetag function from the koRpus package. In turn, koRpus calls the TreeTagger software package (available here: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/). The software must be downloaded and installed on your local computer in order to use the lemmatize function with the 'treetagger' method. Once it is installed, the treetaggerDirectory argument should be the path where the software was installed.
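For orientation, the underlying koRpus call can be sketched roughly as follows. This is an illustration, not the package's actual source: the helper name tag_with_treetagger is hypothetical, and the sketch assumes English text, that the koRpus and koRpus.lang.en packages are installed, and that library(koRpus.lang.en) has been called so the "en" preset is available.

```r
# Hypothetical helper (not part of this package): tag text with TreeTagger via koRpus.
# Assumes library(koRpus.lang.en) was called beforehand to register English support.
tag_with_treetagger <- function(text, treetaggerDirectory) {
  # Fail early if the TreeTagger installation path does not exist.
  if (!dir.exists(treetaggerDirectory)) {
    stop("TreeTagger directory not found: ", treetaggerDirectory)
  }
  if (!requireNamespace("koRpus", quietly = TRUE)) {
    stop("The koRpus package must be installed.")
  }
  tagged <- koRpus::treetag(
    text,
    format = "obj",            # tag an in-memory character vector, not a file
    treetagger = "manual",     # use the settings given in TT.options
    lang = "en",
    TT.options = list(path = treetaggerDirectory, preset = "en")
  )
  # Return the token table, which includes the token, tag, and lemma columns.
  koRpus::taggedText(tagged)
}
```

A call would then look like tag_with_treetagger("The geese ran away.", "path/to/your/TreeTagger"), where the path is a placeholder for your own installation directory.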
This function performs "lemmatization," which is one form of reducing words to their most basic units. It is more thorough than "stemming," which only removes suffixes. E.g. for the words "walked" and "dogs," both lemmatization and stemming would reduce the words to "walk" and "dog." However, stemming would ignore "ran" and "geese," while lemmatization would properly render these "run" and "goose."
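The difference can be illustrated with a small base R sketch. The four-word lookup table and both helper functions are made up for this illustration; the real 'direct' method uses a much larger predefined list, and real stemmers apply more rules than the single suffix pattern shown here.

```r
# Hypothetical mini-dictionary mapping word forms to lemmas (illustration only).
lemma_table <- c(walked = "walk", dogs = "dog", ran = "run", geese = "goose")

# Dictionary-based lemmatization: look each word up; keep it unchanged if absent.
lemmatize_lookup <- function(words) {
  hits <- lemma_table[words]
  unname(ifelse(is.na(hits), words, hits))
}

# Naive stemming: strip a trailing "ed" or "s" and nothing else.
stem_naive <- function(words) {
  sub("(ed|s)$", "", words)
}

words <- c("walked", "dogs", "ran", "geese")
lemmatize_lookup(words)  # "walk" "dog" "run" "goose"
stem_naive(words)        # "walk" "dog" "ran" "geese"
```

The irregular forms "ran" and "geese" pass through the suffix rule unchanged, which is why a lookup or a tagger is needed to recover "run" and "goose".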
A data frame containing the lemmatized text, along with columns giving part-of-speech information.
See the treetag function from the koRpus package, as well as the TreeTagger documentation: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/