Returns the lemmata of the words in a text vector.
lemmatizer(
  rawtext,
  lang = "it",
  TreeTaggerPath = "C:/TreeTagger",
  parallel = TRUE
)
rawtext | The raw texts to lemmatize.
lang | Language of the texts. Defaults to "it" (Italian). It supports the following languages:
TreeTaggerPath | The file path of the local TreeTagger installation (default "C:/TreeTagger").
parallel | Enables parallel processing, which speeds up lemmatization by using multiple cores (default TRUE). The number of cores is automatically set to the number of available cores minus one.
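The "available cores minus one" rule described for parallel can be sketched with base R's parallel package. This is only an illustration of the rule, not the function's actual code; the variable names available and n_workers are hypothetical.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1.
available <- detectCores()
if (is.na(available)) available <- 1L

# Leave one core free, but never go below one worker.
n_workers <- max(1L, available - 1L)
n_workers
```

On a single-core machine this sketch still yields one worker, which is a choice made here for safety rather than documented behavior of lemmatizer.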
The function is based on TreeTagger and the related R package koRpus. To install TreeTagger, please refer to its online documentation. The language-specific files available in the following repository are also needed. The function returns the lemmata of the "significant" words (nouns, names, adjectives, verbs, and adverbs), the word classes most commonly used in social science works. Unrecognized words are also returned.
A text vector with lemmata (nouns, names, adjectives, verbs, adverbs, and unrecognized words).
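The kind of filtering described above can be illustrated on a mock tagging result. This is a minimal sketch under stated assumptions: the data frame tagged, its column names, and its word-class labels are hypothetical stand-ins for tagger output, and replacing an unrecognized lemma with the original token is an assumption about what "unrecognized words are returned" means. The only detail taken from TreeTagger itself is that it reports "<unknown>" as the lemma of a word it does not recognize.

```r
# Hypothetical tagger output: one row per token. The column names and
# word-class labels are assumptions for illustration only.
tagged <- data.frame(
  token  = c("i", "gatti", "dormono", "spesso", "xyzzy"),
  lemma  = c("il", "gatto", "dormire", "spesso", "<unknown>"),
  wclass = c("article", "noun", "verb", "adverb", "unknown"),
  stringsAsFactors = FALSE
)

# Keep the "significant" word classes plus anything the tagger
# could not lemmatize.
keep <- c("noun", "name", "adjective", "verb", "adverb")
kept <- tagged[tagged$wclass %in% keep | tagged$lemma == "<unknown>", ]

# For unrecognized words, fall back to the original token.
lemmata <- ifelse(kept$lemma == "<unknown>", kept$token, kept$lemma)
paste(lemmata, collapse = " ")
# → "gatto dormire spesso xyzzy"
```

The article "i" is dropped because its word class is not in the kept set, while "xyzzy" survives as an unrecognized word.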
## Not run:
dataframe$lemma <- lemmatizer(rawtext = dataframe$text, lang = "it",
                              TreeTaggerPath = "C:/TreeTagger", parallel = TRUE)
## End(Not run)