View source: R/make_lemma_dictionary.R
Given a set of text strings, the function generates a dictionary of lemmas corresponding to words that are not in base form.
engine | One of: "hunspell", "treetagger" or "lexicon". The lexicon and hunspell choices use the lexicon and hunspell packages, which may be faster than TreeTagger and need no external tools, but are likely less accurate. TreeTagger is likely more accurate but requires installing the TreeTagger program (http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger).
path | Path to the TreeTagger program if
lang | A character string naming the language to be used in koRpus (treetagger) or hunspell. The default language is
... | A vector of texts to generate lemmas for.
Returns a two-column data.frame with tokens and corresponding lemmas.
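To illustrate how a two-column token/lemma table of this kind can be used, here is a minimal base R sketch. The column names `token` and `lemma`, the hand-built `lemma_dict` table, and the helper `apply_lemmas()` are all assumptions made for illustration; they are not output from, or part of, the package.

```r
# Hypothetical hand-built dictionary with the two-column shape described
# above (column names are assumed); this is not real function output.
lemma_dict <- data.frame(
  token = c("dirtier", "eaten", "pies"),
  lemma = c("dirty", "eat", "pie"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: swap each word found in the token column for its lemma.
apply_lemmas <- function(text, dict) {
  words <- strsplit(tolower(text), "\\s+")[[1]]
  hits <- match(words, dict$token)          # NA where a word has no entry
  found <- !is.na(hits)
  words[found] <- dict$lemma[hits[found]]   # replace only the matched words
  paste(words, collapse = " ")
}

apply_lemmas("the dirtier dog has eaten the pies", lemma_dict)
# [1] "the dirty dog has eat the pie"
```

Words already in base form have no row in the table, so the lookup leaves them unchanged.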
x <- c('the dirtier dog has eaten the pies',
'that shameful pooch is tricky and sneaky',
"He opened and then reopened the food bag",
'There are skies of blue and red roses too!'
)
make_lemma_dictionary(x)
## Not run:
make_lemma_dictionary(x, engine = 'treetagger')
## End(Not run)
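As a further sketch, the "lexicon" engine listed under Arguments can be selected in the same way, and the resulting dictionary can then be applied to the texts. This assumes the textstem package is installed and that its `lemmatize_strings()` function is given the table through its `dictionary` argument; treat it as an illustration rather than the documented workflow.

```r
# Assumes the textstem package (and the lexicon package it uses) is installed.
library(textstem)

x <- c('the dirtier dog has eaten the pies',
       'that shameful pooch is tricky and sneaky')

# Build the dictionary with the lexicon engine, which needs no external program.
lemma_dictionary <- make_lemma_dictionary(x, engine = 'lexicon')

# Apply the dictionary to the same texts.
lemmatize_strings(x, dictionary = lemma_dictionary)
```

Building the dictionary from the texts themselves keeps it limited to the words that actually occur in them.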