trinker/lemmar: Dictionary Based Lemmatization

lemmar uses tokenization and dictionary lookup to lemmatize text. Lemmatization is defined as "grouping together the inflected forms of a word so they can be analysed as a single item" (Wikipedia). Although dictionary lookup of tokens is not a true morphological analysis, this style of lemma replacement is fast and remains robust enough for many applications.
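The sketch below is a minimal illustration of the dictionary-lookup idea described above, not the package's own API: each token is looked up in a named lemma dictionary and replaced when a match exists, otherwise the original token is kept. The tiny dictionary and the lemmatize_tokens() helper are assumptions made for illustration only.

# Toy lemma dictionary: names are inflected forms, values are lemmas
lemma_dict <- c(ran = "run", running = "run", geese = "goose", better = "good")

lemmatize_tokens <- function(tokens, dictionary) {
  lemmas <- dictionary[tolower(tokens)]   # dictionary lookup by token
  misses <- is.na(lemmas)
  lemmas[misses] <- tokens[misses]        # fall back to the original token when no lemma is found
  unname(lemmas)
}

tokens <- c("The", "geese", "ran", "better")
lemmatize_tokens(tokens, lemma_dict)
#> [1] "The"   "goose" "run"   "good"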

Getting started

Package details

Maintainer: Tyler Rinker <[email protected]>
License: GPL-2
Version: 0.0.1
Package repository: https://github.com/trinker/lemmar
Installation

Install the latest version of this package by entering the following in R:
install.packages("devtools")
library(devtools)
install_github("trinker/lemmar")