ngram: Fast n-Gram 'Tokenization'

An n-gram is a sequence of n "words" taken, in order, from a body of text. The ngram package is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The tokenization and babbling are handled by efficient C code, which can also be built as a standalone library. The babbler is a simple Markov chain. The package also ships a vignette with complete example workflows and a description of the utilities it offers.
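To make the two ideas above concrete, here is a minimal base R sketch of n-gram extraction and Markov-chain babbling. It is not the package's C implementation: the function names `make_ngrams` and `babble_sketch` are hypothetical, whitespace splitting is an assumed tokenization rule, and the babbler uses a first-order chain over single words for brevity.

```r
# Illustrative sketch only; make_ngrams and babble_sketch are hypothetical
# helpers, not functions from the ngram package.

# Split text on whitespace and collect every run of n consecutive words.
make_ngrams <- function(text, n = 2) {
  words <- strsplit(trimws(text), "[[:space:]]+")[[1]]
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Generate genlen words with a first-order Markov chain: the next word is
# drawn from the words observed immediately after the current word.
babble_sketch <- function(text, genlen = 10, seed = 1) {
  set.seed(seed)
  words <- strsplit(trimws(text), "[[:space:]]+")[[1]]
  # Map each word to the vector of words that followed it in the text.
  followers <- split(words[-1], words[-length(words)])
  out <- character(genlen)
  current <- words[sample.int(length(words), 1)]
  for (i in seq_len(genlen)) {
    out[i] <- current
    nxt <- followers[[current]]
    # Dead end (word seen only at the end of the text): restart anywhere.
    if (is.null(nxt)) nxt <- words
    current <- nxt[sample.int(length(nxt), 1)]
  }
  paste(out, collapse = " ")
}

bigrams <- make_ngrams("a b a c a b b", n = 2)
table(bigrams)                          # frequency of each distinct bigram
babble_sketch("a b a c a b b", genlen = 8)
```

Repeated followers are kept as duplicates in `followers`, so sampling uniformly from that vector reproduces the observed transition frequencies without building an explicit probability table.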



Package details
Author	Drew Schmidt [aut, cre], Christian Heckendorf [aut]
Maintainer	Drew Schmidt <wrathematics@gmail.com>
License	BSD 2-clause License + file LICENSE
Version	3.2.3
URL	https://github.com/wrathematics/ngram
Package repository	CRAN
Installation	Install the latest version of this package by entering the following in R: `install.packages("ngram")`
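
Once the package is installed, a first session might look like the sketch below. It assumes the `ngram()`, `get.phrasetable()`, `get.ngrams()`, and `babble()` functions behave as described in the package's man pages; the input string, the `genlen` value, and the seed are arbitrary choices for illustration, so check the vignette for the authoritative workflow.

```r
# Minimal usage sketch; assumes the ngram package is installed.
library(ngram)

x <- "a b a c a b b"

# Tokenize the string into 2-grams.
ng <- ngram(x, n = 2)

# Tabulate each distinct 2-gram with its frequency and proportion.
pt <- get.phrasetable(ng)
print(pt)

# List the distinct 2-grams as a character vector.
grams <- get.ngrams(ng)

# Generate 5 words from the Markov chain; fixing the seed makes the
# output reproducible across runs.
babble(ng, genlen = 5, seed = 1234)
```

The phrase table is usually the quickest way to inspect what the tokenizer produced before babbling from it.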