View source: R/01-constructor.r
ngram | R Documentation |
The ngram() function is the main workhorse of this package. It takes an input string and converts it into the internal n-gram representation.
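To make "n-gram" concrete, here is a minimal base-R sketch of what the word n-grams of a string look like. It is an illustration only, not the package's internal representation: word_ngrams() is a hypothetical helper, and it splits on a single fixed separator rather than following the package's sep rules.

```r
# Hypothetical helper (not part of the ngram package): list the word
# n-grams of a string by sliding a window of n words over it.
word_ngrams <- function(x, n = 2, sep = " ") {
  # Split on one fixed separator and drop empty tokens.
  words <- strsplit(x, sep, fixed = TRUE)[[1]]
  words <- words[nzchar(words)]
  # Fewer than n words means there are no n-grams.
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(
    starts,
    function(i) paste(words[i:(i + n - 1)], collapse = " "),
    character(1)
  )
}

bigrams <- word_ngrams("A B A C A B B", n = 2)
print(bigrams)
# "A B" "B A" "A C" "C A" "A B" "B B"

# Counting repeated n-grams gives a simple frequency table.
print(table(bigrams))
```

The string "A B A C A B B" has seven words, so it yields six 2-grams, with "A B" occurring twice.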
ngram(str, n = 2, sep = " ")
str | The input text. |
n | The 'n' as in 'n-gram'. |
sep | A set of separator characters for the "words". See details for information about how this works; it works a little differently from |
On evaluation, a copy of the input string is produced and stored as an external pointer. This is necessary because the internal list representation just points to the first character of each word in the input string. So if you (or R's gc) delete the input string, basically all hell breaks loose.
The sep parameter splits at any of the characters in the string. So sep=", " splits at a comma or a space.
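The "split at any of the characters" rule can be sketched in base R as follows. This is an analogy under stated assumptions, not the package's implementation: split_any() is a hypothetical helper, and dropping the empty tokens between consecutive separators is an assumption made here for simplicity.

```r
# Hypothetical helper (not part of the ngram package): treat every
# character of `sep` as a separator and split `x` at any of them.
split_any <- function(x, sep = " ") {
  seps   <- strsplit(sep, "")[[1]]   # individual separator characters
  chars  <- strsplit(x, "")[[1]]     # individual input characters
  is_sep <- chars %in% seps
  # Assumption: an input made only of separators yields no words.
  if (all(is_sep)) return(character(0))
  # Each separator starts a new group; paste each group's characters
  # back together to form one word.
  grp   <- cumsum(is_sep)
  words <- tapply(chars[!is_sep], grp[!is_sep], paste, collapse = "")
  unname(as.vector(words))
}

str <- "A,B,A,C A B B"
print(split_any(str, sep = " "))   # "A,B,A,C" "A" "B" "B"
print(split_any(str, sep = ","))   # "A" "B" "A" "C A B B"
print(split_any(str, sep = ", "))  # "A" "B" "A" "C" "A" "B" "B"
```

Note that sep=", " is read as the set of two characters "," and " ", not as the two-character sequence comma-then-space.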
An ngram class object.
ngram-class, getters, phrasetable, babble
library(ngram)
str = "A B A C A B B"
ngram(str, n=2)
str = "A,B,A,C A B B"
### Split at a space
print(ngram(str), output="full")
### Split at a comma
print(ngram(str, sep=","), output="full")
### Split at a space or a comma
print(ngram(str, sep=", "), output="full")