find_ngrams: Convert a list of tokens to ngrams

View source: R/find_ngrams.R

Description

Takes a list of character vectors (one vector of tokens per text) and returns a list of character vectors that includes n-grams.

Usage

find_ngrams(dat, n, verbose = FALSE)

Arguments

dat

a list of character vectors

n

the order of the n-grams to compute (e.g. 3 for trigrams)

verbose

whether to print a progress bar

Value

a list of character vectors

References

http://stackoverflow.com/questions/16489748/converting-a-list-of-tokens-to-n-grams
https://github.com/markvanderloo/stringdist/issues/39
https://gist.github.com/markvanderloo/9ae6a15f7d74a0159aec

Examples

find_ngrams(
  list(
    # ordinary token vectors of increasing length
    c('one'),
    c('sent', 'one'),
    c('this', 'is', 'sentence', 'two'),
    c('this', 'is', 'sentence', 'three', 'sentence', 'three'),
    c('finally', 'we', 'have', 'a', 'fourth', 'longer', 'sentence'),
    # edge-case inputs: empty vector, NULL, and missing values
    character(0),
    NULL,
    NA,
    NaN
  ),
  n = 3,
  verbose = TRUE
)
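
To illustrate what an n-gram expansion of a token vector can look like, here is a minimal base-R sketch. The helper name ngrams_sketch is hypothetical and is not part of the package, and this is not the find_ngrams implementation. It assumes the result keeps the original tokens and appends every 2-gram up to n-gram, joined by a single space, so the real function's output may differ in order or format.

```r
# Hypothetical helper, not part of r2vec: expand one token vector into
# its tokens plus all 2-grams up to n-grams, joined with a single space.
ngrams_sketch <- function(tokens, n) {
  stopifnot(is.numeric(n), n >= 1)
  len <- length(tokens)
  # Nothing to combine for empty or single-token input, or when n is 1
  if (len <= 1 || n == 1) return(tokens)
  # Gram sizes 2 .. min(n, len); larger windows do not fit in the vector
  grams <- lapply(seq_len(min(n, len))[-1], function(k) {
    # Start positions of every window of k consecutive tokens
    starts <- seq_len(len - k + 1)
    vapply(starts,
           function(i) paste(tokens[i:(i + k - 1)], collapse = " "),
           character(1))
  })
  c(tokens, unlist(grams))
}

# Apply the helper to each element of a list of token vectors
lapply(
  list(c("sent", "one"), c("this", "is", "sentence", "two")),
  ngrams_sketch,
  n = 3
)
```

The sketch handles one vector at a time and relies on lapply for the list, which keeps the edge cases (empty or length-one input) in a single early return.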

zachmayer/r2vec documentation built on May 4, 2019, 9:05 p.m.