ngrams: Compute N-Grams


View source: R/ngram.R

Description

Compute the n-grams (contiguous sub-sequences of length n) of a given sequence.

Arguments

x

a sequence (vector).

n

a positive integer giving the length of contiguous sub-sequences to be computed.

Value

a list with the computed sub-sequences, one element per n-gram, in the order in which they occur in x.
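
To make the definition concrete, the following is a minimal base R sketch of a sliding-window computation that yields the same kind of result. The name ngrams_sketch is hypothetical and is not part of the NLP package. Returning an empty list when x has fewer than n elements is an assumption of this sketch, not documented behaviour of ngrams.

```r
# Hypothetical helper for illustration only; not the NLP package's code.
ngrams_sketch <- function(x, n) {
  n <- as.integer(n)
  stopifnot(length(n) == 1L, !is.na(n), n >= 1L)
  len <- length(x)
  # Assumption: if x is shorter than n, no window fits, so return an empty list.
  if (len < n) return(list())
  # One window per valid start position i, covering elements i .. i + n - 1.
  lapply(seq_len(len - n + 1L), function(i) x[i:(i + n - 1L)])
}

w <- c("The", "quick", "brown", "fox")
bigrams <- ngrams_sketch(w, 2L)
```

A sequence of length N has N - n + 1 contiguous sub-sequences of length n, so bigrams here holds three elements.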

Examples

s <- "The quick brown fox jumps over the lazy dog"
## Split into words:
w <- strsplit(s, " ", fixed = TRUE)[[1L]]
## Word tri-grams:
ngrams(w, 3L)
## Word tri-grams pasted together:
vapply(ngrams(w, 3L), paste, "", collapse = " ")

Example output

[[1]]
[1] "The"   "quick" "brown"

[[2]]
[1] "quick" "brown" "fox"  

[[3]]
[1] "brown" "fox"   "jumps"

[[4]]
[1] "fox"   "jumps" "over" 

[[5]]
[1] "jumps" "over"  "the"  

[[6]]
[1] "over" "the"  "lazy"

[[7]]
[1] "the"  "lazy" "dog" 

[1] "The quick brown" "quick brown fox" "brown fox jumps" "fox jumps over" 
[5] "jumps over the"  "over the lazy"   "the lazy dog"   

NLP documentation built on Oct. 23, 2020, 6:18 p.m.
