ngrams: Create n-grams


View source: R/text.table.R

Description

Create n-grams

Usage

ngrams(
  x,
  text,
  group_by = c(),
  count_col_name = "count",
  n,
  ngram_prefix = NULL
)

Arguments

x

A text.table created by as.text.table().

text

A string, the name of the column in x to build n-grams with.

group_by

A vector of column names to group by. List columns are not supported as grouping columns.

count_col_name

A string, the name of the output column containing the number of times each base record appears in the group.

n

An integer, the number of grams to make.

ngram_prefix

A string, a prefix to add to the output n-gram columns.

Value

A text.table, with columns added for n-grams (the word, the count, and the percentage of the time the gram follows the word).
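
To make the returned quantities concrete, here is a minimal base R sketch of the idea for n = 2. It is an illustration only and does not use or reproduce the textTools implementation. The sample sentence and the names tokens, bigrams, counts, next_word and percent are assumptions chosen for this example, not column names produced by ngrams().

```r
# Illustrative only: plain base R, not the textTools implementation.
# Split a lower-cased sentence into words.
tokens <- strsplit(tolower("The dog is nice because the dog is kind"), " ")[[1]]

# Pair each word with the word that follows it (n = 2).
bigrams <- data.frame(
  word = head(tokens, -1),
  next_word = tail(tokens, -1)
)

# Count how often each (word, next_word) pair occurs.
counts <- aggregate(n ~ word + next_word, data = cbind(bigrams, n = 1L), FUN = sum)

# Share of the time next_word follows word, out of all pairs starting with word.
counts$percent <- counts$n / ave(counts$n, counts$word, FUN = sum)

print(counts)
```

In this sketch, "is" is followed once by "nice" and once by "kind", so each of those pairs gets a share of 0.5, while "the" is always followed by "dog".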

Examples

library(data.table)  # provides as.data.table()
library(textTools)

# Build bigrams (n = 2) from col2, grouped by col1.
ngrams(
  as.text.table(
    x = as.data.table(
      list(
        col1 = c(
          "a",
          "b"
        ),
        col2 = c(
          tolower("The dog is nice because it picked up the newspaper."),
          tolower("The dog is extremely nice because it does the dishes.")
        )
      )
    ),
    text = "col2",
    split = " "
  ),
  text = "col2",
  group_by = "col1",
  n = 2
)

textTools documentation built on Feb. 5, 2021, 5:07 p.m.
