text2vocab: Convert a vector of texts into a vocabulary object


View source: R/text2vocab.R

Description

Convert a vector of texts, supplied as a text2vec token iterator, into a vocabulary object.

Usage

text2vocab(
  it,
  documentCount,
  ngram = c(1L, 3L),
  documentMinimum = 10,
  stopWords = stopwords()
)

Arguments

it

a text2vec token iterator

documentCount

the number of documents in the dataset

ngram

an integer vector of length two giving the minimum and maximum n-gram length (the default c(1L, 3L) covers unigrams through trigrams)

documentMinimum

the minimum number of documents a term must appear in to be kept in the vocabulary

stopWords

words to eliminate from the vocabulary object (use character(0) for none)
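
The arguments above map naturally onto text2vec's own vocabulary functions. The following is a minimal sketch of that mapping, not the package's actual source: `sketch_text2vocab` is a hypothetical stand-in, the toy corpus and stop-word list are invented for illustration, and `documentCount` is left out because its role is not described here. Only `itoken()`, `word_tokenizer()`, `create_vocabulary()`, and `prune_vocabulary()` are real text2vec calls.

```r
library(text2vec)

# Toy corpus (illustrative only); a real call would pass the full dataset.
texts <- c(
  "the whale surfaced near the ship",
  "the ship followed the white whale",
  "a white sail appeared on the horizon"
)

# Build the kind of token iterator that the `it` argument expects.
it <- itoken(texts, preprocessor = tolower, tokenizer = word_tokenizer,
             progressbar = FALSE)

# Hypothetical stand-in for text2vocab(); an assumption about how the
# arguments could be used, not the implementation in R/text2vocab.R.
sketch_text2vocab <- function(it, ngram = c(1L, 3L), documentMinimum = 10,
                              stopWords = character(0)) {
  # Collect every n-gram in the requested length range, skipping stop words.
  vocab <- create_vocabulary(it, ngram = ngram, stopwords = stopWords)
  # Keep only terms that occur in at least `documentMinimum` documents.
  prune_vocabulary(vocab, doc_count_min = documentMinimum)
}

vocab <- sketch_text2vocab(it, ngram = c(1L, 2L), documentMinimum = 2,
                           stopWords = c("the", "a", "on"))
print(vocab)
```

With a corpus this small, `documentMinimum` has to be lowered well below the default of 10, otherwise every term would be pruned. Passing `stopWords = character(0)` keeps all words, as noted in the argument description.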


John-Poplett/novels documentation built on Jan. 28, 2020, 12:02 a.m.