sample_text: Sample texts from a predefined text source
In nproellochs/textsampler: Text sampling in R


Samples texts from a text source. The input data must be supplied as raw texts or as the name of a predefined dataset.

sample_text(n = 1, source = "yelp_sentences", type = "sentences",
  sub_token = "words", max_length = 50, min_length = 1,
  word_list = NULL, shuffle = T, input = NULL, tbl = T,
  clean = T, ...)

`n`	Number of texts to be sampled. `n` is an integer greater than 0. By default, `n` is set to 1.
`source`	Text source. A character vector, a `data.frame`, or an object of type `Corpus`. Alternatively, a predefined dataset can be loaded by specifying its name as a string. In the latter case, possible values are `imdb_sentences`, `amazon_sentences`, `yelp_sentences` and `english_words`.
`type`	Type of texts to be sampled. Possible values are `texts`, `paragraphs`, `sentences`, `words`, and `characters`.
`sub_token`	A string specifying the text unit for filtering texts by length via `min_length` and `max_length`. Possible values are `texts`, `paragraphs`, `sentences`, `words`, and `characters`.
`max_length`	Maximum length of the texts to be sampled, measured in units of `sub_token`. `max_length` is an integer greater than 0. By default, `max_length` is set to 50.
`min_length`	Minimum length of the texts to be sampled. `min_length` is an integer greater than 0. By default, `min_length` is set to 1.
`word_list`	A word list.
`shuffle`	If `TRUE`, the text samples are returned in random order. Default is `TRUE`.
`input`	A string defining the column name of the raw text data in `source`. The value is ignored if `source` is not of type `data.frame`.
`tbl`	If `TRUE`, the output is returned as a tibble. Default is `TRUE`.
`clean`	If `TRUE`, the texts are cleaned before text sampling. Default is `TRUE`.
`...`	Additional parameters passed on to other functions, e.g. for preprocessing.
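
The arguments above can be combined as in the following minimal sketch. It assumes that the textsampler package is installed; the `remotes::install_github()` call in the comment is one possible way to install it, not something this page states. The data frame `docs` and its column name `review` are hypothetical and serve only as an illustration. Only arguments listed in the usage signature are used, and the shape of the returned object is not asserted here.

```r
# Assumption: textsampler is installed, e.g. via
# remotes::install_github("nproellochs/textsampler").

# Hypothetical user data; the column name "review" is made up for illustration.
docs <- data.frame(
  review = c("The food was great and the staff was friendly.",
             "Service was slow. We waited an hour for our order.",
             "Would not recommend this place to anyone."),
  stringsAsFactors = FALSE
)

# The calls below only run if the package is available.
if (requireNamespace("textsampler", quietly = TRUE)) {
  library(textsampler)

  # Three sentences from the predefined Yelp dataset,
  # each between 5 and 20 words long (sub_token = "words").
  yelp_sample <- sample_text(n = 3, source = "yelp_sentences",
                             type = "sentences", sub_token = "words",
                             min_length = 5, max_length = 20)
  print(yelp_sample)

  # Two whole texts from the data frame; `input` names the text column.
  own_sample <- sample_text(n = 2, source = docs, type = "texts",
                            input = "review")
  print(own_sample)
}
```

Spelling out `TRUE` and `FALSE` instead of `T` and `F` when overriding `shuffle`, `tbl`, or `clean` is the safer choice, because `T` and `F` are ordinary variables in R and can be reassigned.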