View source: R/litsearchr.functions.R
This function extracts n-grams from text.
get_ngrams(
  x,
  n = 2,
  min_freq = 1,
  ngram_quantile = NULL,
  stop_words,
  rm_punctuation = FALSE,
  preserve_chars = c("-", "_"),
  language = "English"
)
x | A character vector from which to extract n-grams.
n | Numeric: the minimum number of terms in an n-gram.
min_freq | Numeric: the minimum number of times an n-gram must occur to be returned.
ngram_quantile | Numeric: what quantile of n-grams should be retained; for example, 0.8 keeps the 80th percentile of n-gram frequencies. The default in the usage is NULL.
stop_words | A character vector of stopwords to ignore.
rm_punctuation | Logical: should punctuation be removed before selecting n-grams?
preserve_chars | A character vector of punctuation marks to be retained if rm_punctuation is TRUE.
language | A string indicating the language to use for removing stopwords.
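As a rough illustration of what a frequency quantile cutoff means, the base R sketch below keeps only the n-grams whose count reaches the 80th percentile of all counts. The frequencies are invented for illustration, and the `>=` comparison is an assumption; it is not taken from the litsearchr source, which may apply the cutoff differently.

```r
# Hypothetical n-gram frequencies, invented for this illustration
freqs <- c(
  "natural selection" = 10,
  "origin species"    = 4,
  "means natural"     = 2,
  "species means"     = 1,
  "selection acts"    = 1
)

# 80th percentile of the frequencies (base R default quantile type 7)
cutoff <- quantile(freqs, probs = 0.8, names = FALSE)

# Assumption: retain n-grams whose frequency is at or above the cutoff
retained <- names(freqs)[freqs >= cutoff]
print(retained)
```

With these made-up counts the cutoff is 5.2, so only "natural selection" is retained.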
A character vector of n-grams.
get_ngrams("On the Origin of Species By Means of Natural Selection")
Loading required namespace: stopwords
8
"Natural Selection"
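To make the general idea of n-gram extraction concrete, here is a minimal base R sketch. It is not the litsearchr implementation: `sketch_ngrams` is a hypothetical helper, and lowercasing the text, splitting on whitespace, and dropping any n-gram that contains a stopword are all assumptions made for illustration.

```r
# Hypothetical helper, not part of litsearchr: extract word n-grams
# from a character vector using base R only.
sketch_ngrams <- function(x, n = 2, min_freq = 1, stop_words = character(0)) {
  x <- x[!is.na(x)]
  grams <- unlist(lapply(x, function(doc) {
    # Lowercase and split on whitespace (assumed tokenization)
    words <- strsplit(tolower(doc), "\\s+")[[1]]
    words <- words[nzchar(words)]
    if (length(words) < n) return(character(0))
    # Slide a window of width n across the words
    starts <- seq_len(length(words) - n + 1)
    windows <- lapply(starts, function(i) words[i:(i + n - 1)])
    # Assumption: discard windows that contain any stopword
    keep <- !vapply(windows, function(w) any(w %in% stop_words), logical(1))
    vapply(windows[keep], paste, character(1), collapse = " ")
  }))
  if (length(grams) == 0) return(character(0))
  # Count occurrences and keep n-grams seen at least min_freq times
  counts <- table(grams)
  names(counts)[as.vector(counts) >= min_freq]
}

title <- "On the Origin of Species By Means of Natural Selection"
print(sketch_ngrams(title, n = 2, stop_words = c("on", "the", "of", "by")))
```

In this sketch, every bigram in the title except "natural selection" contains one of the four supplied stopwords, so that is the only bigram left.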