This is an input to the main phrase ranking function. It is included here because it may be useful as a tokenizer that splits text on arbitrary tokens and punctuation. The default tokenization does not cross sentence boundaries, and line breaks are treated as sentence boundaries for the purpose of tokenization.
candidate_phrases(x, split_words = smart_stop_words(),
  split_punct = basic_punct(), remove_numbers = F)
x
    a character vector.
split_words
    a vector of words to split your texts by. By default, this calls a function that includes generated stop words.
split_punct
    a vector of punctuation to use in splitting your words. By default, this calls a function with basic punctuation.
Always returns a list with one element for each input text, with that text's phrases stored in a character vector. If the input character vector is named, the names are used throughout; otherwise, this function generates sequential document names.
candidate_phrases(test_text)
candidate_phrases(test_text, c("the","and"), c(","," \\."))
candidate_phrases(test_text, NULL, " ")
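To make the splitting idea concrete, here is a minimal base R sketch of tokenizing on stop words and punctuation. It is not the package's implementation: the function name `sketch_candidate_phrases`, the short default word and punctuation lists, the lowercasing step, and the `doc1`, `doc2`, ... naming scheme are all assumptions made for illustration.

```r
# Hypothetical illustration only; the real candidate_phrases() may differ.
sketch_candidate_phrases <- function(x,
                                     split_words = c("the", "and", "of", "is", "a"),
                                     split_punct = c(",", ".", ";", ":", "!", "?")) {
  # Assumed naming scheme: sequential names when the input is unnamed.
  if (is.null(names(x))) names(x) <- paste0("doc", seq_along(x))

  lapply(x, function(text) {
    # Treat each line as its own sentence so phrases never span line breaks.
    lines <- unlist(strsplit(text, "\n", fixed = TRUE))
    out <- character(0)
    for (ln in lines) {
      # Turn every punctuation mark into a boundary marker.
      for (p in split_punct) ln <- gsub(p, " | ", ln, fixed = TRUE)
      words <- unlist(strsplit(tolower(ln), "\\s+"))
      words <- words[nzchar(words)]
      # Stop words act as boundaries as well.
      words[words %in% split_words] <- "|"
      # Join each run of words between boundaries into one phrase.
      is_break <- words == "|"
      group <- cumsum(is_break)
      runs <- split(words[!is_break], group[!is_break])
      out <- c(out, unname(vapply(runs, paste, character(1), collapse = " ")))
    }
    out
  })
}

res <- sketch_candidate_phrases("The cat sat, and the dog barked.")
print(res)
```

Because both stop words and punctuation are replaced by the same boundary marker, passing an empty `split_words` vector with `split_punct = " "` would not reproduce a plain whitespace tokenizer in this sketch; it only mirrors the general splitting behavior described above.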