View source: R/sem_search_corpus.R
sem_search_corpus | R Documentation
Searches a text corpus for specified patterns, with support for parallel processing.
sem_search_corpus(
  tif,
  text_hierarchy = c("doc_id", "paragraph_id", "sentence_id"),
  search,
  context_size = 0,
  is_inline = FALSE,
  highlight = c("<b>", "</b>"),
  cores = 1
)
tif | A data frame or data.table containing the text corpus.
text_hierarchy | A character vector indicating the column(s) by which to group the data.
search | The search pattern or query.
context_size | Numeric, default 0. Specifies the context size, in sentences, around the found patterns.
is_inline | Logical, default FALSE. Indicates if the search should be inline.
highlight | A character vector of length two, default c('<b>', '</b>'). Used to highlight the found patterns in the text.
cores | Numeric, default 1. The number of cores to use for parallel processing.
A data.table with the search results.
tif <- data.frame(doc_id = c('1', '1', '2'),
                  sentence_id = c('1', '2', '1'),
                  text = c("Hello world.",
                           "This is an example.",
                           "This is a party!"))
sem_search_corpus(tif, search = 'This is', text_hierarchy = c('doc_id', 'sentence_id'))
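The page does not show what the call above returns. To make the highlight and context_size arguments more concrete, here is a minimal base-R sketch of the idea they describe: find sentences containing a literal pattern, wrap each match in the highlight markers, and attach neighbouring sentences from the same document. It is an illustration under stated assumptions, not the implementation of sem_search_corpus(). It assumes that rows are ordered by sentence within each document and that context_size counts sentences on each side of a hit. The result columns are hypothetical and may differ from the data.table the function actually returns.

```r
# Illustrative base-R sketch only; NOT the implementation of sem_search_corpus().
tif <- data.frame(doc_id = c('1', '1', '2'),
                  sentence_id = c('1', '2', '1'),
                  text = c("Hello world.",
                           "This is an example.",
                           "This is a party!"),
                  stringsAsFactors = FALSE)

pattern <- "This is"            # literal search string
highlight <- c("<b>", "</b>")   # opening and closing markers
context_size <- 1               # assumed: sentences of context on each side

# Rows whose text contains the literal pattern.
hit_rows <- which(grepl(pattern, tif$text, fixed = TRUE))

# Wrap every literal match in the highlight markers.
marked <- gsub(pattern,
               paste0(highlight[1], pattern, highlight[2]),
               tif$text, fixed = TRUE)

# For each hit, collect neighbouring sentences from the same document.
# Row position stands in for sentence order (an assumption of this sketch).
results <- do.call(rbind, lapply(hit_rows, function(i) {
  same_doc <- which(tif$doc_id == tif$doc_id[i])
  window <- same_doc[abs(same_doc - i) <= context_size]
  data.frame(doc_id = tif$doc_id[i],
             sentence_id = tif$sentence_id[i],
             text = paste(marked[window], collapse = " "),
             stringsAsFactors = FALSE)
}))

print(results)
```

With context_size set to 0 in this sketch, each result row would contain only the matching sentence itself, which corresponds to the documented default.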