View source: R/text_preprocessing.R
extract_entities | R Documentation |
Extracts entities from text and, when a dictionary is supplied, assigns each entity to a semantic category defined by that dictionary.
extract_entities(
  text_data,
  text_column = "abstract",
  dictionary = NULL,
  case_sensitive = FALSE,
  overlap_strategy = c("priority", "all", "longest"),
  sanitize_dict = TRUE
)
text_data | A data frame containing article text data. |
text_column | Name of the column containing text to process. |
dictionary | Combined dictionary or list of dictionaries for entity extraction. |
case_sensitive | Logical. If TRUE, matching is case-sensitive. |
overlap_strategy | How to handle terms that match multiple dictionaries: "priority", "all", or "longest". |
sanitize_dict | Logical. If TRUE, sanitizes the dictionary before extraction. |
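The three overlap_strategy values are not defined further on this page. The sketch below shows one plausible reading in base R, and it is not the package's implementation. The helper resolve_overlap, the candidates data frame, and its priority column are all hypothetical names introduced for illustration. It assumes "priority" keeps the match from the highest-ranked dictionary, "all" keeps every match, and "longest" keeps the match with the longest term.

```r
# Hypothetical helper (not part of the package): pick among candidate
# matches that compete for the same piece of text.
# `candidates` is assumed to have columns term, type, and priority
# (1 = highest-ranked dictionary).
resolve_overlap <- function(candidates,
                            strategy = c("priority", "all", "longest")) {
  # match.arg() returns the first choice when `strategy` is left at its
  # default, so "priority" is the default in this sketch.
  strategy <- match.arg(strategy)
  switch(strategy,
    all      = candidates,
    priority = candidates[which.min(candidates$priority), , drop = FALSE],
    longest  = candidates[which.max(nchar(candidates$term)), , drop = FALSE]
  )
}

# Toy candidates: two dictionary terms that overlap in the same text span.
candidates <- data.frame(
  term     = c("cancer", "breast cancer"),
  type     = c("disease", "neoplasm"),
  priority = c(1L, 2L),
  stringsAsFactors = FALSE
)

by_priority <- resolve_overlap(candidates, "priority")
by_longest  <- resolve_overlap(candidates, "longest")
keep_all    <- resolve_overlap(candidates, "all")
```

If extract_entities resolves its vector default with match.arg(), which is an assumption, leaving overlap_strategy unset would select "priority" in the same way.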
A data frame with extracted entities, their types, and positions.
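To make the shape of such a result concrete, here is a minimal base R sketch of dictionary matching that returns entities, types, and character positions. It is an illustration under stated assumptions, not the function's actual code: the helper find_terms, the dictionary columns term and type, and the output columns doc_id, entity, type, start, and end are all hypothetical, and the real column names may differ.

```r
# Hypothetical helper (not the package's implementation): locate each
# dictionary term in each text and record its type and position.
find_terms <- function(texts, dict, case_sensitive = FALSE) {
  rows <- list()
  for (i in seq_along(texts)) {
    # For case-insensitive matching, lower-case both sides; character
    # positions are unchanged for plain ASCII text.
    haystack <- if (case_sensitive) texts[i] else tolower(texts[i])
    for (j in seq_len(nrow(dict))) {
      needle <- if (case_sensitive) dict$term[j] else tolower(dict$term[j])
      m <- gregexpr(needle, haystack, fixed = TRUE)[[1]]
      if (m[1] == -1) next  # this term does not occur in this text
      rows[[length(rows) + 1]] <- data.frame(
        doc_id = i,
        entity = dict$term[j],
        type   = dict$type[j],
        start  = as.integer(m),
        end    = as.integer(m) + nchar(needle) - 1L,
        stringsAsFactors = FALSE
      )
    }
  }
  if (length(rows) == 0) return(data.frame())
  do.call(rbind, rows)
}

# Toy input mirroring the default text_column = "abstract".
text_data <- data.frame(
  abstract = c("Migraine is linked to low magnesium.",
               "Aspirin may relieve MIGRAINE."),
  stringsAsFactors = FALSE
)
dict <- data.frame(
  term = c("migraine", "magnesium", "aspirin"),
  type = c("disease", "chemical", "drug"),
  stringsAsFactors = FALSE
)

entities        <- find_terms(text_data$abstract, dict)
entities_strict <- find_terms(text_data$abstract, dict, case_sensitive = TRUE)
```

With case_sensitive = FALSE the sketch finds "migraine" in both abstracts despite the differing capitalization. With case_sensitive = TRUE only the lower-case "magnesium" in the first abstract matches.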