sanitize_dictionary: Enhanced sanitize dictionary function

View source: R/text_preprocessing.R

sanitize_dictionary    R Documentation

Description

Sanitizes the terms in a dictionary data frame so that they are valid for entity extraction.

Usage

sanitize_dictionary(
  dictionary,
  term_column = "term",
  type_column = "type",
  validate_types = TRUE,
  verbose = TRUE
)

Arguments

dictionary

A data frame containing dictionary terms.

term_column

The name of the column containing the terms to sanitize.

type_column

The name of the column containing entity types.

validate_types

Logical. If TRUE, validates terms against their claimed type.

verbose

Logical. If TRUE, prints information about the filtering process.

Value

A data frame with sanitized terms.
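
Examples

The call below is a minimal sketch of how the arguments listed above fit together. The toy dictionary, its terms, and its type labels ("disease", "drug", "gene") are made up for illustration and are not taken from the package. The sketch also assumes that sanitize_dictionary is exported from LBDiscover. Which rows are kept or dropped depends on the package's own filtering rules, so no expected output is shown.

```r
# Hypothetical toy dictionary: terms and type labels are illustrative only.
dict <- data.frame(
  term = c("migraine", "fish oil", "the", "p53"),
  type = c("disease", "drug", "disease", "gene"),
  stringsAsFactors = FALSE
)

# Run only if the package is installed, so the sketch does not fail without it.
if (requireNamespace("LBDiscover", quietly = TRUE)) {
  clean <- LBDiscover::sanitize_dictionary(
    dictionary     = dict,
    term_column    = "term",   # column holding the terms
    type_column    = "type",   # column holding the entity types
    validate_types = TRUE,     # check terms against their claimed type
    verbose        = FALSE     # suppress the filtering report
  )

  # Compare row counts to see how many terms were filtered out.
  print(nrow(dict))
  print(nrow(clean))
}
```

If the dictionary uses different column names, pass them through term_column and type_column instead of renaming the data frame.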


LBDiscover documentation built on June 16, 2025, 5:09 p.m.