std_term: Sample inclusion, exclusion and blacklist sets for a MiADE...

std_term    R Documentation

Sample inclusion, exclusion and blacklist sets for a MiADE CDB

Description

Standardises text: converts it to lower case (except for capitalised words or words with unusual capitalisation), adds a space at the start and end, and optionally converts hyphens to spaces, removes stopwords and removes words in parentheses.

Usage

std_term(
  x,
  stopwords = c("the", "of", "by", "with", "to", "into", "and", "or", "both", "at", "as",
    "and/or", "in"),
  hyphens_to_space = FALSE,
  remove_stopwords = FALSE,
  remove_words_in_parentheses = FALSE
)

Arguments

x

text to standardise

stopwords

character vector of words to ignore or remove

hyphens_to_space

whether to convert hyphens to spaces (default is to remove them)

remove_stopwords

whether to remove stopwords

remove_words_in_parentheses

whether to remove words in parentheses (which are usually an alternative form or explanation for other parts of the phrase)

Details

  • exclude_irrelevant_findings: social history (except housing problems and care needs), administrative statuses (except registered disabled) and for concept detection

  • blacklist_vague_findings: vague findings and disorders, intended to be used in the blacklist

  • blacklist_almost_all_except_diseases: almost all findings and vague disorders, intended to be used in the blacklist

Value

Standardised text with a space at the start and end, in lower case (except for capitalised words or words with unusual capitalisation), with the other options applied as requested.
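
Examples

The following is a minimal usage sketch rather than an example taken from the package. It assumes that Rdiagnosislist is installed and exports std_term() with the signature shown under Usage. The input phrases and the shortened stopword list are illustrative only, and no output is shown because the exact results depend on the package's implementation.

```r
# Assumption: Rdiagnosislist is installed and exports std_term().
# The block is skipped entirely if the package is not available.
if (requireNamespace("Rdiagnosislist", quietly = TRUE)) {
  # Illustrative input phrase (not taken from the package documentation)
  term <- "Non-insulin-dependent diabetes mellitus (disorder)"

  # Default settings: per the argument descriptions, hyphens are removed
  # and stopwords and words in parentheses are kept
  default_form <- Rdiagnosislist::std_term(term)

  # Convert hyphens to spaces and drop the words in parentheses
  spaced_form <- Rdiagnosislist::std_term(
    term,
    hyphens_to_space = TRUE,
    remove_words_in_parentheses = TRUE
  )

  # Remove stopwords, using a shortened (hypothetical) stopword list
  no_stopwords <- Rdiagnosislist::std_term(
    "Fracture of the neck of femur",
    stopwords = c("the", "of"),
    remove_stopwords = TRUE
  )

  print(default_form)
  print(spaced_form)
  print(no_stopwords)
}
```

Comparing the three printed values shows how each option changes the standardised form of the same kind of phrase.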


anoopshah/Rdiagnosislist documentation built on Oct. 18, 2024, 9:48 a.m.