extract_entities: Extract Medication Entities From Phrase

View source: R/extract_entities.R

extract_entities    R Documentation

Description

This function searches a phrase for medication dosing entities of interest. It is called within medExtractR and is generally not intended for use outside that function. The phrase argument contains the text to search and corresponds to an individual mention of the drug of interest.

Usage

extract_entities(
  phrase,
  p_start,
  p_stop,
  unit,
  frequency_fun = NULL,
  intaketime_fun = NULL,
  duration_fun = NULL,
  route_fun = NULL,
  strength_sep = NULL,
  ...
)

Arguments

phrase

Text to search.

p_start

Start position of phrase within original text.

p_stop

End position of phrase within original text.

unit

Unit of measurement for medication strength, e.g. ‘mg’.

frequency_fun

Function used to extract frequency.

intaketime_fun

Function used to extract intake time.

duration_fun

Function used to extract duration.

route_fun

Function used to extract route.

strength_sep

Delimiter for contiguous medication strengths.

...

Parameter settings used in extracting frequency and intake time, including additional arguments passed on to the <entity>_fun functions. Use frequency_dict, intaketime_dict, duration_dict, and route_dict to supply custom frequency, intake time, duration, and route dictionaries, respectively.

Details

Various medication dosing entities are extracted within this function including the following:

strength: The amount of drug in a given dosage form (e.g., tablet, capsule).
dose amount: The number of tablets, capsules, etc. taken at a given intake time.
dose strength: The total amount of drug taken at a given intake. This quantity is equivalent to strength x dose amount, and it appears similar to strength when dose amount is absent.
frequency: The number of times per day a dose is taken, e.g., ‘once daily’ or ‘2x/day’.
intaketime: The time period of the day during which a dose is taken, e.g., ‘morning’, ‘lunch’, ‘in the pm’.
duration: How long a patient is on a drug regimen, e.g., ‘2 weeks’, ‘mid-April’, ‘another 3 days’.
route: The administration route of the drug, e.g., ‘by mouth’, ‘IV’, ‘topical’.
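
The relationship between strength, dose amount, and dose strength is simple arithmetic. The following sketch uses base R only; the variable names and values are illustrative assumptions and are not produced by medExtractR.

```r
# Illustrative values only (assumed for this sketch, not package output)
strength_mg <- 25   # amount of drug in one tablet
dose_amt    <- 3    # number of tablets taken at one intake

# Dose strength is the total amount of drug taken at that intake
dose_strength_mg <- strength_mg * dose_amt
dose_strength_mg
```

When a note gives only "75 mg" with no tablet count, the number on its own cannot be told apart from a strength, which is why the two entities look alike when dose amount is absent.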

Note that extraction of the entities drug name, dose change, and time of last dose is not handled by the extract_entities function. Those entities are extracted separately and appended to the extract_entities output within the main medExtractR function. Strength, dose amount, and dose strength are primarily numeric quantities, and are identified using a combination of regular expressions and rule-based approaches. Frequency, intake time, route, and duration, on the other hand, use dictionaries for identification.
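
Dictionary-based identification can be pictured as a literal lookup of known expressions in the phrase. The sketch below is base R only and is not the package's implementation: find_dict_matches is a hypothetical helper, and the "expression;start:stop" format with a stop position one past the last character is an assumption inferred from the sample output in the Value section.

```r
# Hypothetical helper (not part of medExtractR): literal dictionary lookup
find_dict_matches <- function(phrase, dict) {
  # Try longer expressions first so "twice daily" wins over "daily"
  dict <- dict[order(nchar(dict), decreasing = TRUE)]
  covered <- rep(FALSE, nchar(phrase))  # characters already claimed by a match
  found <- character(0)
  for (expr in dict) {
    hits <- gregexpr(expr, tolower(phrase), fixed = TRUE)[[1]]
    if (hits[1] == -1) next             # expression not present
    for (start in hits) {
      stop_excl <- start + nchar(expr)  # assumed convention: exclusive end
      idx <- start:(stop_excl - 1)
      if (any(covered[idx])) next       # skip matches inside a longer one
      covered[idx] <- TRUE
      found <- c(found, paste0(substr(phrase, start, stop_excl - 1),
                               ";", start, ":", stop_excl))
    }
  }
  found
}

find_dict_matches("Lamotrigine 200mg bid", c("bid", "daily", "twice daily"))
```

Sorting the dictionary by length is one simple way to prefer the most specific expression; the package may resolve overlaps differently.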

By default, that is, when an <entity>_fun argument is NULL, the extract_generic function is used to extract that entity. This function can also inherit user-defined entity dictionaries, supplied as arguments <entity>_dict to medExtractR or medExtractR_tapering (see documentation files for main function(s) for details).
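
The fallback from a NULL <entity>_fun to a default extractor follows a common R pattern, sketched below in base R. All function names here are hypothetical stand-ins, and the (phrase, dict) signature is an assumption made for illustration; this page does not state the signature that medExtractR expects of a custom <entity>_fun.

```r
# Hypothetical stand-in for a default extractor: plain substring lookup
default_extractor <- function(phrase, dict) {
  Filter(function(e) grepl(e, tolower(phrase), fixed = TRUE), dict)
}

# Hypothetical dispatcher: use the custom function if given, else the default
run_extractor <- function(phrase, dict, entity_fun = NULL) {
  if (is.null(entity_fun)) entity_fun <- default_extractor
  entity_fun(phrase, dict)
}

# Hypothetical custom function that accepts whole-word matches only
whole_word_extractor <- function(phrase, dict) {
  Filter(function(e) grepl(paste0("\\b", e, "\\b"), phrase, ignore.case = TRUE),
         dict)
}

run_extractor("take bid", c("bid", "id"))                        # default
run_extractor("take bid", c("bid", "id"), whole_word_extractor)  # custom
```

In this sketch the default reports both "bid" and "id" because "id" is a substring of "bid", while the custom function reports only "bid". Swapping in a stricter function is the kind of adjustment the <entity>_fun arguments allow.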

The strength_sep argument is NULL by default, but can be used to identify shorthand for morning and evening doses. For example, consider the phrase “Lamotrigine 300-200” (meaning 300 mg in the morning and 200 mg in the evening). The argument strength_sep = '-' identifies the full expression 300-200 as dose strength in this phrase.
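
The effect of strength_sep can be approximated with a regular expression that matches two numbers joined by the separator. This is a minimal base R sketch under that assumption; find_separated_strengths is a hypothetical helper and does not reproduce the package's own rules.

```r
# Hypothetical helper (not part of medExtractR)
find_separated_strengths <- function(phrase, strength_sep = NULL) {
  # With no separator, nothing is treated as a contiguous strength expression
  if (is.null(strength_sep)) return(character(0))
  num <- "[0-9]+(\\.[0-9]+)?"
  # \Q...\E (PCRE) makes the separator literal, e.g. "-" or "/"
  pattern <- paste0(num, "\\Q", strength_sep, "\\E", num)
  regmatches(phrase, gregexpr(pattern, phrase, perl = TRUE))[[1]]
}

expr <- find_separated_strengths("Lamotrigine 300-200", strength_sep = "-")
expr

# Split the shorthand into its morning and evening components
as.numeric(strsplit(expr, "-", fixed = TRUE)[[1]])
```

With strength_sep left as NULL the helper returns an empty result, mirroring the default in which such shorthand is not treated as a single expression.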

Value

A data.frame with entity information. At least one row per entity is returned, using NA when no expression was found for a given entity.
The “entity” column of the output contains the formatted label for that entity, according to the following mapping.
strength: “Strength”
dose amount: “DoseAmt”
dose strength: “DoseStrength”
frequency: “Frequency”
intake time: “IntakeTime”
duration: “Duration”
route: “Route”
Sample output for the phrase “Lamotrigine 200mg bid” would look like:

entity        expr
IntakeTime    <NA>
Strength      <NA>
DoseAmt       <NA>
Route         <NA>
Duration      <NA>
Frequency     bid;19:22
DoseStrength  200mg;13:18
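
Downstream code often needs the matched text and its positions as separate columns. The sketch below rebuilds the sample output by hand and splits the expr column in base R; the added column names (expr_text, start, stop) are assumptions made for illustration. In this sample the stop position appears to be one past the last matched character, which the final line relies on.

```r
# Sample output rebuilt by hand from the table above
out <- data.frame(
  entity = c("IntakeTime", "Strength", "DoseAmt", "Route",
             "Duration", "Frequency", "DoseStrength"),
  expr = c(NA, NA, NA, NA, NA, "bid;19:22", "200mg;13:18"),
  stringsAsFactors = FALSE
)

# Keep only entities for which an expression was found
found <- out[!is.na(out$expr), ]

# Split "expression;start:stop" into its three parts
found$expr_text <- sub(";[0-9]+:[0-9]+$", "", found$expr)
found$start <- as.integer(sub("^.*;([0-9]+):[0-9]+$", "\\1", found$expr))
found$stop  <- as.integer(sub("^.*;[0-9]+:([0-9]+)$", "\\1", found$expr))
found

# Recover the text from the phrase, treating stop as exclusive (assumption)
recovered <- substr(rep("Lamotrigine 200mg bid", nrow(found)),
                    found$start, found$stop - 1)
recovered
```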

Examples

library(medExtractR)

# The phrase spans the whole note, so it starts at 1 and ends at nchar(note)
note <- "Lamotrigine 25 mg tablet - 3 tablets oral twice daily"
extract_entities(note, 1, nchar(note), "mg")

# A user-defined dictionary can be used instead of the default
my_dictionary <- data.frame(c("daily", "twice daily"))
extract_entities(note, 1, nchar(note), "mg", frequency_dict = my_dictionary)

medExtractR documentation built on June 7, 2022, 1:08 a.m.