pre2dup: Estimate Drug Use Periods from Drug Purchase Data

pre2dup    R Documentation

Estimate Drug Use Periods from Drug Purchase Data

Description

Estimates drug use periods from individual drug purchase data. Hospitalization data can optionally be incorporated. The estimation uses parameters defined at the package level and at the Anatomical Therapeutic Chemical (ATC) classification code level. The function supports individuals with varied purchase patterns, including stockpiling.

Usage

pre2dup(
  pre_data,
  pre_person_id,
  pre_atc,
  pre_package_id,
  pre_date,
  pre_ratio,
  pre_ddd,
  package_parameters,
  pack_atc,
  pack_id,
  pack_ddd_low,
  pack_ddd_usual,
  pack_dur_min,
  pack_dur_usual,
  pack_dur_max,
  atc_parameters,
  atc_class,
  atc_ddd_low,
  atc_ddd_usual,
  atc_dur_min,
  atc_dur_max,
  hosp_data = NULL,
  hosp_person_id = NULL,
  hosp_admission = NULL,
  hosp_discharge = NULL,
  date_range = NULL,
  global_gap_max = 300,
  global_min = 5,
  global_max = 300,
  global_max_single = 150,
  global_ddd_high = 10,
  global_hosp_max = 30,
  days_covered = 5,
  weight_past = 1,
  weight_current = 4,
  weight_next = 1,
  weight_first_last = 5,
  calculate_pack_dur_usual = FALSE,
  post_process_perc = 1
)

Arguments

pre_data

data.frame or data.table containing drug purchases.

pre_person_id

character, name of the column containing person id.

pre_atc

character, name of the column containing ATC code.

pre_package_id

character, name of the column containing package id.

pre_date

character, name of the column containing purchase date.

pre_ratio

character, name of the column containing ratio of packages purchased (e.g., number of packages).

pre_ddd

character, name of the column containing defined daily doses (DDD) of the purchase.

package_parameters

data.frame or data.table containing package parameters.

pack_atc

character, name of the column containing ATC code.

pack_id

character, name of the column containing package id.

pack_ddd_low

character, name of the column containing lower limit of daily DDD.

pack_ddd_usual

character, name of the column containing usual daily DDD.

pack_dur_min

character, name of the column containing minimum duration of the package.

pack_dur_usual

character, name of the column containing usual duration of the package.

pack_dur_max

character, name of the column containing maximum duration of the package.

atc_parameters

data.frame or data.table containing ATC parameters.

atc_class

character, name of the column containing ATC class.

atc_ddd_low

character, name of the column containing lower limit of daily DDD for the ATC class.

atc_ddd_usual

character, name of the column containing usual daily DDD for the ATC class.

atc_dur_min

character, name of the column containing minimum duration for the ATC class.

atc_dur_max

character, name of the column containing maximum duration for the ATC class.

hosp_data

data.frame or data.table containing hospitalizations.

hosp_person_id

character, name of the column containing person id.

hosp_admission

character, name of the column containing admission date.

hosp_discharge

character, name of the column containing discharge date.

date_range

character vector of two dates: the start and end of the purchase data.

global_gap_max

numeric, maximum gap between purchases, default 300.

global_min

numeric, minimum duration of a purchase, default 5.

global_max

numeric, maximum duration of a purchase, default 300.

global_max_single

numeric, maximum duration of a single purchase, default 150.

global_ddd_high

numeric, maximum daily DDD of a purchase for any ATC, default 10.

global_hosp_max

numeric, maximum number of hospital days to be considered when estimating the exposure duration, default 30.

days_covered

numeric, maximum number of days to be added to the exposure duration to cover the gap between purchases, default 5.

weight_past

numeric, weight for the past purchase in sliding average calculation, default 1.

weight_current

numeric, weight for the current purchase in sliding average calculation, default 4.

weight_next

numeric, weight for the next purchase in sliding average calculation, default 1.

weight_first_last

numeric, weight for the first and last purchase in sliding average calculation, default 5.

calculate_pack_dur_usual

TRUE or FALSE, re-calculate usual duration of the package based on the purchase frequency in data, default FALSE.

post_process_perc

numeric, percentage of the data to be used in post-processing, default 1.
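
To make the role of the four weight arguments concrete, here is a minimal sketch of a weighted sliding average over consecutive daily DDD estimates. The helper sliding_average is hypothetical and not part of the package. How pre2dup treats the first and last purchase internally is not documented here, so the handling below (weight_first_last replaces weight_current at both ends) is an assumption.

```r
# Hypothetical helper, not part of pre2dup: weighted sliding average of
# daily DDD estimates across consecutive purchases of one person and ATC.
sliding_average <- function(x, weight_past = 1, weight_current = 4,
                            weight_next = 1, weight_first_last = 5) {
  n <- length(x)
  if (n < 2) return(x)
  out <- numeric(n)
  for (i in seq_len(n)) {
    if (i == 1) {
      # Assumption: the first purchase has no predecessor, so it takes
      # weight_first_last and is averaged with the next purchase only.
      w <- c(weight_first_last, weight_next)
      v <- x[c(1, 2)]
    } else if (i == n) {
      # Assumption: the last purchase is handled symmetrically.
      w <- c(weight_past, weight_first_last)
      v <- x[c(n - 1, n)]
    } else {
      w <- c(weight_past, weight_current, weight_next)
      v <- x[(i - 1):(i + 1)]
    }
    out[i] <- sum(w * v) / sum(w)
  }
  out
}

daily_ddd <- c(1.0, 1.2, 0.8, 1.0)  # toy values
smoothed <- sliding_average(daily_ddd)
```

With the default weights, the current purchase dominates the average, so a single unusual purchase shifts the estimates of its neighbours only slightly.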

Details

Before starting to estimate the drug use periods, the function validates the input data and arguments by checking for missing values and unacceptable duplicates. It will stop execution if such issues are detected, with the following exceptions:

  • Up to 10% of DDD values per ATC class may be missing in the drug purchase data.

  • Up to 10% of package parameter records per ATC class may be missing.

If either threshold is exceeded, the function prompts the user to decide whether to continue. If the user agrees, ATC classes with insufficient data are excluded, and the function proceeds with the remaining data.
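
The share of missing DDD values per ATC class can be inspected before calling pre2dup. The following is a rough sketch in base R on toy data. The column names ATC and amount are illustrative, and the package's own check may differ in detail.

```r
# Toy purchase data; column names and values are illustrative only.
purchases <- data.frame(
  ATC = c("N05AH03", "N05AH03", "N05AH03", "N05AH03", "C10AA05", "C10AA05"),
  amount = c(30, 30, 60, 30, NA, 28)
)

# Share of missing DDD values within each ATC class.
missing_share <- tapply(is.na(purchases$amount), purchases$ATC, mean)

# ATC classes above the 10% threshold described above.
over_threshold <- names(missing_share)[missing_share > 0.10]
```

Here half of the C10AA05 purchases lack a DDD value, so that class would be flagged, while N05AH03 has no missing values.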

There are five available methods for estimating the duration of each purchase, presented in the order of preference:

  • Main method: Based on purchased daily doses (DDDs), temporal average of daily DDDs, and individual purchase patterns.

  • Package DDD method: Based on purchased DDDs and the usual daily DDD for the specific package.

  • Package duration method: Based on the usual duration of the package, considering the proportion of the package purchased.

  • ATC-level DDD method: Based on purchased DDDs and usual daily DDDs at the ATC level.

  • Minimum ATC duration method: Based on the minimum duration defined for the ATC group.
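
The order of preference among the four fallback methods can be illustrated with a simplified sketch. The helper fallback_duration is hypothetical and not the package's implementation. The formulas (purchased DDDs divided by the usual daily DDD, and usual duration multiplied by the purchased share of the package) are assumptions derived from the descriptions above. The main method is omitted because it depends on the individual purchase history.

```r
# Hypothetical helper, not part of pre2dup: return the first available
# estimate, following the order of preference listed above.
fallback_duration <- function(ddd, ratio,
                              pack_ddd_usual = NA, pack_dur_usual = NA,
                              atc_ddd_usual = NA, atc_dur_min = NA) {
  candidates <- c(
    package_ddd      = ddd / pack_ddd_usual,    # assumed: DDDs / usual daily DDD
    package_duration = pack_dur_usual * ratio,  # assumed: usual duration x share purchased
    atc_ddd          = ddd / atc_ddd_usual,     # assumed: DDDs / usual daily DDD of the ATC
    atc_minimum      = atc_dur_min
  )
  usable <- candidates[!is.na(candidates)]
  if (length(usable) == 0) return(NA_real_)
  usable[1]  # named value: the name shows which method was used
}

d1 <- fallback_duration(ddd = 30, ratio = 1, pack_ddd_usual = 1)
d2 <- fallback_duration(ddd = NA, ratio = 0.5, pack_dur_usual = 28)
d3 <- fallback_duration(ddd = 30, ratio = 1, atc_ddd_usual = 1.5)
```

Each method is used only when the information needed by the preceding methods is missing, which is why d2 falls through to the package duration and d3 to the ATC-level DDD.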

Periods that are close in time can be joined in a post-processing step controlled by post_process_perc. The post-processing percentage is reduced by 0.1 at each estimation round to prevent very long calculation times for large datasets.

In addition to estimating drug use periods, the function can also calculate common package durations from the purchase data. These calculated durations can be used to verify and adjust the usual duration parameters of packages. After making corrections, re-run the function to recalculate drug use periods using the updated package parameters.
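
One plausible way to derive a common package duration from purchase frequency is to take the most frequent interval between consecutive purchases of the same package. The sketch below uses base R and toy data. It assumes one whole package per purchase and is not necessarily how the package computes pack_info.

```r
# Toy purchase data; column names and values are illustrative only.
purchases <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2),
  vnr = c("A", "A", "A", "A", "A", "A", "A"),
  purchase_date = as.Date(c("2025-01-01", "2025-01-31", "2025-03-02",
                            "2025-04-20", "2025-02-01", "2025-03-03",
                            "2025-04-02"))
)
purchases <- purchases[order(purchases$id, purchases$vnr,
                             purchases$purchase_date), ]

# Days between consecutive purchases of the same package by the same person.
gaps <- ave(as.numeric(purchases$purchase_date),
            purchases$id, purchases$vnr,
            FUN = function(d) c(NA, diff(d)))

# Most frequent interval per package, ignoring each person's first purchase.
common_gap <- tapply(gaps, purchases$vnr, function(g) {
  g <- g[!is.na(g)]
  as.numeric(names(which.max(table(g))))
})
```

A large difference between such an interval and the pack_dur_usual value supplied in package_parameters suggests that the parameter may need review.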

Value

A list of two elements. The main element is periods: a data.table with one row per drug use period, including person, ATC, period start and end dates, duration, number of purchases, and total DDD. If calculate_pack_dur_usual = TRUE, an additional element pack_info contains updated package parameter information.

See Also

Each data type has its own check function. pre2dup runs these checks internally, but validating the data before running the function is recommended for faster and easier error detection and handling.

check_purchases, check_hospitalizations, check_package_parameters, check_atc_parameters

Examples

period_data <- pre2dup(
  pre_data = purchases_example, pre_person_id = "id",
  pre_atc = "ATC", pre_package_id = "vnr", pre_date = "purchase_date",
  pre_ratio = "n_packages", pre_ddd = "amount",
  package_parameters = package_parameters_example,
  pack_atc = "ATC", pack_id = "vnr", pack_ddd_low = "lower_ddd",
  pack_ddd_usual = "usual_ddd", pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_dur", pack_dur_max = "maximum_dur",
  atc_parameters = ATC_parameters, atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc", atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc", atc_dur_max = "maximum_dur_atc",
  hosp_data = hospitalizations_example, hosp_person_id = "id",
  hosp_admission = "hospital_start", hosp_discharge = "hospital_end",
  date_range = c("2025-01-01", "2025-12-31"),
  global_gap_max = 300, global_min = 5, global_max = 300,
  global_max_single = 150, global_ddd_high = 10,
  global_hosp_max = 30, weight_past = 1, weight_current = 4,
  weight_next = 1, weight_first_last = 5,
  calculate_pack_dur_usual = TRUE,
  days_covered = 5,
  post_process_perc = 1
)

period_data$periods


piavat/PRE2DUP-R documentation built on June 11, 2025, 11:42 a.m.