This vignette provides a brief introduction to the PRE2DUPR package, which constructs treatment periods from drug purchase data with the PRE2DUP algorithm. The package includes functions for validating data and for running PRE2DUP.

Installation

To install the PRE2DUPR package, you can use the following command in R:

install.packages("devtools")
devtools::install_github("piavat/PRE2DUP-R")

To use the PRE2DUPR package, you can start by loading it into your R session:

library(PRE2DUPR)

Data

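The PRE2DUPR package comes with example datasets that you can use to test the functionality of the PRE2DUP algorithm. The datasets used in this vignette are purchases_example, hospitalizations_example, package_parameters_example, and ATC_parameters.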

All data types have associated functions to validate the input before running pre2dup. These functions are called internally by the program, so you don't need to run them manually unless you want to check your data beforehand.

Running these checks in advance is recommended, because it makes errors easier to detect and correct. Note that the internal checks in pre2dup display only the first five rows with detected errors. When a check function is run separately, all rows with issues can be listed by setting print_all = TRUE.

Drug purchases data

Drug purchases are records with information about the purchase of drugs, including the person who made the purchase, the drug's ATC code, the package ID, the date of purchase, the number of packages purchased, and the amount in DDDs (Defined Daily Doses).
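
As a minimal sketch of this layout, a purchases table with the column names used in this vignette's examples could be built in base R as shown below. The values are invented for illustration and are not taken from purchases_example. The hand-written checks are hypothetical and are not the checks performed by check_purchases; the column types the package requires are not stated here, so treat the types as an assumption too.

```r
# Hypothetical purchase records; values are illustrative only.
purchases_sketch <- data.frame(
  id            = c(1, 1, 2),
  ATC           = c("N05AH02", "N05AH02", "N05AH04"),
  vnr           = c(30627, 30627, 41738),
  purchase_date = as.Date(c("2025-01-01", "2025-02-03", "2025-01-15")),
  n_packages    = c(1, 1, 2),
  amount        = c(33.33, 33.33, 150),
  stringsAsFactors = FALSE
)

# Basic sanity checks one might do by hand (NOT the package's own checks)
stopifnot(
  !anyNA(purchases_sketch),
  inherits(purchases_sketch$purchase_date, "Date"),
  all(purchases_sketch$n_packages > 0),
  all(purchases_sketch$amount > 0)
)

str(purchases_sketch)
```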

Data validation

Function check_purchases checks the data before running the PRE2DUP algorithm. It ensures that the dataset meets the necessary requirements for the algorithm to function correctly.

check_purchases(dt = purchases_example, 
                pre_person_id = "id",
                pre_atc = "ATC",
                pre_package_id = "vnr",
                pre_date = "purchase_date",
                pre_ratio = "n_packages",
                pre_ddd = "amount",
                print_all = TRUE)

Checks passed for ‘purchases_example’

Hospitalizations data

Hospitalizations are records of hospital admissions, including the person ID, admission date, and discharge date. This data is used to assess the impact of hospitalizations on drug exposure periods.
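
The sketch below shows one way such records could be laid out and inspected in base R, using the column names from this vignette's examples. The rows are invented, and the check that discharge does not precede admission is an illustrative assumption rather than a description of what check_hospitalizations does.

```r
# Hypothetical hospitalization records; values are illustrative only.
hosp_sketch <- data.frame(
  id             = c(2, 5),
  hospital_start = as.Date(c("2025-02-01", "2025-03-01")),
  hospital_end   = as.Date(c("2025-02-11", "2025-03-04"))
)

# Rows where discharge precedes admission (none expected here)
bad_rows <- which(hosp_sketch$hospital_end < hosp_sketch$hospital_start)

# Length of each stay in days
hosp_sketch$stay_days <- as.numeric(
  difftime(hosp_sketch$hospital_end, hosp_sketch$hospital_start, units = "days")
)

hosp_sketch
```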

Data validation

Function check_hospitalizations checks the data before running the PRE2DUP algorithm.

check_hospitalizations(dt = hospitalizations_example,
                       hosp_person_id = "id",
                       hosp_admission = "hospital_start",
                       hosp_discharge = "hospital_end",
                       print_all = TRUE)

Checks passed for ‘hospitalizations_example’

Package parameters

Package parameters are used to define the characteristics of drug packages. The parameter file specifies the identifying number, ATC code, and the minimum, usual, and maximum duration of a package, as well as the usual and minimum dose in defined daily doses (DDDs).

Instructions on how to create package parameters are given in the Package Parameters tutorial.
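
To make the relationship between doses and durations concrete, the sketch below takes the two package rows printed later in this vignette (vnr 30627 and 41738) and divides the DDDs per package by the usual and lower daily doses. For these two rows the results agree, up to rounding, with the printed usual_dur and maximum_dur values. This is an observation about the example rows, not a rule stated by the package; the column names ending in _calc are hypothetical.

```r
# Values copied from the two package rows printed later in this vignette
pack_sketch <- data.frame(
  vnr          = c(30627, 41738),
  ddd_per_pack = c(33.33333, 75),
  lower_ddd    = c(0.3333, 0.375),
  usual_ddd    = c(1.00, 0.75)
)

# Days a package lasts at the usual dose and at the lower dose
pack_sketch$usual_dur_calc   <- pack_sketch$ddd_per_pack / pack_sketch$usual_ddd
pack_sketch$maximum_dur_calc <- pack_sketch$ddd_per_pack / pack_sketch$lower_ddd

round(pack_sketch, 2)
```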

Data validation

Function check_package_parameters checks the data before running the PRE2DUP algorithm.

check_package_parameters(dt = package_parameters_example, 
                         pack_atc = "ATC",
                         pack_id = "vnr",
                         pack_ddd_low = "lower_ddd", 
                         pack_ddd_usual = "usual_ddd",
                         pack_dur_min = "minimum_dur",
                         pack_dur_usual = "usual_dur", 
                         pack_dur_max = "maximum_dur",
                         print_all = FALSE)

Checks passed for ‘package_parameters_example’

ATC parameters

ATC parameters are used to define the characteristics of ATC codes when package-specific information is not available. The ATC parameters file specifies the partial or full ATC code, the lower limit of daily dose, the usual daily dose, and the minimum and maximum allowed treatment durations. Package example data ATC_parameters can be used as such or as an example of how to create your own ATC code characteristics dataset.
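
One way to think about partial ATC codes is as prefixes of full codes. The sketch below picks, for a given full code, the parameter row with the longest matching prefix. Both the rows and the longest-prefix rule are assumptions made for illustration; this text does not describe how pre2dup resolves partial codes internally, and match_atc is a hypothetical helper.

```r
# Hypothetical ATC parameter rows; codes and values are illustrative only
atc_sketch <- data.frame(
  partial_atc   = c("N05A", "N05AH", "N05AH02"),
  usual_ddd_atc = c(1.0, 0.8, 0.9),
  stringsAsFactors = FALSE
)

# Return the row whose partial_atc is the longest prefix of `atc`,
# or NULL when no row matches (assumed rule, see the text above)
match_atc <- function(atc, params) {
  hits <- params[startsWith(atc, params$partial_atc), , drop = FALSE]
  if (nrow(hits) == 0) return(NULL)
  hits[which.max(nchar(hits$partial_atc)), , drop = FALSE]
}

match_atc("N05AH04", atc_sketch)  # matches the "N05AH" row
match_atc("N05AH02", atc_sketch)  # matches the full code "N05AH02"
match_atc("A10BA02", atc_sketch)  # no match, returns NULL
```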

Data validation

Function check_atc_parameters checks the ATC parameters data before running the PRE2DUP algorithm.

check_atc_parameters(dt = ATC_parameters,
                     atc_class = "partial_atc",
                     atc_ddd_low = "lower_ddd_atc",
                     atc_ddd_usual = "usual_ddd_atc", 
                     atc_dur_min = "minimum_dur_atc",
                     atc_dur_max = "maximum_dur_atc",
                     print_all = TRUE)

Checks passed for ‘ATC_parameters’.

Running the PRE2DUP

The PRE2DUP algorithm for creation of drug use periods is run using the pre2dup function. This function will process your drug purchase data, hospitalizations, package parameters, and ATC parameters to estimate drug exposure.

pre2dup(
  pre_data = purchases_example,
  pre_person_id = "id",
  pre_atc = "ATC",
  pre_package_id = "vnr",
  pre_date = "purchase_date",
  pre_ratio = "n_packages",
  pre_ddd = "amount",
  package_parameters = package_parameters_example,
  pack_atc = "ATC",
  pack_id = "vnr",
  pack_ddd_low = "lower_ddd",
  pack_ddd_usual ="usual_ddd",
  pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_dur",
  pack_dur_max = "maximum_dur",
  atc_parameters = ATC_parameters,
  atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc",
  atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc",
  atc_dur_max = "maximum_dur_atc",
  hosp_data = hospitalizations_example,
  hosp_person_id = "id",
  hosp_admission = "hospital_start",
  hosp_discharge = "hospital_end",
  date_range = c("2025-01-01", "2025-12-31"),
  global_gap_max = 300,
  global_min = 5,
  global_max = 300,
  global_max_single = 150,
  global_ddd_high = 10,
  global_hosp_max = 30,
  weight_past = 1,
  weight_current = 4,
  weight_next = 1,
  weight_first_last = 5,
  calculate_pack_dur_usual = TRUE,
  days_covered = 5,
  post_process_perc = 1)

Step 1/6: Checking parameters and datasets...
Checks passed for ‘pre_data’
Checks passed for ‘package_parameters’
Checks passed for ‘atc_parameters’.
Checks passed for ‘hosp_data’
Step 2/6: Calculating purchase durations...
Step 3/6: Stockpiling assessment...
Step 4/6: Calculating common package durations in data...
Refill lengths couldn’t be re-estimated, probably due to too small data size.
Step 5/6: Preparing drug use periods...
Step 6/6: Post-processing drug use periods...
Current post processing percentage: 1
Drug use periods calculated. 7 periods created for 5 persons.
$periods
   period     id     ATC  dup_start    dup_end dup_days dup_hospital_days dup_n_purchases dup_last_purchase dup_total_DDD dup_temporal_average_DDDs
    <int> <fctr>  <char>     <Date>     <Date>    <num>             <num>           <int>            <Date>         <num>                     <num>
1:      1      1 N05AH02 2025-01-01 2025-04-14      104                 0               3        2025-03-08         99.99                     0.961
2:      2      2 N05AH02 2025-01-15 2025-04-28      104                 5               3        2025-03-22         99.99                     0.961
3:      3      3 N05AH02 2025-02-01 2025-05-15      104                 0               3        2025-04-08         99.99                     0.961
4:      4      3 N05AH04 2025-01-05 2025-08-26      233                 0               2        2025-04-15        200.00                     0.858
5:      5      4 N05AH02 2025-01-10 2025-04-23      104                 0               3        2025-03-17         99.99                     0.961
6:      6      4 N05AH04 2025-01-20 2025-09-10      233                 0               2        2025-04-30        200.00                     0.858
7:      7      5 N05AH04 2025-01-01 2025-08-22      233                38               2        2025-04-11        200.00                     0.858

$pack_info
NULL
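
When reading the periods table, it may help to note that in the rows printed above the dup_temporal_average_DDDs values agree with dup_total_DDD divided by dup_days, rounded to three decimals. The sketch below reproduces this for two of the printed rows. It is an observation from this output, not a documented definition of the column; avg_ddd is a hypothetical name.

```r
# dup_days and dup_total_DDD copied from periods 1 and 4 in the output above
periods_sketch <- data.frame(
  dup_days      = c(104, 233),
  dup_total_DDD = c(99.99, 200)
)

# Average daily dose over the period, in DDDs per day
periods_sketch$avg_ddd <- round(
  periods_sketch$dup_total_DDD / periods_sketch$dup_days, 3
)

periods_sketch$avg_ddd  # 0.961 0.858
```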

Workflow when using estimated usual package durations from data

The pre2dup function has an option to estimate the usual package durations from the data. This is useful when you want to derive package durations from the actual purchase patterns in your dataset. Set calculate_pack_dur_usual = TRUE, run the program, and update the package parameters based on the common durations found in the data.

# Make the data large enough for common durations to be calculated
id <- sort(rep(1:5, each = 20))
vnr <- rep(c(rep(30627, 10), rep(41738, 10)), 5)
atc <- rep(c(rep("N05AH02", 10), rep("N05AH04", 10)), 5)
d40 <- as.Date("2020-01-01")  + 40*1:10
d120 <- as.Date("2022-01-01")  + 120*1:10
dates <- rep(c(d40, d120), 5)
ddds <- rep(c(rep(33, 10), rep(80, 10)), 5)
ratio <- rep(1, 100)
purchases_data <- data.frame(id, vnr, atc, dates, ddds, ratio)

# This example uses the function's default values; change them to fit your needs.
outdata <- pre2dup(
  pre_data = purchases_data,
  pre_person_id = "id",
  pre_atc = "atc",
  pre_package_id = "vnr",
  pre_date = "dates",
  pre_ratio = "ratio",
  pre_ddd = "ddds",
  package_parameters = package_parameters_example,
  pack_atc = "ATC",
  pack_id = "vnr",
  pack_ddd_low = "lower_ddd",
  pack_ddd_usual ="usual_ddd",
  pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_dur",
  pack_dur_max = "maximum_dur",
  atc_parameters = ATC_parameters,
  atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc",
  atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc",
  atc_dur_max = "maximum_dur_atc",
  calculate_pack_dur_usual = TRUE
)

Step 1/6: Checking parameters and datasets...
Checks passed for ‘pre_data’
Checks passed for ‘package_parameters’
Checks passed for ‘atc_parameters’.
Step 2/6: Calculating purchase durations...
Step 3/6: Stockpiling assessment...
Step 4/6: Calculating common package durations in data...
Step 5/6: Preparing drug use periods...
Step 6/6: Post-processing drug use periods...
Current post processing percentage: 1
Drug use periods calculated. 10 periods created for 5 persons.

# Save updated package parameters
updated_params <- outdata$pack_info
# Check new common durations
updated_params[!is.na(updated_params$common_duration), ]
    vnr     ATC product_name strength strength_num packagesize packsize_num drug_form_harmonized ddd_per_pack minimum_dur usual_dur maximum_dur lower_ddd usual_ddd common_duration usual_duration_new
6 30627 N05AH02      LEPONEX    100MG          100         100          100               TABLET     33.33333          25     33.33         100    0.3333      1.00              40                 40
8 41738 N05AH04    KETIPINOR    300MG          300      100FOL          100               TABLET     75.00000          50    100.00         200    0.3750      0.75             120                120
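
The common durations of 40 and 120 days match the spacing between purchases in the simulated data. As a rough illustration of why, the sketch below computes the most frequent gap between consecutive purchase dates. The helper most_common_gap is hypothetical and is not the estimation method used inside pre2dup, which this vignette does not describe.

```r
# Same purchase dates as in the simulated data above
d40  <- as.Date("2020-01-01") + 40 * 1:10
d120 <- as.Date("2022-01-01") + 120 * 1:10

# Most frequent gap (in days) between consecutive purchases
most_common_gap <- function(dates) {
  gaps <- as.numeric(diff(sort(dates)))
  as.numeric(names(which.max(table(gaps))))
}

most_common_gap(d40)   # 40
most_common_gap(d120)  # 120
```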

# Add a column for the new usual duration: here the common duration estimated
# from the data is used where available, otherwise the original usual duration
# is kept. Which value to use is your choice.
updated_params$usual_duration_new <- ifelse(
  !is.na(updated_params$common_duration),
  updated_params$common_duration,
  updated_params$usual_dur
)

# Run PRE2DUP with the updated package parameters
outdata <- pre2dup(
  pre_data = purchases_data,
  pre_person_id = "id",
  pre_atc = "atc",
  pre_package_id = "vnr",
  pre_date = "dates",
  pre_ratio = "ratio",
  pre_ddd = "ddds",
  package_parameters = updated_params,
  pack_atc = "ATC",
  pack_id = "vnr",
  pack_ddd_low = "lower_ddd",
  pack_ddd_usual ="usual_ddd",
  pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_duration_new", # This is the new column
  pack_dur_max = "maximum_dur",
  atc_parameters = ATC_parameters,
  atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc",
  atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc",
  atc_dur_max = "maximum_dur_atc",
  calculate_pack_dur_usual = FALSE # Not needed anymore, as the package parameters have been updated
)
Step 1/6: Checking parameters and datasets...
Checks passed for ‘pre_data’
Checks passed for ‘package_parameters’
Checks passed for ‘atc_parameters’.
Step 2/6: Calculating purchase durations...
Step 3/6: Stockpiling assessment...
Step 4/6: Common package duration calculation was not selected in the parameters; skipping this step.
Step 5/6: Preparing drug use periods...
Step 6/6: Post-processing drug use periods...
Current post processing percentage: 1
Drug use periods calculated. 10 periods created for 5 persons.

# The final output
final_periods <- outdata$periods
final_periods
   period     id     atc  dup_start    dup_end dup_days dup_hospital_days dup_n_purchases dup_last_purchase dup_total_DDD dup_temporal_average_DDDs
     <int> <fctr>  <char>     <Date>     <Date>    <num>             <num>           <int>            <Date>         <num>                     <num>
 1:      1      1 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
 2:      2      1 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
 3:      3      2 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
 4:      4      2 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
 5:      5      3 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
 6:      6      3 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
 7:      7      4 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
 8:      8      4 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
 9:      9      5 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
10:     10      5 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
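
The two pre2dup calls in this workflow differ only in the package parameter table, the usual duration column, and the calculate_pack_dur_usual flag. One possible way to avoid repeating the long argument list is to keep the arguments in a list and override only what changes. The sketch below shows the pattern with a stand-in function, run_stub, so that it runs without the package; run_stub is hypothetical, and only a few of the arguments are shown.

```r
# Stand-in for pre2dup so the sketch runs without the package;
# it only echoes two of the arguments it receives.
run_stub <- function(...) {
  args <- list(...)
  args[c("pack_dur_usual", "calculate_pack_dur_usual")]
}

# Arguments shared by both runs (shortened for illustration)
base_args <- list(
  pre_person_id = "id",
  pre_atc = "atc",
  pack_dur_usual = "usual_dur",
  calculate_pack_dur_usual = TRUE
)

# Second run: override only what changes
second_args <- modifyList(
  base_args,
  list(pack_dur_usual = "usual_duration_new",
       calculate_pack_dur_usual = FALSE)
)

first  <- do.call(run_stub, base_args)
second <- do.call(run_stub, second_args)
second$pack_dur_usual  # "usual_duration_new"
```

With the real function, do.call(pre2dup, second_args) would take the place of the stub call, assuming the list holds the full set of arguments.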

For any questions or support, feel free to reach out to the package maintainers.


piavat/PRE2DUP-R documentation built on June 11, 2025, 11:42 a.m.