This vignette provides a brief introduction to the PRE2DUPR package, which is designed to construct treatment periods from drug purchase data with the PRE2DUP algorithm. The package includes functions for validating data and running PRE2DUP.
To install the PRE2DUPR package, you can use the following commands in R:
install.packages("devtools")
devtools::install_github("piavat/PRE2DUP-R")
To use the PRE2DUPR
package, you can start by loading it into your R session:
library(PRE2DUPR)
The PRE2DUPR package comes with example datasets that you can use to test the functionality of the PRE2DUP algorithm. The datasets include:
purchases_example: A dataset containing drug purchase records.
hospitalizations_example: A dataset containing hospital admission records.
package_parameters_example: A dataset containing package characteristics.
ATC_parameters_example: A dataset containing ATC code characteristics.
All data types have associated functions to validate the input before running pre2dup.
These functions are called internally by the program, so you don't need to run them manually unless you want to check your data beforehand.
Running these checks in advance is nonetheless recommended, because it makes errors easier to detect and correct. Note that the internal checks in pre2dup display only the first five rows with detected errors. When the check functions are run separately, all rows with issues can be listed by setting print_all = TRUE.
Drug purchases are records with information about the purchase of drugs, including the person who made the purchase, the drug's ATC code, the package ID, the date of purchase, the number of packages purchased, and the amount in DDDs (Defined Daily Doses).
Function check_purchases
checks the data before running the PRE2DUP algorithm. It ensures that the dataset meets the necessary requirements for the algorithm to function correctly.
check_purchases(
  dt = purchases_example,
  pre_person_id = "id",
  pre_atc = "ATC",
  pre_package_id = "vnr",
  pre_date = "purchase_date",
  pre_ratio = "n_packages",
  pre_ddd = "amount",
  print_all = TRUE
)
Checks passed for ‘purchases_example’
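Before calling check_purchases on your own data, it can help to look at the expected shape of a purchase table. The following is a minimal base-R sketch, assuming the same column names as in purchases_example. The toy rows are hypothetical, and the three checks are illustrative only; they are not a description of what check_purchases tests internally.

```r
# Hypothetical toy data; column names mirror those used with purchases_example.
toy_purchases <- data.frame(
  id = c(1, 1, 2),
  ATC = c("N05AH02", "N05AH02", "N05AH04"),
  vnr = c(30627, 30627, 41738),
  purchase_date = as.Date(c("2025-01-01", "2025-02-03", "2025-01-05")),
  n_packages = c(1, 1, 2),
  amount = c(33.33, 33.33, 150),
  stringsAsFactors = FALSE
)

# Illustrative sanity checks (assumptions, not the package's own rules):
required <- c("id", "ATC", "vnr", "purchase_date", "n_packages", "amount")

# 1. Are all expected columns present?
missing_cols <- setdiff(required, names(toy_purchases))

# 2. Are there missing values in any of them?
has_na <- any(is.na(toy_purchases[required]))

# 3. Which rows have a non-positive package count or DDD amount?
bad_amount <- which(toy_purchases$n_packages <= 0 | toy_purchases$amount <= 0)
```

If any of these flag a problem, fixing it before running the package's own checks keeps the later error listings short.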
Hospitalizations are records of hospital admissions, including the person ID, admission date, and discharge date. This data is used to assess the impact of hospitalizations on drug exposure periods.
Function check_hospitalizations
checks the data before running the PRE2DUP algorithm.
check_hospitalizations(
  dt = hospitalizations_example,
  hosp_person_id = "id",
  hosp_admission = "hospital_start",
  hosp_discharge = "hospital_end",
  print_all = TRUE
)
Checks passed for ‘hospitalizations_example’
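A hospitalization table needs only a person ID and two dates per stay. The sketch below builds a hypothetical table with the column names used in hospitalizations_example and computes the length of each stay in base R. The date-order check is an illustrative assumption about what a sensible pre-check looks like, not a statement of what check_hospitalizations does.

```r
# Hypothetical hospital stays; column names follow hospitalizations_example.
toy_hosp <- data.frame(
  id = c(2, 5),
  hospital_start = as.Date(c("2025-02-01", "2025-03-10")),
  hospital_end = as.Date(c("2025-02-04", "2025-03-24"))
)

# Discharge should not precede admission.
date_order_ok <- toy_hosp$hospital_end >= toy_hosp$hospital_start

# Length of stay in days (difference between discharge and admission).
toy_hosp$stay_days <- as.numeric(toy_hosp$hospital_end - toy_hosp$hospital_start)
```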
Package parameters are used to define the characteristics of drug packages. The parameter file specifies the identifying number, ATC code, and the minimum, usual, and maximum duration of a package, as well as the usual and minimum dose in defined daily doses (DDDs).
Instructions on how to create package parameters are given in the Package Parameters tutorial.
Function check_package_parameters
checks the data before running the PRE2DUP algorithm.
check_package_parameters(
  dt = package_parameters_example,
  pack_atc = "ATC",
  pack_id = "vnr",
  pack_ddd_low = "lower_ddd",
  pack_ddd_usual = "usual_ddd",
  pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_dur",
  pack_dur_max = "maximum_dur",
  print_all = FALSE
)
Checks passed for ‘package_parameters_example’
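The duration and dose columns of a package parameter file are related. For the two packages printed later in this vignette (vnr 30627 and 41738), the usual and maximum durations are consistent with dividing the DDDs per package by the usual and lower daily dose, respectively. The sketch below reproduces that arithmetic in base R. Whether the example parameters were actually derived this way is an assumption, and the column names ending in _est are hypothetical.

```r
# Values copied from the package parameter rows shown later in this vignette.
pack <- data.frame(
  vnr = c(30627, 41738),
  ddd_per_pack = c(100 / 3, 75),
  lower_ddd = c(0.3333, 0.375),
  usual_ddd = c(1, 0.75)
)

# Assumed relation: duration (days) = DDDs in the package / daily dose (DDD per day).
pack$usual_dur_est <- pack$ddd_per_pack / pack$usual_ddd
pack$maximum_dur_est <- pack$ddd_per_pack / pack$lower_ddd

round(pack[, c("vnr", "usual_dur_est", "maximum_dur_est")], 2)
```

A lower daily dose stretches the same package over more days, which is why the lower dose limit pairs with the maximum duration.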
ATC parameters are used to define the characteristics of ATC codes when package-specific information is not available. The ATC parameters file specifies the partial or full ATC code, the lower limit of daily dose, the usual daily dose, and the minimum and maximum allowed treatment durations. The example dataset ATC_parameters included in the package can be used as is or as a template for creating your own ATC code characteristics dataset.
Function check_atc_parameters
checks the ATC parameters data before running the PRE2DUP algorithm.
check_atc_parameters(
  dt = ATC_parameters,
  atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc",
  atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc",
  atc_dur_max = "maximum_dur_atc",
  print_all = TRUE
)
Checks passed for ‘ATC_parameters’.
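Because the ATC parameters may hold partial codes, one purchase can match several rows. The sketch below shows one plausible way to resolve this in base R: treat each partial code as a prefix and pick the longest match. This vignette does not state how pre2dup resolves partial codes, so the prefix rule, the helper match_atc, and the parameter rows are all assumptions made for illustration.

```r
# Hypothetical parameter rows; only the column name partial_atc comes from the vignette.
atc_params <- data.frame(
  partial_atc = c("N05A", "N05AH", "N05AH02"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the longest partial code that is a prefix of `code`.
match_atc <- function(code, partial) {
  hits <- partial[startsWith(code, partial)]
  if (length(hits) == 0) {
    return(NA_character_)
  }
  hits[which.max(nchar(hits))]
}

match_atc("N05AH02", atc_params$partial_atc)  # full code is listed
match_atc("N05AH04", atc_params$partial_atc)  # falls back to a shorter prefix
match_atc("A10BA02", atc_params$partial_atc)  # no matching row
```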
The PRE2DUP algorithm for creation of drug use periods is run using the pre2dup
function. This function will process your drug purchase data, hospitalizations, package parameters, and ATC parameters to estimate drug exposure.
pre2dup(
  pre_data = purchases_example,
  pre_person_id = "id",
  pre_atc = "ATC",
  pre_package_id = "vnr",
  pre_date = "purchase_date",
  pre_ratio = "n_packages",
  pre_ddd = "amount",
  package_parameters = package_parameters_example,
  pack_atc = "ATC",
  pack_id = "vnr",
  pack_ddd_low = "lower_ddd",
  pack_ddd_usual = "usual_ddd",
  pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_dur",
  pack_dur_max = "maximum_dur",
  atc_parameters = ATC_parameters,
  atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc",
  atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc",
  atc_dur_max = "maximum_dur_atc",
  hosp_data = hospitalizations_example,
  hosp_person_id = "id",
  hosp_admission = "hospital_start",
  hosp_discharge = "hospital_end",
  date_range = c("2025-01-01", "2025-12-31"),
  global_gap_max = 300,
  global_min = 5,
  global_max = 300,
  global_max_single = 150,
  global_ddd_high = 10,
  global_hosp_max = 30,
  weight_past = 1,
  weight_current = 4,
  weight_next = 1,
  weight_first_last = 5,
  calculate_pack_dur_usual = TRUE,
  days_covered = 5,
  post_process_perc = 1
)
Step 1/6: Checking parameters and datasets...
Checks passed for ‘pre_data’
Checks passed for ‘package_parameters’
Checks passed for ‘atc_parameters’.
Checks passed for ‘hosp_data’
Step 2/6: Calculating purchase durations...
Step 3/6: Stockpiling assessment...
Step 4/6: Calculating common package durations in data...
Refill lengths couldn’t be re-estimated, probably due to too small data size.
Step 5/6: Preparing drug use periods...
Step 6/6: Post-processing drug use periods...
Current post processing percentage: 1
Drug use periods calculated.
7 periods created for 5 persons.
$periods
   period     id     ATC  dup_start    dup_end dup_days dup_hospital_days dup_n_purchases dup_last_purchase dup_total_DDD dup_temporal_average_DDDs
    <int> <fctr>  <char>     <Date>     <Date>    <num>             <num>           <int>            <Date>         <num>                     <num>
1:      1      1 N05AH02 2025-01-01 2025-04-14      104                 0               3        2025-03-08         99.99                     0.961
2:      2      2 N05AH02 2025-01-15 2025-04-28      104                 5               3        2025-03-22         99.99                     0.961
3:      3      3 N05AH02 2025-02-01 2025-05-15      104                 0               3        2025-04-08         99.99                     0.961
4:      4      3 N05AH04 2025-01-05 2025-08-26      233                 0               2        2025-04-15        200.00                     0.858
5:      5      4 N05AH02 2025-01-10 2025-04-23      104                 0               3        2025-03-17         99.99                     0.961
6:      6      4 N05AH04 2025-01-20 2025-09-10      233                 0               2        2025-04-30        200.00                     0.858
7:      7      5 N05AH04 2025-01-01 2025-08-22      233                38               2        2025-04-11        200.00                     0.858

$pack_info
NULL
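The dose column in this output can be cross-checked by hand. In the rows shown, dup_temporal_average_DDDs is consistent with dividing dup_total_DDD by dup_days and rounding to three decimals. The sketch below redoes that arithmetic for periods 1 and 4 in base R. It is a check against the printed values, not a description of how pre2dup computes the column internally.

```r
# Values copied from periods 1 and 4 of the output above.
periods <- data.frame(
  period = c(1, 4),
  dup_days = c(104, 233),
  dup_total_DDD = c(99.99, 200)
)

# Assumed relation: average daily dose = total DDDs / length of the period in days.
periods$avg_ddd <- round(periods$dup_total_DDD / periods$dup_days, 3)
periods$avg_ddd
```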
The pre2dup
function has an option to estimate the usual package durations from the data. This is useful when you want to derive package durations based on the actual purchase patterns in your dataset.
Set calculate_pack_dur_usual = TRUE, run the program, and then update the package parameters based on the common durations found in the data.
# Make the data large enough for common durations to be calculated
id <- sort(rep(1:5, each = 20))
vnr <- rep(c(rep(30627, 10), rep(41738, 10)), 5)
atc <- rep(c(rep("N05AH02", 10), rep("N05AH04", 10)), 5)
d40 <- as.Date("2020-01-01") + 40 * 1:10
d120 <- as.Date("2022-01-01") + 120 * 1:10
dates <- rep(c(d40, d120), 5)
ddds <- rep(c(rep(33, 10), rep(80, 10)), 5)
ratio <- rep(1, 100)
purchases_data <- data.frame(id, vnr, atc, dates, ddds, ratio)

# This example uses the function's default values; they can be changed to fit your needs.
outdata <- pre2dup(
  pre_data = purchases_data,
  pre_person_id = "id",
  pre_atc = "atc",
  pre_package_id = "vnr",
  pre_date = "dates",
  pre_ratio = "ratio",
  pre_ddd = "ddds",
  package_parameters = package_parameters_example,
  pack_atc = "ATC",
  pack_id = "vnr",
  pack_ddd_low = "lower_ddd",
  pack_ddd_usual = "usual_ddd",
  pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_dur",
  pack_dur_max = "maximum_dur",
  atc_parameters = ATC_parameters,
  atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc",
  atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc",
  atc_dur_max = "maximum_dur_atc",
  calculate_pack_dur_usual = TRUE
)
Step 1/6: Checking parameters and datasets...
Checks passed for ‘pre_data’
Checks passed for ‘package_parameters’
Checks passed for ‘atc_parameters’.
Step 2/6: Calculating purchase durations...
Step 3/6: Stockpiling assessment...
Step 4/6: Calculating common package durations in data...
Step 5/6: Preparing drug use periods...
Step 6/6: Post-processing drug use periods...
Current post processing percentage: 1
Drug use periods calculated.
10 periods created for 5 persons.
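The synthetic purchases above are spaced exactly 40 days apart for package 30627 and 120 days apart for package 41738, so those are the refill intervals one would expect a data-driven estimate to pick up. The base-R sketch below confirms the spacing by taking the most frequent gap between consecutive purchase dates. The helper most_common_gap is hypothetical and deliberately simple; it is not the estimator pre2dup uses in Step 4.

```r
# Same purchase dates as in the synthetic data above.
d40 <- as.Date("2020-01-01") + 40 * 1:10
d120 <- as.Date("2022-01-01") + 120 * 1:10

# Hypothetical helper: the most frequent gap (in days) between consecutive purchases.
most_common_gap <- function(dates) {
  gaps <- as.numeric(diff(sort(dates)))
  as.numeric(names(which.max(table(gaps))))
}

gap_30627 <- most_common_gap(d40)
gap_41738 <- most_common_gap(d120)
```

With real data the gaps vary from refill to refill, which is why a larger dataset is needed before a common duration can be estimated.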
# Save updated package parameters
updated_params <- outdata$pack_info

# Check new common durations
updated_params[!is.na(updated_params$common_duration), ]
    vnr     ATC product_name strength strength_num packagesize packsize_num drug_form_harmonized
6 30627 N05AH02      LEPONEX    100MG          100         100          100               TABLET
8 41738 N05AH04    KETIPINOR    300MG          300      100FOL          100               TABLET
  ddd_per_pack minimum_dur usual_dur maximum_dur lower_ddd usual_ddd common_duration usual_duration_new
6     33.33333          25     33.33         100    0.3333      1.00              40                 40
8     75.00000          50    100.00         200    0.3750      0.75             120                120

# Create a new usual duration column: use the common duration from the data
# where it is available, otherwise keep the original usual duration
updated_params$usual_duration_new <- ifelse(
  !is.na(updated_params$common_duration),
  updated_params$common_duration,
  updated_params$usual_dur
)

# Run PRE2DUP with the updated package parameters
outdata <- pre2dup(
  pre_data = purchases_data,
  pre_person_id = "id",
  pre_atc = "atc",
  pre_package_id = "vnr",
  pre_date = "dates",
  pre_ratio = "ratio",
  pre_ddd = "ddds",
  package_parameters = updated_params,
  pack_atc = "ATC",
  pack_id = "vnr",
  pack_ddd_low = "lower_ddd",
  pack_ddd_usual = "usual_ddd",
  pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_duration_new", # This is the new column
  pack_dur_max = "maximum_dur",
  atc_parameters = ATC_parameters,
  atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc",
  atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc",
  atc_dur_max = "maximum_dur_atc",
  calculate_pack_dur_usual = FALSE # Not needed anymore, as the package parameters are already updated
)
Step 1/6: Checking parameters and datasets...
Checks passed for ‘pre_data’
Checks passed for ‘package_parameters’
Checks passed for ‘atc_parameters’.
Step 2/6: Calculating purchase durations...
Step 3/6: Stockpiling assessment...
Step 4/6: Common package duration calculation was not selected in the parameters; skipping this step.
Step 5/6: Preparing drug use periods...
Step 6/6: Post-processing drug use periods...
Current post processing percentage: 1
Drug use periods calculated.
10 periods created for 5 persons.

# The final output
final_periods <- outdata$periods
final_periods
    period     id     atc  dup_start    dup_end dup_days dup_hospital_days dup_n_purchases dup_last_purchase dup_total_DDD dup_temporal_average_DDDs
     <int> <fctr>  <char>     <Date>     <Date>    <num>             <num>           <int>            <Date>         <num>                     <num>
 1:      1      1 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
 2:      2      1 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
 3:      3      2 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
 4:      4      2 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
 5:      5      3 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
 6:      6      3 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
 7:      7      4 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
 8:      8      4 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
 9:      9      5 N05AH02 2020-02-10 2021-03-23      408                 0              10        2021-02-04           330                     0.809
10:     10      5 N05AH04 2022-05-01 2025-09-05     1224                 0              10        2025-04-15           800                     0.654
For any questions or support, feel free to reach out to the package maintainers
...