knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The DrugUtilisation package includes a range of functions that add drug-related information of subjects in OMOP CDM tables and cohort tables. Essentially, there are two functionalities: add
and summarise
. While the first return patient-level information on drug usage, the second returns aggregate estimates of it. In this vignette, we will explore these functions and provide some examples for its usage.
For this vignette we will use mock data contained in the DrugUtilisation package. This mock dataset contains cohorts, we will take "cohort1" as the cohort table of interest from which we want to study drug usage of acetaminophen.
library(DrugUtilisation) cdm <- mockDrugUtilisation(numberIndividual = 200) cdm$cohort1 |> dplyr::glimpse()
Since we want to characterise acetaminophen and simvastatin usage for subjects in cohort1, we first have to get the codelist with CodelistGenerator:
drugConcepts <- CodelistGenerator::getDrugIngredientCodes(cdm = cdm, name = c("acetaminophen", "simvastatin"))
With the function addNumberExposures()
we can get how many exposures to acetaminophen each patient in our cohort had during a certain time. There are 2 thing to keep in mind when using this function:
Time period of interest: The indexDate
and censorDate
arguments refer to the time-period in which we are interested to compute the number of exposure to acetaminophen. The refer to date columns in the cohort table, and by default this are "cohort_start_date" and "cohort_end_date" respectively.
Incident or prevalent events? Do we want to consider only those exposures to the drug of interest starting during the time-period (restrictIncident = TRUE
), or do we also want to take into account those that started before but underwent for at least some time during the follow-up period considered (restrictIncident = FALSE
)?
In what follows we add a column in the cohort table, with the number of incident exposures during the time patients are in the cohort:
cohort <- addNumberExposures( cohort = cdm$cohort1, # cohort with the population of interest conceptSet = drugConcepts, # concepts of the drugs of interest indexDate = "cohort_start_date", censorDate = "cohort_end_date", restrictIncident = TRUE, nameStyle = "number_exposures_{concept_name}", name = NULL ) cohort |> dplyr::glimpse()
This function works like the previous one, but calculates the number of eras instead of exposures. The difference between these two is given by the gapEra
argument: consecutive drug exposures separated by less than the days specified in gapEra
, are collapsed together into the same era.
Next we compute the number of eras, considering a gap of 3 days.
Additionally, we use the argument nameStyle
so the new columns are only identified by the concept name, instead of using the prefix "number_eras_" set by default.
cohort <- addNumberEras( cohort = cdm$cohort1, # cohort with the population of interest conceptSet = drugConcepts, # concepts of the drugs of interest indexDate = "cohort_start_date", censorDate = "cohort_end_date", gapEra = 3, restrictIncident = TRUE, nameStyle = "{concept_name}", name = NULL ) cohort |> dplyr::glimpse()
This argument set to TRUE will add a column specifying the time in days a person has been exposed to the drug of interest. Take note that gapEra
and restrictIncident
will be taken into account for this calculation:
1) Drug eras: exposed time will be based on drug eras according to gapEra
.
2) Incident exposures: if restrictIncident = TRUE
, exposed time will consider only those drug exposures starting after indexDate, while if restrictIncident = FALSE
, exposures that started before indexDate and ended afterwards will also be taken into account.
The subfunction to get only this information is addDaysExposed()
.
Similarly to the previous one, this argument adds a column with the number of days the individual is prescribed with the drug of interest, if set to TRUE. This number is calculated by adding up the days for all prescriptions that contribute to the analysis. In this case, restrictIncident
will influence the calculation as follows: if set to TRUE, drug prescriptions will only be counted if happening after index date; if FALSE, all prescriptions will contribute to the sum.
The subfunction to get only this information is addDaysPrescribed()
.
If set to TRUE, a column will be added that shows the number of days until the first exposure occurring within the considered time window. Notice that the value of restrictIncident
will be taken into account: if TRUE, the time to the first incident exposure during the time interval is measured; otherwise, exposures that start before the indexDate
and end afterwards will be considered (in these cases, time to exposure is 0).
The subfunction to get only this information is addTimeToExposure()
.
This argument will add a column with information on the number of days of the first prescription of the drug. If restrictIncident = TRUE
, this first drug exposure record after index date will be selected. Otherwise, the first record ever will be the one contributing this number.
The subfunction to get only this information is addInitialExposureDuration()
.
These, if TRUE, will add a column each specifying which was the initial quantity prescribed at the start of the first exposure considered (initialQuantity
), and the cumulative quantity taken throughout the exposures in the considered time-window (cumulativeQuantity
).
Quantities are measured at conceptSet level not ingredient. Notice that for both measures restrictIncident
is considered, while gapEra
is used for the cumulative quantity
.
The subfunctions to get this information are addInitialQuantity()
and addCumulativeQuantity()
respectively.
If initialDailyDose
is TRUE, a column will be add specifying for each of the ingredients in a conceptSet which was the initial daily dose given. The cumulativeDose
will measure for each ingredient the total dose taken throughout the exposures considered in the time-window. Recall that restrictIncident
is considered in these calculations, and that the cumulative dose also considers gapEra
.
The subfunctions to get this information are addInitialDailyDose()
and addCumulativeDose()
respectively.
All the explained add
functions are subfunctions of the more comprehensive addDrugUtilisation()
. This broader function computes multiple drug utilization metrics.
addDrugUtilisation( cohort, indexDate = "cohort_start_date", censorDate = "cohort_end_date", ingredientConceptId = NULL, conceptSet = NULL, restrictIncident = TRUE, gapEra = 1, numberExposures = TRUE, numberEras = TRUE, daysExposed = TRUE, daysPrescribed = TRUE, timeToExposure = TRUE, initialExposureDuration = TRUE, initialQuantity = TRUE, cumulativeQuantity = TRUE, initialDailyDose = TRUE, cumulativeDose = TRUE, nameStyle = "{value}_{concept_name}_{ingredient}", name = NULL )
Using addDrugUtilisation()
is recommended when multiple parameters are needed, as it is more computationally efficient than chaining the different subfunctions.
If conceptSet
is NULL, it will be produced from descendants of given ingredients.
nameStyle
argument allows customisation of the names of the new columns added by the function, following the glue package style.
By default it returns a temporal table, but if name
is not NULL a permanent table with the defined name will be computed in the database.
In what follows we create a permanent table "drug_utilisation_example" in the database with the information on dosage and quantity of the ingredients 1125315 (acetaminophen) and 1503297 (metformin). We are interested in exposures happening from cohort end date, until the end of the patient's observation data. Additionally, we define an exposure era using a gap of 7 days, and we only consider incident exposures during that time.
cdm$drug_utilisation_example <- cdm$cohort1 |> # add end of current observation date with the package PatientProfiels PatientProfiles::addFutureObservation(futureObservationType = "date") |> # add the targeted drug utilisation measures addDrugUtilisation( indexDate = "cohort_end_date", censorDate = "future_observation", ingredientConceptId = c(1125315, 1503297), conceptSet = NULL, restrictIncident = TRUE, gapEra = 7, numberExposures = FALSE, numberEras = FALSE, daysExposed = FALSE, daysPrescribed = FALSE, timeToExposure = FALSE, initialExposureDuration = FALSE, initialQuantity = TRUE, cumulativeQuantity = TRUE, initialDailyDose = TRUE, cumulativeDose = TRUE, nameStyle = "{value}_{concept_name}_{ingredient}", name = "drug_utilisation_example" ) cdm$drug_utilisation_example |> dplyr::glimpse()
The information given by addDrugUtilisation()
or its sub-functions is at patient level. If we are interested in aggregated estimates for these measure we can use summariseDrugUtilisation()
.
This function will provide the desired estimates (set in the argument estimates
) of the targeted drug utilisation measures. Similar to addDrugUtilisation()
, by setting TRUE or FALSE each of the drug utilisation measures, the user can choose which measures to obtain.
duResults <- summariseDrugUtilisation( cohort = cdm$cohort1, strata = list(), estimates = c( "q25", "median", "q75", "mean", "sd", "count_missing", "percentage_missing" ), indexDate = "cohort_start_date", censorDate = "cohort_end_date", ingredientConceptId = c(1125315, 1503297), conceptSet = NULL, restrictIncident = TRUE, gapEra = 7, numberExposures = TRUE, numberEras = TRUE, daysExposed = TRUE, daysPrescribed = TRUE, timeToExposure = TRUE, initialExposureDuration = TRUE, initialQuantity = TRUE, cumulativeQuantity = TRUE, initialDailyDose = TRUE, cumulativeDose = TRUE ) duResults |> dplyr::glimpse()
As seen below, the result of this function is a summarised_result
object. For more information on these class of objects see omopgenerics
package.
Additionally, the strata
argument will provide the estimates for different stratifications defined by columns in the cohort. For instance, we can add a column indicating the sex, and another indicating if the subject is older than 50, and use those to stratify by sex and age, together and separately as follows:
duResults <- cdm$cohort1 |> # add age and sex PatientProfiles::addDemographics( age = TRUE, ageGroup = list("<=50" = c(0, 50), ">50" = c(51, 150)), sex = TRUE, priorObservation = FALSE, futureObservation = FALSE ) |> # drug utilisation summariseDrugUtilisation( strata = list("age_group", "sex", c("age_group", "sex")), estimates = c("mean", "sd", "count_missing", "percentage_missing"), indexDate = "cohort_start_date", censorDate = "cohort_end_date", ingredientConceptId = c(1125315, 1503297), conceptSet = NULL, restrictIncident = TRUE, gapEra = 7, numberExposures = TRUE, numberEras = TRUE, daysExposed = TRUE, daysPrescribed = TRUE, timeToExposure = TRUE, initialExposureDuration = TRUE, initialQuantity = TRUE, cumulativeQuantity = TRUE, initialDailyDose = TRUE, cumulativeDose = TRUE ) duResults |> dplyr::glimpse()
The estimates obtained in this last part correspond to the mean (mean
) and standard deviation (sd
) of those that had information on dose and quantity, and the number (count_missing
) (and percentage (percentage_missing
)) of subjects with missing information.
Results from summariseDrugUtilisation()
can be nicely visualised in a tabular format using the function tableDrugUtilisation()
.
tableDrugUtilisation(duResults)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.