Creating an OCCDS ADaM

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(admiraldev)

Introduction

This article describes creating an OCCDS ADaM. Examples are currently presented and tested in the context of ADAE. However, the examples could be applied to other OCCDS ADaMs such as ADCM, ADMH, ADDV, etc.

Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.

Programming Workflow

Read in Data {#readdata}

To start, all data frames needed for the creation of ADAE should be read into the environment. This will be a company-specific process. Some of the data frames needed may be AE and ADSL.

For example purposes, the CDISC Pilot SDTM and ADaM datasets, which are included in {pharmaversesdtm}, are used.

library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(pharmaversesdtm)
library(lubridate)

data("ae")
data("admiral_adsl")

ae <- convert_blanks_to_na(ae)
adsl <- admiral_adsl
ae <- filter(ae, USUBJID %in% c("01-701-1015", "01-701-1023", "01-703-1086", "01-703-1096", "01-707-1037", "01-716-1024"))

At this step, it may be useful to join ADSL to your AE domain as well. Only the ADSL variables used for derivations are selected at this step. The rest of the relevant ADSL variables would be added later.

adsl_vars <- exprs(TRTSDT, TRTEDT, TRT01A, TRT01P, DTHDT, EOSDT)

adae <- derive_vars_merged(
  ae,
  dataset_add = adsl,
  new_vars = adsl_vars,
  by_vars = exprs(STUDYID, USUBJID)
)
dataset_vignette(
  adae,
  display_vars = exprs(
    USUBJID, AESEQ, AETERM, AESTDTC, TRTSDT,
    TRTEDT, TRT01A, TRT01P, DTHDT, EOSDT
  )
)

Derive/Impute End and Start Analysis Date/time and Relative Day {#datetime}

This part derives ASTDTM, ASTDT, ASTDY, AENDTM, AENDT, and AENDY. The function derive_vars_dtm() can be used to derive ASTDTM and AENDTM where ASTDTM could be company-specific. ASTDT and AENDT can be derived from ASTDTM and AENDTM, respectively, using function derive_vars_dtm_to_dt(). derive_vars_dy() can be used to create ASTDY and AENDY.

adae <- adae %>%
  derive_vars_dtm(
    dtc = AESTDTC,
    new_vars_prefix = "AST",
    highest_imputation = "M",
    min_dates = exprs(TRTSDT)
  ) %>%
  derive_vars_dtm(
    dtc = AEENDTC,
    new_vars_prefix = "AEN",
    highest_imputation = "M",
    date_imputation = "last",
    time_imputation = "last",
    max_dates = exprs(DTHDT, EOSDT)
  ) %>%
  derive_vars_dtm_to_dt(exprs(ASTDTM, AENDTM)) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ASTDT, AENDT)
  )
dataset_vignette(
  adae,
  display_vars = exprs(
    USUBJID, AESTDTC, AEENDTC, ASTDTM, ASTDT,
    ASTDY, AENDTM, AENDT, AENDY
  )
)

See also Date and Time Imputation.
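As a rough illustration of what the relative day variables hold, the base R sketch below applies the usual CDISC convention of having no day 0: the reference date is day 1 and the day before it is day -1. The helper `rel_day()` is hypothetical and not part of {admiral}; it is only assumed to mirror what derive_vars_dy() is expected to return for complete dates.

```r
# Hypothetical helper (not an {admiral} function): relative day without a day 0.
rel_day <- function(date, ref_date) {
  diff_days <- as.integer(date - ref_date)
  # On or after the reference date, add 1 so the reference date itself is day 1.
  ifelse(diff_days >= 0L, diff_days + 1L, diff_days)
}

trtsdt <- as.Date("2014-01-02")
astdt <- as.Date(c("2014-01-01", "2014-01-02", "2014-01-10", NA))

# Day before treatment start, treatment start, eight days later, missing date.
rel_day(astdt, trtsdt)
```

For these four dates the sketch gives -1, 1, 9 and NA, so a missing analysis date simply leads to a missing relative day.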

Derive Durations {#duration}

The function derive_vars_duration() can be used to create the variables ADURN and ADURU.

adae <- adae %>%
  derive_vars_duration(
    new_var = ADURN,
    new_var_unit = ADURU,
    start_date = ASTDT,
    end_date = AENDT
  )
dataset_vignette(
  adae,
  display_vars = exprs(
    USUBJID, AESTDTC, AEENDTC, ASTDT, AENDT,
    ADURN, ADURU
  )
)
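The arithmetic behind ADURN can be sketched in base R. This assumes the duration is requested in days and that one day is added so that the start and end day both count, which is understood here to be the default behaviour; check the function documentation for your {admiral} version. The dates are made up for illustration.

```r
# Made-up start and end dates for two events.
astdt <- as.Date(c("2014-01-03", "2014-01-03"))
aendt <- as.Date(c("2014-01-03", "2014-01-09"))

# Assumed convention: both the start day and the end day count, hence "+ 1".
adurn <- as.numeric(aendt - astdt) + 1
adurn
```

Under this convention an event that starts and ends on the same day has a duration of 1, and the second event has a duration of 7.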

Derive ATC variables {#atc}

The function derive_vars_atc() can be used to derive ATC Class Variables.

It helps to add Anatomical Therapeutic Chemical class variables from FACM to ADCM.

The expected result is the input dataset with ATC variables added.

cm <- tibble::tribble(
  ~USUBJID,       ~CMGRPID, ~CMREFID,  ~CMDECOD,
  "BP40257-1001",     "14", "1192056", "PARACETAMOL",
  "BP40257-1001",     "18", "2007001", "SOLUMEDROL",
  "BP40257-1002",     "19", "2791596", "SPIRONOLACTONE"
)
facm <- tibble::tribble(
  ~USUBJID,       ~FAGRPID,  ~FAREFID, ~FATESTCD,  ~FASTRESC,
  "BP40257-1001",      "1", "1192056", "CMATC1CD",       "N",
  "BP40257-1001",      "1", "1192056", "CMATC2CD",     "N02",
  "BP40257-1001",      "1", "1192056", "CMATC3CD",    "N02B",
  "BP40257-1001",      "1", "1192056", "CMATC4CD",   "N02BE",
  "BP40257-1001",      "1", "2007001", "CMATC1CD",       "D",
  "BP40257-1001",      "1", "2007001", "CMATC2CD",     "D10",
  "BP40257-1001",      "1", "2007001", "CMATC3CD",    "D10A",
  "BP40257-1001",      "1", "2007001", "CMATC4CD",   "D10AA",
  "BP40257-1001",      "2", "2007001", "CMATC1CD",       "D",
  "BP40257-1001",      "2", "2007001", "CMATC2CD",     "D07",
  "BP40257-1001",      "2", "2007001", "CMATC3CD",    "D07A",
  "BP40257-1001",      "2", "2007001", "CMATC4CD",   "D07AA",
  "BP40257-1001",      "3", "2007001", "CMATC1CD",       "H",
  "BP40257-1001",      "3", "2007001", "CMATC2CD",     "H02",
  "BP40257-1001",      "3", "2007001", "CMATC3CD",    "H02A",
  "BP40257-1001",      "3", "2007001", "CMATC4CD",   "H02AB",
  "BP40257-1002",      "1", "2791596", "CMATC1CD",       "C",
  "BP40257-1002",      "1", "2791596", "CMATC2CD",     "C03",
  "BP40257-1002",      "1", "2791596", "CMATC3CD",    "C03D",
  "BP40257-1002",      "1", "2791596", "CMATC4CD",   "C03DA"
)

derive_vars_atc(cm, facm)

Derive Planned and Actual Treatment {#trtpa}

TRTA and TRTP must match at least one value of the character treatment variables in ADSL (e.g., TRTxxA/TRTxxP, TRTSEQA/TRTSEQP, TRxxAGy/TRxxPGy).

An example of a simple implementation for a study without periods could be:

adae <- mutate(adae, TRTP = TRT01P, TRTA = TRT01A)

count(adae, TRTP, TRTA, TRT01P, TRT01A)

For studies with periods, see the "Visit and Period Variables" vignette.

Derive Date/Date-time of Last Dose {#last_dose}

The function derive_vars_joined() can be used to derive the last dose date before the start of the event.

data(ex_single)
ex_single <- derive_vars_dtm(
  ex_single,
  dtc = EXSTDTC,
  new_vars_prefix = "EXST",
  flag_imputation = "none"
)

adae <- derive_vars_joined(
  adae,
  ex_single,
  by_vars = exprs(STUDYID, USUBJID),
  new_vars = exprs(LDOSEDTM = EXSTDTM),
  join_vars = exprs(EXSTDTM),
  order = exprs(EXSTDTM),
  filter_add = (EXDOSE > 0 | (EXDOSE == 0 & grepl("PLACEBO", EXTRT))) & !is.na(EXSTDTM),
  filter_join = EXSTDTM <= ASTDTM,
  mode = "last"
)
dataset_vignette(
  adae,
  display_vars = exprs(
    USUBJID, AEDECOD, AESEQ, AESTDTC, AEENDTC,
    ASTDT, AENDT, LDOSEDTM
  )
)
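The selection made by filter_join and mode = "last" above can be pictured with a small base R sketch: for each event, keep the doses on or before the event start and take the latest one. The helper `last_dose_before()` and the dates are hypothetical, dates are used instead of datetimes for brevity, and the dose filter from filter_add is left out.

```r
# Made-up dosing dates and event start dates for a single subject.
doses <- as.Date(c("2021-01-01", "2021-01-15", "2021-02-01"))
events <- as.Date(c("2020-12-30", "2021-01-15", "2021-01-20"))

# Hypothetical helper (not an {admiral} function): latest dose on or before the event.
last_dose_before <- function(event, doses) {
  eligible <- doses[!is.na(doses) & doses <= event]
  if (length(eligible) == 0) as.Date(NA) else max(eligible)
}

# Apply the helper per event and combine the results back into a Date vector.
ldosedt <- do.call(c, lapply(events, last_dose_before, doses = doses))
ldosedt
```

The first event starts before any dose and therefore gets a missing last dose date; the other two events both pick up the dose on 2021-01-15.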

Derive Severity, Causality, and Toxicity Grade {#severity}

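The variables ASEV, AREL, and ATOXGR can be added using simple dplyr::mutate() assignments, if no imputation is required. The example below assigns ASEV and AREL only; ATOXGR could be assigned in the same way from AETOXGR where toxicity grades are collected.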

adae <- adae %>%
  mutate(
    ASEV = AESEV,
    AREL = AEREL
  )

Derive Treatment Emergent Flag {#trtflag}

To derive the treatment emergent flag TRTEMFL, one can call derive_var_trtemfl(). In the example below, end_window = 30 is used, so events starting more than 30 days after the end of treatment are not flagged.

adae <- adae %>%
  derive_var_trtemfl(
    trt_start_date = TRTSDT,
    trt_end_date = TRTEDT,
    end_window = 30
  )
dataset_vignette(
  adae,
  display_vars = exprs(
    USUBJID, TRTSDT, TRTEDT, AESTDTC, ASTDT,
    TRTEMFL
  )
)

To derive the on-treatment flag (ONTRTFL) in an ADaM dataset with a single occurrence date, we use derive_var_ontrtfl().

The expected result is the input dataset with an additional column named ONTRTFL with a value of "Y" or NA.

If you also want to check an end date, you can add the end_date argument. Note that in this scenario you could set span_period = TRUE if you want occurrences that started prior to drug intake, and were ongoing or ended after this time, to be considered as on-treatment.

bds1 <- tibble::tribble(
  ~USUBJID, ~ADT,              ~TRTSDT,           ~TRTEDT,
  "P01",    ymd("2020-02-24"), ymd("2020-01-01"), ymd("2020-03-01"),
  "P02",    ymd("2020-01-01"), ymd("2020-01-01"), ymd("2020-03-01"),
  "P03",    ymd("2019-12-31"), ymd("2020-01-01"), ymd("2020-03-01")
)
derive_var_ontrtfl(
  bds1,
  start_date = ADT,
  ref_start_date = TRTSDT,
  ref_end_date = TRTEDT
)

bds2 <- tibble::tribble(
  ~USUBJID, ~ADT,              ~TRTSDT,           ~TRTEDT,
  "P01",    ymd("2020-07-01"), ymd("2020-01-01"), ymd("2020-03-01"),
  "P02",    ymd("2020-04-30"), ymd("2020-01-01"), ymd("2020-03-01"),
  "P03",    ymd("2020-03-15"), ymd("2020-01-01"), ymd("2020-03-01")
)
derive_var_ontrtfl(
  bds2,
  start_date = ADT,
  ref_start_date = TRTSDT,
  ref_end_date = TRTEDT,
  ref_end_window = 60
)

bds3 <- tibble::tribble(
  ~ADTM,              ~TRTSDTM,           ~TRTEDTM,           ~TPT,
  "2020-01-02T12:00", "2020-01-01T12:00", "2020-03-01T12:00", NA,
  "2020-01-01T12:00", "2020-01-01T12:00", "2020-03-01T12:00", "PRE",
  "2019-12-31T12:00", "2020-01-01T12:00", "2020-03-01T12:00", NA
) %>%
  mutate(
    ADTM = ymd_hm(ADTM),
    TRTSDTM = ymd_hm(TRTSDTM),
    TRTEDTM = ymd_hm(TRTEDTM)
  )
derive_var_ontrtfl(
  bds3,
  start_date = ADTM,
  ref_start_date = TRTSDTM,
  ref_end_date = TRTEDTM,
  filter_pre_timepoint = TPT == "PRE"
)
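To make the effect of ref_end_window concrete, the base R sketch below re-creates the date comparison for the second example (bds2) by hand. It assumes that both ends of the window are inclusive; it is an illustration of the idea, not the function's actual implementation.

```r
# Dates taken from the bds2 example above.
adt <- as.Date(c("2020-07-01", "2020-04-30", "2020-03-15"))
trtsdt <- as.Date("2020-01-01")
trtedt <- as.Date("2020-03-01")
ref_end_window <- 60

# Assumed rule: on treatment if TRTSDT <= ADT <= TRTEDT + window (inclusive).
on_trt <- !is.na(adt) & adt >= trtsdt & adt <= trtedt + ref_end_window
ontrtfl <- ifelse(on_trt, "Y", NA_character_)
ontrtfl
```

Here TRTEDT plus 60 days is 2020-04-30, so the record dated 2020-04-30 falls just inside the window, while the record dated 2020-07-01 does not.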

Derive Occurrence Flags {#occflag}

The function derive_var_extreme_flag() can help derive variables such as AOCCIFL, AOCCPIFL, AOCCSIFL, and AOCCzzFL.

If grades were collected, the following can be used to flag first occurrence of maximum toxicity grade.

adae <- adae %>%
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = exprs(USUBJID),
      order = exprs(desc(ATOXGR), ASTDTM, AESEQ),
      new_var = AOCCIFL,
      mode = "first"
    ),
    filter = TRTEMFL == "Y"
  )

Similarly, ASEV can also be used to derive the occurrence flags, if severity is collected. In this case, the variable will need to be recoded to a numeric variable. Flag first occurrence of most severe adverse event:

adae <- adae %>%
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = exprs(USUBJID),
      order = exprs(
        as.integer(factor(
          ASEV,
          levels = c("DEATH THREATENING", "SEVERE", "MODERATE", "MILD")
        )),
        ASTDTM, AESEQ
      ),
      new_var = AOCCIFL,
      mode = "first"
    ),
    filter = TRTEMFL == "Y"
  )
dataset_vignette(
  adae,
  display_vars = exprs(
    USUBJID, ASTDTM, ASEV, AESEQ, TRTEMFL, AOCCIFL
  )
)
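The recoding used in the order argument above can be looked at in isolation. The base R sketch below, with made-up severity values, shows how listing the levels from most to least severe makes the most severe value receive the smallest integer, so that an ascending sort with mode = "first" puts it first.

```r
# Made-up severities for one subject, including a missing value.
asev <- c("MILD", "SEVERE", "MODERATE", NA, "SEVERE")

# Levels listed from most to least severe, as in the example above.
sev_levels <- c("DEATH THREATENING", "SEVERE", "MODERATE", "MILD")
sev_rank <- as.integer(factor(asev, levels = sev_levels))
sev_rank

# Position of the first record with the smallest rank (the most severe one).
which.min(sev_rank)
```

The ranks come out as 4, 2, 3, NA and 2, and the first of the two "SEVERE" records is picked. In the derivation above, ties such as these are then broken by ASTDTM and AESEQ.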

Derive Query Variables {#query}

For deriving query variables SMQzzNAM, SMQzzCD, SMQzzSC, SMQzzSCN, or CQzzNAM the derive_vars_query() function can be used. As input it expects a queries dataset, which provides the definition of the queries. See Queries dataset documentation for a detailed description of the queries dataset. The create_query_data() function can be used to create queries datasets.

The following example shows how to derive query variables for Standardized MedDRA Queries (SMQs) in ADAE.

data("queries")
dataset_vignette(queries)
adae1 <- tibble::tribble(
  ~USUBJID, ~ASTDTM, ~AETERM, ~AESEQ, ~AEDECOD, ~AELLT, ~AELLTCD,
  "01", "2020-06-02 23:59:59", "ALANINE AMINOTRANSFERASE ABNORMAL",
  3, "Alanine aminotransferase abnormal", NA_character_, NA_integer_,
  "02", "2020-06-05 23:59:59", "BASEDOW'S DISEASE",
  5, "Basedow's disease", NA_character_, 1L,
  "03", "2020-06-07 23:59:59", "SOME TERM",
  2, "Some query", "Some term", NA_integer_,
  "05", "2020-06-09 23:59:59", "ALVEOLAR PROTEINOSIS",
  7, "Alveolar proteinosis", NA_character_, NA_integer_
)

adae_query <- derive_vars_query(dataset = adae1, dataset_queries = queries)
dataset_vignette(adae_query)

Similarly to SMQ, the derive_vars_query() function can be used to derive Standardized Drug Groupings (SDG).

sdg <- tibble::tribble(
  ~PREFIX,     ~GRPNAME,       ~GRPID,   ~SCOPE, ~SCOPEN,  ~SRCVAR,     ~TERMNAME,         ~TERMID,
  "SDG01",     "Diuretics",        11,  "BROAD",       1,  "CMDECOD",   "Diuretic 1",           NA,
  "SDG01",     "Diuretics",        11,  "BROAD",       1,  "CMDECOD",   "Diuretic 2",           NA,
  "SDG02",     "Corticosteroids",  12,  "BROAD",       1,  "CMDECOD",   "Corticosteroid 1",     NA,
  "SDG02",     "Corticosteroids",  12,  "BROAD",       1,  "CMDECOD",   "Corticosteroid 2",     NA,
  "SDG02",     "Corticosteroids",  12,  "BROAD",       1,  "CMDECOD",   "Corticosteroid 3",     NA,
)
adcm <- tibble::tribble(
  ~USUBJID, ~ASTDTM,               ~CMDECOD,
  "01",     "2020-06-02 23:59:59", "Diuretic 1",
  "02",     "2020-06-05 23:59:59", "Diuretic 1",
  "03",     "2020-06-07 23:59:59", "Corticosteroid 2",
  "05",     "2020-06-09 23:59:59", "Diuretic 2"
)
adcm_query <- derive_vars_query(adcm, sdg)
dataset_vignette(adcm_query)

Add the ADSL variables {#adsl_vars}

If needed, the other ADSL variables can now be added:

adae <- adae %>%
  derive_vars_merged(
    dataset_add = select(adsl, !!!negate_vars(adsl_vars)),
    by_vars = exprs(STUDYID, USUBJID)
  )
dataset_vignette(
  adae,
  display_vars = exprs(
    USUBJID, AEDECOD, ASTDTM, DTHDT, RFSTDTC,
    RFENDTC, AGE, AGEU, SEX
  )
)

Derive Analysis Sequence Number {#aseq}

The function derive_var_obs_number() can be used to derive the ASEQ variable, which ensures the uniqueness of subject records within the dataset.

For example, there can be multiple records present in ADCM for a single subject with the same ASTDTM and CMSEQ variables. But these records still differ at ATC level:

```r
adcm <- tibble::tribble(
  ~USUBJID,       ~ASTDTM,          ~CMSEQ, ~CMDECOD,         ~ATC1CD, ~ATC2CD, ~ATC3CD, ~ATC4CD,
  "BP40257-1001", "2013-07-05 UTC", "14",   "PARACETAMOL",    "N",     "N02",   "N02B",  "N02BE",
  "BP40257-1001", "2013-08-15 UTC", "18",   "SOLUMEDROL",     "D",     "D10",   "D10A",  "D10AA",
  "BP40257-1001", "2013-08-15 UTC", "18",   "SOLUMEDROL",     "D",     "D07",   "D07A",  "D07AA",
  "BP40257-1001", "2013-08-15 UTC", "18",   "SOLUMEDROL",     "H",     "H02",   "H02A",  "H02AB",
  "BP40257-1002", "2012-12-15 UTC", "19",   "SPIRONOLACTONE", "C",     "C03",   "C03D",  "C03DA"
)

adcm_aseq <- adcm %>%
  derive_var_obs_number(
    by_vars = exprs(USUBJID),
    order = exprs(ASTDTM, CMSEQ, ATC1CD, ATC2CD, ATC3CD, ATC4CD),
    new_var = ASEQ,
    check_type = "error"
  )

dataset_vignette(adcm_aseq)
```

Add Labels and Attributes {#attributes}

Adding labels and attributes for SAS transport files is supported by the following packages: {metacore}, {metatools}, and {xportr}.

NOTE: All these packages are in the experimental phase, but the vision is to have them associated with an End to End pipeline under the umbrella of the pharmaverse. An example of applying metadata and performing associated checks can be found at the pharmaverse E2E example.

Example Scripts

ADaM | Sample Code
---- | --------------
ADAE | ad_adae.R
ADCM | ad_adcm.R




admiral documentation built on Oct. 19, 2023, 1:08 a.m.