knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(admiraldev)
This article describes creating an OCCDS ADaM. Examples are currently presented and tested in the context of ADAE. However, the examples could be applied to other OCCDS ADaMs such as ADCM, ADMH, ADDV, etc.
Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.
To start, all data frames needed for the creation of ADAE should be read into the environment. This will be a company-specific process. Some of the data frames needed may be AE and ADSL.
For example purposes, the CDISC Pilot SDTM and ADaM datasets, which are included in {pharmaversesdtm}, are used.
library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(pharmaversesdtm)
library(lubridate)

ae <- pharmaversesdtm::ae
adsl <- admiral::admiral_adsl
ex_single <- admiral::ex_single

ae <- convert_blanks_to_na(ae)
ae <- filter(ae, USUBJID %in% c("01-701-1015", "01-701-1023", "01-703-1086", "01-703-1096", "01-707-1037", "01-716-1024"))
At this step, it may be useful to join ADSL to your AE domain as well. Only the ADSL variables used for derivations are selected at this step. The rest of the relevant ADSL variables would be added later.
adsl_vars <- exprs(TRTSDT, TRTEDT, TRT01A, TRT01P, DTHDT, EOSDT)

adae <- derive_vars_merged(
  ae,
  dataset_add = adsl,
  new_vars = adsl_vars,
  by_vars = exprs(STUDYID, USUBJID)
)
dataset_vignette( adae, display_vars = exprs( USUBJID, AESEQ, AETERM, AESTDTC, TRTSDT, TRTEDT, TRT01A, TRT01P, DTHDT, EOSDT ) )
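Conceptually, the merge step is a left join of a few ADSL columns onto AE by subject. The following base R sketch illustrates that idea only; the data frames `ae_toy` and `adsl_toy` and their values are made up, and `merge()` is a stand-in rather than how derive_vars_merged() is implemented.

```r
# Toy inputs (hypothetical values, not taken from the pilot data)
ae_toy <- data.frame(
  STUDYID = "XX01",
  USUBJID = c("P01", "P01", "P02"),
  AESEQ = c(1, 2, 1),
  AETERM = c("HEADACHE", "NAUSEA", "RASH")
)
adsl_toy <- data.frame(
  STUDYID = "XX01",
  USUBJID = c("P01", "P02", "P03"),
  TRTSDT = as.Date(c("2020-01-01", "2020-01-15", "2020-02-01")),
  TRT01A = c("Drug", "Placebo", "Drug"),
  AGE = c(54, 61, 47)
)

# Keep only the key variables plus the ADSL variables needed for derivations
adsl_keep <- adsl_toy[, c("STUDYID", "USUBJID", "TRTSDT", "TRT01A")]

# Left join: every AE record is kept; subjects without events add no rows
adae_toy <- merge(ae_toy, adsl_keep, by = c("STUDYID", "USUBJID"), all.x = TRUE)
adae_toy
```

Note that `merge()` may reorder rows by the key variables, which is one reason a dedicated merge function is more convenient in practice.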
This part derives ASTDTM, ASTDT, ASTDY, AENDTM, AENDT, and AENDY.
The function derive_vars_dtm() can be used to derive ASTDTM and AENDTM where ASTDTM could be company-specific. ASTDT and AENDT can be derived from ASTDTM and AENDTM, respectively, using function derive_vars_dtm_to_dt(). derive_vars_dy() can be used to create ASTDY and AENDY.
adae <- adae %>%
  derive_vars_dtm(
    dtc = AESTDTC,
    new_vars_prefix = "AST",
    highest_imputation = "M",
    min_dates = exprs(TRTSDT)
  ) %>%
  derive_vars_dtm(
    dtc = AEENDTC,
    new_vars_prefix = "AEN",
    highest_imputation = "M",
    date_imputation = "last",
    time_imputation = "last",
    max_dates = exprs(DTHDT, EOSDT)
  ) %>%
  derive_vars_dtm_to_dt(exprs(ASTDTM, AENDTM)) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ASTDT, AENDT)
  )
dataset_vignette( adae, display_vars = exprs( USUBJID, AESTDTC, AEENDTC, ASTDTM, ASTDT, ASTDY, AENDTM, AENDT, AENDY ) )
See also Date and Time Imputation.
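To make the relative day variables ASTDY and AENDY more concrete, here is a minimal base R sketch of the usual CDISC study day convention, in which there is no day 0. The helper `study_day()` is a hypothetical name used for illustration and is not part of admiral.

```r
# Hypothetical helper: relative day with no day 0
# (the reference date itself is day 1, the day before is day -1)
study_day <- function(date, ref_date) {
  diff <- as.integer(as.numeric(date) - as.numeric(ref_date))
  ifelse(diff >= 0, diff + 1L, diff)
}

trtsdt <- as.Date("2020-01-10")
astdt <- as.Date(c("2020-01-08", "2020-01-09", "2020-01-10", "2020-01-12"))

study_day(astdt, trtsdt)
#> [1] -2 -1  1  3
```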
The function derive_vars_duration() can be used to create the variables ADURN and ADURU.
adae <- adae %>% derive_vars_duration( new_var = ADURN, new_var_unit = ADURU, start_date = ASTDT, end_date = AENDT )
dataset_vignette( adae, display_vars = exprs( USUBJID, AESTDTC, AEENDTC, ASTDT, AENDT, ADURN, ADURU ) )
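As a rough illustration of the duration arithmetic, the sketch below computes a duration in days with both the start and the end day counted. This assumes the behaviour used in the call above (days as the unit, one day added); the vectors are made-up values, and leaving the unit missing when the duration is missing is a choice made for this sketch.

```r
astdt <- as.Date(c("2020-03-01", "2020-03-05", "2020-03-10"))
aendt <- as.Date(c("2020-03-01", "2020-03-09", NA))

# Both the start day and the end day count, hence the + 1
adurn <- as.numeric(aendt - astdt) + 1

# Unit is only filled in where a duration could be computed
aduru <- ifelse(is.na(adurn), NA_character_, "DAYS")

data.frame(ASTDT = astdt, AENDT = aendt, ADURN = adurn, ADURU = aduru)
```

An event that starts and ends on the same day therefore has a duration of one day, and an ongoing event (missing end date) has no duration.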
The function derive_vars_atc() can be used to derive ATC Class Variables.
It helps to add Anatomical Therapeutic Chemical class variables from FACM to ADCM.
The expected result is the input dataset with ATC variables added.
cm <- tibble::tribble(
  ~STUDYID, ~USUBJID, ~CMGRPID, ~CMREFID, ~CMDECOD,
  "STUDY01", "BP40257-1001", "14", "1192056", "PARACETAMOL",
  "STUDY01", "BP40257-1001", "18", "2007001", "SOLUMEDROL",
  "STUDY01", "BP40257-1002", "19", "2791596", "SPIRONOLACTONE"
)

facm <- tibble::tribble(
  ~STUDYID, ~USUBJID, ~FAGRPID, ~FAREFID, ~FATESTCD, ~FASTRESC,
  "STUDY01", "BP40257-1001", "1", "1192056", "CMATC1CD", "N",
  "STUDY01", "BP40257-1001", "1", "1192056", "CMATC2CD", "N02",
  "STUDY01", "BP40257-1001", "1", "1192056", "CMATC3CD", "N02B",
  "STUDY01", "BP40257-1001", "1", "1192056", "CMATC4CD", "N02BE",
  "STUDY01", "BP40257-1001", "1", "2007001", "CMATC1CD", "D",
  "STUDY01", "BP40257-1001", "1", "2007001", "CMATC2CD", "D10",
  "STUDY01", "BP40257-1001", "1", "2007001", "CMATC3CD", "D10A",
  "STUDY01", "BP40257-1001", "1", "2007001", "CMATC4CD", "D10AA",
  "STUDY01", "BP40257-1001", "2", "2007001", "CMATC1CD", "D",
  "STUDY01", "BP40257-1001", "2", "2007001", "CMATC2CD", "D07",
  "STUDY01", "BP40257-1001", "2", "2007001", "CMATC3CD", "D07A",
  "STUDY01", "BP40257-1001", "2", "2007001", "CMATC4CD", "D07AA",
  "STUDY01", "BP40257-1001", "3", "2007001", "CMATC1CD", "H",
  "STUDY01", "BP40257-1001", "3", "2007001", "CMATC2CD", "H02",
  "STUDY01", "BP40257-1001", "3", "2007001", "CMATC3CD", "H02A",
  "STUDY01", "BP40257-1001", "3", "2007001", "CMATC4CD", "H02AB",
  "STUDY01", "BP40257-1002", "1", "2791596", "CMATC1CD", "C",
  "STUDY01", "BP40257-1002", "1", "2791596", "CMATC2CD", "C03",
  "STUDY01", "BP40257-1002", "1", "2791596", "CMATC3CD", "C03D",
  "STUDY01", "BP40257-1002", "1", "2791596", "CMATC4CD", "C03DA"
)

derive_vars_atc(cm, dataset_facm = facm, id_vars = exprs(FAGRPID))
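The idea behind this step can be pictured as a long-to-wide transposition of the FACM records followed by a merge onto CM. The base R sketch below shows that idea on made-up data; `facm_toy` and `cm_toy` are hypothetical, and the use of `reshape()` and `merge()` is an assumption for illustration, not a description of how derive_vars_atc() works internally.

```r
# Toy FACM-like data in long format (hypothetical values)
facm_toy <- data.frame(
  USUBJID = "P01",
  FAGRPID = c("1", "1", "2", "2"),
  FAREFID = "100",
  FATESTCD = c("CMATC1CD", "CMATC2CD", "CMATC1CD", "CMATC2CD"),
  FASTRESC = c("D", "D10", "H", "H02")
)
cm_toy <- data.frame(USUBJID = "P01", CMREFID = "100", CMDECOD = "SOLUMEDROL")

# Long to wide: one row per subject, reference ID and ATC grouping
atc_wide <- reshape(
  facm_toy,
  idvar = c("USUBJID", "FAREFID", "FAGRPID"),
  timevar = "FATESTCD",
  direction = "wide"
)

# "FASTRESC.CMATC1CD" becomes "ATC1CD", and so on
names(atc_wide) <- sub("^FASTRESC\\.CM", "", names(atc_wide))

# Attach the ATC columns to CM; one CM record may expand to several rows
cm_atc <- merge(
  cm_toy, atc_wide,
  by.x = c("USUBJID", "CMREFID"), by.y = c("USUBJID", "FAREFID"),
  all.x = TRUE
)
cm_atc
```

Because a medication can belong to more than one ATC class, the single CM record here becomes two rows, one per ATC grouping.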
TRTA and TRTP must match at least one value of the character treatment variables in ADSL (e.g., TRTxxA/TRTxxP, TRTSEQA/TRTSEQP, TRxxAGy/TRxxPGy).
An example of a simple implementation for a study without periods could be:
adae <- mutate(adae, TRTP = TRT01P, TRTA = TRT01A)

count(adae, TRTP, TRTA, TRT01P, TRT01A)
For studies with periods see the "Visit and Period Variables" vignette.
The function derive_vars_joined() can be used to derive the last dose date before the start of the event.
ex_single <- derive_vars_dtm(
  ex_single,
  dtc = EXSTDTC,
  new_vars_prefix = "EXST",
  flag_imputation = "none"
)

adae <- derive_vars_joined(
  adae,
  ex_single,
  by_vars = exprs(STUDYID, USUBJID),
  new_vars = exprs(LDOSEDTM = EXSTDTM),
  join_vars = exprs(EXSTDTM),
  join_type = "all",
  order = exprs(EXSTDTM),
  filter_add = (EXDOSE > 0 | (EXDOSE == 0 & grepl("PLACEBO", EXTRT))) & !is.na(EXSTDTM),
  filter_join = EXSTDTM <= ASTDTM,
  mode = "last"
)
dataset_vignette( adae, display_vars = exprs( USUBJID, AEDECOD, AESEQ, AESTDTC, AEENDTC, ASTDT, AENDT, LDOSEDTM ) )
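The selection logic of the last dose derivation (latest dose of the same subject that started on or before the event start) can be spelled out in a few lines of base R. This is only a sketch on made-up data; `ae_toy` and `ex_toy` are hypothetical, and the dose filter on EXDOSE and EXTRT from the call above is left out for brevity.

```r
ae_toy <- data.frame(
  USUBJID = c("P01", "P01", "P02"),
  AESEQ = c(1, 2, 1),
  ASTDTM = as.POSIXct(
    c("2020-01-05 10:00", "2020-01-20 08:00", "2020-01-01 09:00"),
    tz = "UTC"
  )
)
ex_toy <- data.frame(
  USUBJID = c("P01", "P01", "P02"),
  EXSTDTM = as.POSIXct(
    c("2020-01-01 08:00", "2020-01-15 08:00", "2020-01-02 08:00"),
    tz = "UTC"
  )
)

# Start with a missing last dose datetime for every event
ae_toy$LDOSEDTM <- as.POSIXct(NA, tz = "UTC")

for (i in seq_len(nrow(ae_toy))) {
  same_subject <- ex_toy$USUBJID == ae_toy$USUBJID[i]
  not_after <- ex_toy$EXSTDTM <= ae_toy$ASTDTM[i]
  doses <- ex_toy$EXSTDTM[same_subject & not_after]
  # Keep the latest qualifying dose; leave NA if there is none
  if (length(doses) > 0) ae_toy$LDOSEDTM[i] <- max(doses)
}
ae_toy
```

The event of subject P02 starts before that subject's first dose, so its LDOSEDTM stays missing.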
The variables ASEV, AREL, and ATOXGR can be added using simple dplyr::mutate() assignments, if no imputation is required. The example below assigns only ASEV and AREL.
adae <- adae %>% mutate( ASEV = AESEV, AREL = AEREL )
To derive the treatment emergent flag TRTEMFL, one can call derive_var_trtemfl(). In the example below, we use an end window of 30 days after the treatment end date in the flag derivation.
adae <- adae %>% derive_var_trtemfl( trt_start_date = TRTSDT, trt_end_date = TRTEDT, end_window = 30 )
dataset_vignette( adae, display_vars = exprs( USUBJID, TRTSDT, TRTEDT, AESTDTC, ASTDT, TRTEMFL ) )
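The core of the window check can be sketched in base R as follows: an event counts as treatment emergent if it starts on or after treatment start and no later than 30 days after treatment end. This is a simplified illustration on made-up data (`adae_toy` is hypothetical); it deliberately ignores any handling of missing dates or other options the real function may offer.

```r
adae_toy <- data.frame(
  USUBJID = "P01",
  ASTDT = as.Date(c("2019-12-20", "2020-01-01", "2020-03-31", "2020-04-01")),
  TRTSDT = as.Date("2020-01-01"),
  TRTEDT = as.Date("2020-03-01")
)

end_window <- 30 # days after treatment end that are still counted

started_in_window <- with(
  adae_toy,
  ASTDT >= TRTSDT & ASTDT <= TRTEDT + end_window
)
adae_toy$TRTEMFL <- ifelse(started_in_window, "Y", NA_character_)
adae_toy
```

With a treatment end date of 2020-03-01, the window closes on 2020-03-31, so the event starting on 2020-04-01 is not flagged.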
To derive the on-treatment flag (ONTRTFL) in an ADaM dataset with a single occurrence date, we use derive_var_ontrtfl().
The expected result is the input dataset with an additional column named ONTRTFL with a value of "Y" or NA.
If you also want to check an end date, you can add the end_date argument.
Note that in this scenario you can set span_period = TRUE if you want occurrences that started prior to drug intake, and were ongoing or ended after this time, to be considered as on-treatment.
bds1 <- tibble::tribble(
  ~USUBJID, ~ADT, ~TRTSDT, ~TRTEDT,
  "P01", ymd("2020-02-24"), ymd("2020-01-01"), ymd("2020-03-01"),
  "P02", ymd("2020-01-01"), ymd("2020-01-01"), ymd("2020-03-01"),
  "P03", ymd("2019-12-31"), ymd("2020-01-01"), ymd("2020-03-01")
)
derive_var_ontrtfl(
  bds1,
  start_date = ADT,
  ref_start_date = TRTSDT,
  ref_end_date = TRTEDT
)

bds2 <- tibble::tribble(
  ~USUBJID, ~ADT, ~TRTSDT, ~TRTEDT,
  "P01", ymd("2020-07-01"), ymd("2020-01-01"), ymd("2020-03-01"),
  "P02", ymd("2020-04-30"), ymd("2020-01-01"), ymd("2020-03-01"),
  "P03", ymd("2020-03-15"), ymd("2020-01-01"), ymd("2020-03-01")
)
derive_var_ontrtfl(
  bds2,
  start_date = ADT,
  ref_start_date = TRTSDT,
  ref_end_date = TRTEDT,
  ref_end_window = 60
)

bds3 <- tibble::tribble(
  ~ADTM, ~TRTSDTM, ~TRTEDTM, ~TPT,
  "2020-01-02T12:00", "2020-01-01T12:00", "2020-03-01T12:00", NA,
  "2020-01-01T12:00", "2020-01-01T12:00", "2020-03-01T12:00", "PRE",
  "2019-12-31T12:00", "2020-01-01T12:00", "2020-03-01T12:00", NA
) %>%
  mutate(
    ADTM = ymd_hm(ADTM),
    TRTSDTM = ymd_hm(TRTSDTM),
    TRTEDTM = ymd_hm(TRTEDTM)
  )
derive_var_ontrtfl(
  bds3,
  start_date = ADTM,
  ref_start_date = TRTSDTM,
  ref_end_date = TRTEDTM,
  filter_pre_timepoint = TPT == "PRE"
)
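The end date scenario described above is not covered by these three examples. The base R sketch below shows one plausible reading of it: an occurrence is on-treatment if it starts within the treatment period, or if it starts before treatment and is either ongoing or ends on or after treatment start. The data frame `occ` and its values are made up, and this is an interpretation of the description, not the implementation of derive_var_ontrtfl().

```r
occ <- data.frame(
  USUBJID = c("P01", "P02", "P03"),
  ASTDT = as.Date(c("2019-12-20", "2019-12-20", "2020-02-01")),
  AENDT = as.Date(c("2020-01-10", "2019-12-25", NA)),
  TRTSDT = as.Date("2020-01-01"),
  TRTEDT = as.Date("2020-03-01")
)

# Started within the treatment period
started_within <- with(occ, ASTDT >= TRTSDT & ASTDT <= TRTEDT)

# Started before treatment but still ongoing (no end date) or ended
# on or after treatment start
spans_start <- with(occ, ASTDT < TRTSDT & (is.na(AENDT) | AENDT >= TRTSDT))

occ$ONTRTFL <- ifelse(started_within | spans_start, "Y", NA_character_)
occ
```

P01 starts before treatment but ends during it and is flagged, whereas P02 ends before treatment starts and is not.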
The function derive_var_extreme_flag() can help derive variables such as AOCCIFL, AOCCPIFL, AOCCSIFL, and AOCCzzFL.
If grades were collected, the following can be used to flag the first occurrence of the maximum toxicity grade.
adae <- adae %>%
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = exprs(USUBJID),
      order = exprs(desc(ATOXGR), ASTDTM, AESEQ),
      new_var = AOCCIFL,
      mode = "first"
    ),
    filter = TRTEMFL == "Y"
  )
Similarly, ASEV can also be used to derive the occurrence flags, if severity is collected. In this case, the variable will need to be recoded to a numeric variable.
Flag first occurrence of most severe adverse event:
adae <- adae %>%
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = exprs(USUBJID),
      order = exprs(
        as.integer(factor(
          ASEV,
          levels = c("DEATH THREATENING", "SEVERE", "MODERATE", "MILD")
        )),
        ASTDTM,
        AESEQ
      ),
      new_var = AOCCIFL,
      mode = "first"
    ),
    filter = TRTEMFL == "Y"
  )
dataset_vignette( adae, display_vars = exprs( USUBJID, ASTDTM, ASEV, AESEQ, TRTEMFL, AOCCIFL ) )
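The "first occurrence of the most severe event" logic amounts to sorting within subject by severity rank, then start date, then sequence number, and flagging the first record. A minimal base R sketch on made-up data follows; `adae_toy` is hypothetical, and the restriction to treatment emergent events is omitted to keep the example short.

```r
adae_toy <- data.frame(
  USUBJID = c("P01", "P01", "P01", "P02", "P02"),
  AESEQ = c(1, 2, 3, 1, 2),
  ASTDT = as.Date(c(
    "2020-01-05", "2020-01-10", "2020-01-20", "2020-02-01", "2020-02-03"
  )),
  ASEV = c("MILD", "SEVERE", "SEVERE", "MODERATE", "MILD")
)

# Recode severity to a rank: 1 = most severe
sev_rank <- match(adae_toy$ASEV, c("SEVERE", "MODERATE", "MILD"))

# Row order within subject: most severe first, then earliest, then AESEQ
ord <- order(adae_toy$USUBJID, sev_rank, adae_toy$ASTDT, adae_toy$AESEQ)

# The first row of each subject in that order gets the flag
first_in_subject <- !duplicated(adae_toy$USUBJID[ord])
adae_toy$AOCCIFL <- NA_character_
adae_toy$AOCCIFL[ord[first_in_subject]] <- "Y"
adae_toy
```

For P01, two events are severe, and the earlier one (AESEQ 2) is the record that gets flagged.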
For deriving the query variables SMQzzNAM, SMQzzCD, SMQzzSC, SMQzzSCN, or CQzzNAM, the derive_vars_query() function can be used. As input it expects a queries dataset, which provides the definition of the queries. See the Queries dataset documentation for a detailed description of the queries dataset. The create_query_data() function can be used to create queries datasets.
The following example shows how to derive query variables for Standardized MedDRA Queries (SMQs) in ADAE.
queries <- admiral::queries
dataset_vignette(queries)
adae1 <- tibble::tribble(
  ~USUBJID, ~ASTDTM, ~AETERM, ~AESEQ, ~AEDECOD, ~AELLT, ~AELLTCD,
  "01", "2020-06-02 23:59:59", "ALANINE AMINOTRANSFERASE ABNORMAL", 3, "Alanine aminotransferase abnormal", NA_character_, NA_integer_,
  "02", "2020-06-05 23:59:59", "BASEDOW'S DISEASE", 5, "Basedow's disease", NA_character_, 1L,
  "03", "2020-06-07 23:59:59", "SOME TERM", 2, "Some query", "Some term", NA_integer_,
  "05", "2020-06-09 23:59:59", "ALVEOLAR PROTEINOSIS", 7, "Alveolar proteinosis", NA_character_, NA_integer_
)

adae_query <- derive_vars_query(dataset = adae1, dataset_queries = queries)
dataset_vignette(adae_query)
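To give a feel for what the term matching does, the base R sketch below creates one name variable per query and fills it with the query name wherever the source variable matches one of the query terms. Everything here is simplified and hypothetical: `queries_toy` and `adae_toy` are made up, only character terms and a single source variable per query are handled, and only the name variable is created.

```r
queries_toy <- data.frame(
  PREFIX = c("SMQ02", "SMQ02", "CQ01"),
  GRPNAME = c("Hypothetical SMQ", "Hypothetical SMQ", "Hypothetical CQ"),
  SRCVAR = "AEDECOD",
  TERMCHAR = c("Term A", "Term B", "Term C"),
  stringsAsFactors = FALSE
)
adae_toy <- data.frame(
  USUBJID = c("01", "02", "03"),
  AEDECOD = c("Term A", "Term C", "Term Z"),
  stringsAsFactors = FALSE
)

# One <PREFIX>NAM column per query, filled with the query name on a match
for (prefix in unique(queries_toy$PREFIX)) {
  q <- queries_toy[queries_toy$PREFIX == prefix, ]
  hit <- adae_toy[[q$SRCVAR[1]]] %in% q$TERMCHAR
  adae_toy[[paste0(prefix, "NAM")]] <- ifelse(hit, q$GRPNAME[1], NA_character_)
}
adae_toy
```

Records whose term is not part of any query, such as "Term Z" here, keep missing values in all query variables.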
Similarly to SMQ, the derive_vars_query() function can be used to derive Standardized Drug Groupings (SDG).
sdg <- tibble::tribble(
  ~PREFIX, ~GRPNAME, ~GRPID, ~SCOPE, ~SCOPEN, ~SRCVAR, ~TERMCHAR, ~TERMNUM,
  "SDG01", "Diuretics", 11, "BROAD", 1, "CMDECOD", "Diuretic 1", NA,
  "SDG01", "Diuretics", 11, "BROAD", 1, "CMDECOD", "Diuretic 2", NA,
  "SDG02", "Corticosteroids", 12, "BROAD", 1, "CMDECOD", "Corticosteroid 1", NA,
  "SDG02", "Corticosteroids", 12, "BROAD", 1, "CMDECOD", "Corticosteroid 2", NA,
  "SDG02", "Corticosteroids", 12, "BROAD", 1, "CMDECOD", "Corticosteroid 3", NA
)

adcm <- tibble::tribble(
  ~USUBJID, ~ASTDTM, ~CMDECOD,
  "01", "2020-06-02 23:59:59", "Diuretic 1",
  "02", "2020-06-05 23:59:59", "Diuretic 1",
  "03", "2020-06-07 23:59:59", "Corticosteroid 2",
  "05", "2020-06-09 23:59:59", "Diuretic 2"
)

adcm_query <- derive_vars_query(adcm, sdg)
dataset_vignette(adcm_query)
If needed, the other ADSL variables can now be added:
adae <- adae %>% derive_vars_merged( dataset_add = select(adsl, !!!negate_vars(adsl_vars)), by_vars = exprs(STUDYID, USUBJID) )
dataset_vignette( adae, display_vars = exprs( USUBJID, AEDECOD, ASTDTM, DTHDT, RFSTDTC, RFENDTC, AGE, AGEU, SEX ) )
The function derive_var_obs_number() can be used to derive the ASEQ variable to ensure the uniqueness of subject records within the dataset.
For example, there can be multiple records present in ADCM for a single subject with the same ASTDTM and CMSEQ variables, but these records still differ at ATC level:
``` {r eval=TRUE, echo=TRUE}
adcm <- tibble::tribble(
  ~USUBJID, ~ASTDTM, ~CMSEQ, ~CMDECOD, ~ATC1CD, ~ATC2CD, ~ATC3CD, ~ATC4CD,
  "BP40257-1001", "2013-07-05 UTC", "14", "PARACETAMOL", "N", "N02", "N02B", "N02BE",
  "BP40257-1001", "2013-08-15 UTC", "18", "SOLUMEDROL", "D", "D10", "D10A", "D10AA",
  "BP40257-1001", "2013-08-15 UTC", "18", "SOLUMEDROL", "D", "D07", "D07A", "D07AA",
  "BP40257-1001", "2013-08-15 UTC", "18", "SOLUMEDROL", "H", "H02", "H02A", "H02AB",
  "BP40257-1002", "2012-12-15 UTC", "19", "SPIRONOLACTONE", "C", "C03", "C03D", "C03DA"
)

adcm_aseq <- adcm %>%
  derive_var_obs_number(
    by_vars = exprs(USUBJID),
    order = exprs(ASTDTM, CMSEQ, ATC1CD, ATC2CD, ATC3CD, ATC4CD),
    new_var = ASEQ,
    check_type = "error"
  )
```
dataset_vignette(adcm_aseq)
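The numbering itself is simply a running count within subject after sorting by the ordering variables. The base R sketch below illustrates this on made-up data; `adcm_toy` is hypothetical, and the uniqueness check at the end only mimics the spirit of `check_type = "error"` rather than reproducing it.

```r
adcm_toy <- data.frame(
  USUBJID = c("P02", "P01", "P01", "P01"),
  ASTDT = as.Date(c("2012-12-15", "2013-08-15", "2013-07-05", "2013-08-15")),
  CMSEQ = c(19, 18, 14, 18),
  ATC2CD = c("C03", "H02", "N02", "D10")
)

# Sort by subject and the ordering variables
adcm_toy <- adcm_toy[with(adcm_toy, order(USUBJID, ASTDT, CMSEQ, ATC2CD)), ]

# Running record number within each subject
adcm_toy$ASEQ <- ave(
  seq_along(adcm_toy$USUBJID),
  adcm_toy$USUBJID,
  FUN = seq_along
)

# The ordering variables should identify each record uniquely
stopifnot(!anyDuplicated(adcm_toy[, c("USUBJID", "ASTDT", "CMSEQ", "ATC2CD")]))
adcm_toy
```

Without the ATC variable in the sort order, the two P01 records from 2013-08-15 would be tied and their sequence numbers would not be reproducible.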
Adding labels and attributes for SAS transport files is supported by the following packages:
metacore: establish a common foundation for the use of metadata within an R session.
metatools: enable the use of metacore objects. Metatools can be used to build datasets or enhance columns in existing datasets, as well as to check datasets against the metadata.
xportr: functionality to associate all metadata information to a local R data frame, perform dataset-level validation checks and convert into a transport v5 file (xpt).
NOTE: All these packages are in the experimental phase, but the vision is to have them associated with an End to End pipeline under the umbrella of the pharmaverse. An example of applying metadata and performing associated checks can be found at the pharmaverse E2E example.
ADaM | Sourcing Command
---- | --------------
ADAE | use_ad_template("ADAE")
ADCM | use_ad_template("ADCM")