This article describes creating a BDS time-to-event ADaM.
The main part in programming a time-to-event dataset is the definition of the events and censoring times. {admiral} supports single events like death or composite events like disease progression or death. More than one source dataset can be used for the definition of the event and censoring times.
Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.
The examples of this vignette require the following packages.
library(admiral)
library(dplyr)
library(admiral.test)
library(lubridate)
To start, all datasets needed for the creation of the time-to-event dataset should be read into the environment. This will be a company-specific process.
For the purpose of this example, the ADaM datasets included in {admiral} are used.
data("adsl")
data("adae")
Derive Parameters (CNSR, ADT, STARTDT)
To derive the parameter-dependent variables like CNSR, ADT, STARTDT, EVNTDESC, SRCDOM, PARAMCD, ... the derive_param_tte() function can be used. It adds one parameter to the input dataset with one observation per subject. Usually it is called several times.
For each subject it is determined whether an event occurred. If so, the analysis date ADT is set to the earliest event date. If no event occurred, the analysis date is set to the latest censoring date.
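The selection rule above can be illustrated without {admiral}. The following is a minimal base R sketch on hypothetical toy data (the data frames `events` and `censorings` and their `DATE` column are made up for illustration); it is not how derive_param_tte() is implemented, only a demonstration of "earliest event, otherwise latest censoring".

```r
# Hypothetical toy data: candidate event and censoring dates per subject
events <- data.frame(
  USUBJID = c("01", "01", "03"),
  DATE = as.Date(c("2021-06-12", "2021-05-05", "2021-08-21"))
)
censorings <- data.frame(
  USUBJID = c("01", "02", "02", "03"),
  DATE = as.Date(c("2021-03-04", "2021-02-03", "2021-04-01", "2021-08-10"))
)

# Earliest event date per subject: sort ascending, keep first row per subject
events <- events[order(events$USUBJID, events$DATE), ]
first_event <- events[!duplicated(events$USUBJID), ]
first_event$CNSR <- 0

# Latest censoring date for subjects without any event:
# sort descending, keep first row per subject
no_event <- censorings[!censorings$USUBJID %in% events$USUBJID, ]
no_event <- no_event[order(no_event$USUBJID, no_event$DATE, decreasing = TRUE), ]
last_censor <- no_event[!duplicated(no_event$USUBJID), ]
last_censor$CNSR <- 1

# One record per subject with analysis date and censoring indicator
tte <- rbind(first_event, last_censor)
names(tte)[names(tte) == "DATE"] <- "ADT"
tte <- tte[order(tte$USUBJID), ]
print(tte)
```

In this toy data, subjects 01 and 03 have events and get their earliest event date with CNSR = 0, while subject 02 has no event and is censored at the later of the two censoring dates.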
The events and censorings are defined by the event_source() and the censor_source() class respectively. Each defines which observations (filter parameter) of a source dataset (dataset_name parameter) are potential events or censorings, the value of the CNSR variable (censor parameter), and which variable provides the date (date parameter). The date can be provided as a date (--DT variable), a datetime (--DTM variable), or a character ISO 8601 date (--DTC variable).
CDISC strongly recommends CNSR = 0 for events and positive integers for censorings. {admiral} enforces this recommendation. Therefore the censor parameter is available for censor_source() only. It defaults to 1.
The dataset_name parameter expects a character value which is used as an identifier. The actual data used for the derivation of the parameter is provided via the source_datasets parameter of derive_param_tte(). It expects a named list of datasets. The names correspond to the identifiers specified for the dataset_name parameter. This allows events and censorings to be defined independently of the data.
The table below shows all pre-defined tte_source objects which should cover the most common use cases.
knitr::kable(admiral:::list_tte_source_objects())
These pre-defined objects can be passed directly to derive_param_tte() to create a new time-to-event parameter.
adtte <- derive_param_tte(
  dataset_adsl = adsl,
  start_date = TRTSDT,
  event_conditions = list(ae_ser_event),
  censor_conditions = list(lastalive_censor),
  source_datasets = list(adsl = adsl, adae = adae),
  set_values_to = vars(PARAMCD = "TTAESER", PARAM = "Time to First Serious AE")
)
dataset_vignette(
  adtte,
  display_vars = vars(USUBJID, PARAMCD, PARAM, STARTDT, ADT, CNSR)
)
For example, the overall survival time could be defined from treatment start to death. Patients alive or lost to follow-up would be censored to the last alive date. The following call defines a death event based on ADSL variables.
death <- event_source(
  dataset_name = "adsl",
  filter = DTHFL == "Y",
  date = DTHDT
)
A corresponding censoring based on the last known alive date can be defined by the following call.
lstalv <- censor_source(
  dataset_name = "adsl",
  date = LSTALVDT
)
The definitions can be passed to derive_param_tte() to create a new time-to-event parameter.
adtte <- derive_param_tte(
  dataset_adsl = adsl,
  source_datasets = list(adsl = adsl),
  start_date = TRTSDT,
  event_conditions = list(death),
  censor_conditions = list(lstalv),
  set_values_to = vars(PARAMCD = "OS", PARAM = "Overall Survival")
)
dataset_vignette(
  adtte,
  display_vars = vars(USUBJID, PARAMCD, PARAM, STARTDT, ADT, CNSR)
)
Note that in practice for efficacy parameters you might use randomization date as the time-to-event origin date, but this variable is not in the CDISC Pilot ADSL dataset so we used TRTSDT for these examples.
Add Information About Events and Censorings (EVNTDESC, SRCVAR, ...)
To add additional information like event or censoring description (EVNTDESC) or source variable (SRCVAR), the set_values_to parameter can be specified in the event/censoring definition.
# define death event
death <- event_source(
  dataset_name = "adsl",
  filter = DTHFL == "Y",
  date = DTHDT,
  set_values_to = vars(
    EVNTDESC = "DEATH",
    SRCDOM = "ADSL",
    SRCVAR = "DTHDT"
  )
)

# define censoring at last known alive date
lstalv <- censor_source(
  dataset_name = "adsl",
  date = LSTALVDT,
  set_values_to = vars(
    EVNTDESC = "LAST KNOWN ALIVE DATE",
    SRCDOM = "ADSL",
    SRCVAR = "LSTALVDT"
  )
)

# derive time-to-event parameter
adtte <- derive_param_tte(
  dataset_adsl = adsl,
  source_datasets = list(adsl = adsl),
  event_conditions = list(death),
  censor_conditions = list(lstalv),
  set_values_to = vars(PARAMCD = "OS", PARAM = "Overall Survival")
)
dataset_vignette(
  adtte,
  display_vars = vars(USUBJID, EVNTDESC, SRCDOM, SRCVAR, CNSR, ADT)
)

# save adtte and adsl for next section
adtte_bak <- adtte
adsl_bak <- adsl
If a subject has no event and has no record meeting the censoring rule, it will not be included in the output dataset. In order to have a record for this subject in the output dataset, another censor_source() object should be created to specify how those patients will be censored. Therefore the start censoring is defined below so that subjects without data in adrs are censored at the start date.
The ADaM IG requires that a computed date be accompanied by imputation flags. If the start date is imputed, the date imputation flag can be specified by the start_date_imputation_flag parameter. If a date variable from one of the event or censoring source datasets is imputed, the imputation flag can be specified for the set_values_to parameter in event_source() or censor_source() (see definition of the start censoring below).
As the CDISC pilot does not contain an RS dataset, the following example for progression free survival uses manually created datasets.
View(adsl)
adsl <- tibble::tribble(
  ~USUBJID, ~DTHFL, ~DTHDT,            ~TRTSDT,           ~TRTSDTF,
  "01",     "Y",    ymd("2021-06-12"), ymd("2021-01-01"), "M",
  "02",     "N",    NA,                ymd("2021-02-03"), NA,
  "03",     "Y",    ymd("2021-08-21"), ymd("2021-08-10"), NA,
  "04",     "N",    NA,                ymd("2021-02-03"), NA,
  "05",     "N",    NA,                ymd("2021-04-01"), "D"
) %>%
  mutate(STUDYID = "AB42")

dataset_vignette(
  adsl,
  display_vars = vars(USUBJID, DTHFL, DTHDT, TRTSDT, TRTSDTF)
)
View(adrs)
adrs <- tibble::tribble(
  ~USUBJID, ~AVALC, ~ADT,              ~ASEQ,
  "01",     "SD",   ymd("2021-01-03"), 1,
  "01",     "PR",   ymd("2021-03-04"), 2,
  "01",     "PD",   ymd("2021-05-05"), 3,
  "02",     "PD",   ymd("2021-02-03"), 1,
  "04",     "SD",   ymd("2021-02-13"), 1,
  "04",     "PR",   ymd("2021-04-14"), 2,
  "04",     "CR",   ymd("2021-05-15"), 3
) %>%
  mutate(
    STUDYID = "AB42",
    PARAMCD = "OVR",
    PARAM = "Overall Response"
  ) %>%
  select(STUDYID, USUBJID, PARAMCD, PARAM, ADT, ASEQ, AVALC)

dataset_vignette(
  adrs,
  display_vars = vars(USUBJID, AVALC, ADT, ASEQ, PARAMCD, PARAM)
)
An event for progression free survival occurs if progression of disease is observed or the subject dies. Therefore two event_source() objects are defined: pd for progression of disease and death for death. Some subjects may experience both events. In this case the first one is selected by derive_param_tte().
# progressive disease event
pd <- event_source(
  dataset_name = "adrs",
  filter = AVALC == "PD",
  date = ADT,
  set_values_to = vars(
    EVNTDESC = "PD",
    SRCDOM = "ADRS",
    SRCVAR = "ADT",
    SRCSEQ = ASEQ
  )
)

# death event
death <- event_source(
  dataset_name = "adsl",
  filter = DTHFL == "Y",
  date = DTHDT,
  set_values_to = vars(
    EVNTDESC = "DEATH",
    SRCDOM = "ADSL",
    SRCVAR = "DTHDT"
  )
)
Subjects without an event must be censored at the last tumor assessment. For the censoring the lastvisit object is defined as all tumor assessments. Please note that it is not necessary to select the last one or exclude assessments which resulted in progression of disease. This is handled within derive_param_tte().
# last tumor assessment censoring (CNSR = 1 by default)
lastvisit <- censor_source(
  dataset_name = "adrs",
  date = ADT,
  set_values_to = vars(
    EVNTDESC = "LAST TUMOR ASSESSMENT",
    SRCDOM = "ADRS",
    SRCVAR = "ADT"
  )
)
Patients without a tumor assessment should be censored at the start date. Therefore the start object is defined with the treatment start date as censoring date. It is not necessary to exclude patients with a tumor assessment in the definition of start because derive_param_tte() selects the last date across all censor_source() objects as censoring date.
# start date censoring (for patients without tumor assessment) (CNSR = 2)
start <- censor_source(
  dataset_name = "adsl",
  date = TRTSDT,
  censor = 2,
  set_values_to = vars(
    EVNTDESC = "TREATMENT START",
    SRCDOM = "ADSL",
    SRCVAR = "TRTSDT",
    ADTF = TRTSDTF
  )
)

# derive time-to-event parameter
adtte <- derive_param_tte(
  dataset_adsl = adsl,
  source_datasets = list(adsl = adsl, adrs = adrs),
  start_date = TRTSDT,
  start_date_imputation_flag = TRTSDTF,
  event_conditions = list(pd, death),
  censor_conditions = list(lastvisit, start),
  set_values_to = vars(PARAMCD = "PFS", PARAM = "Progression Free Survival")
)
dataset_vignette(
  adtte %>%
    select(
      STUDYID, USUBJID, PARAMCD, PARAM, STARTDT, STARTDTF, ADT, ADTF, CNSR,
      EVNTDESC, SRCDOM, SRCVAR
    ),
  display_vars = vars(USUBJID, PARAMCD, STARTDT, STARTDTF, ADT, ADTF, CNSR)
)
If several similar time-to-event parameters need to be derived, the call_derivation() function is useful.
In the following example parameters for time to first AE, time to first serious AE, and time to first related AE are derived. The censoring is the same for all three. Only the definition of the event differs.
adtte <- adtte_bak
adsl <- adsl_bak
# define censoring
observation_end <- censor_source(
  dataset_name = "adsl",
  date = EOSDT,
  censor = 1,
  set_values_to = vars(
    EVNTDESC = "END OF STUDY",
    SRCDOM = "ADSL",
    SRCVAR = "EOSDT"
  )
)

# define time to first AE
tt_ae <- event_source(
  dataset_name = "ae",
  date = AESTDTC,
  set_values_to = vars(
    EVNTDESC = "ADVERSE EVENT",
    SRCDOM = "AE",
    SRCVAR = "AESTDTC"
  )
)

# define time to first serious AE
tt_ser_ae <- event_source(
  dataset_name = "ae",
  filter = AESER == "Y",
  date = AESTDTC,
  set_values_to = vars(
    EVNTDESC = "SERIOUS ADVERSE EVENT",
    SRCDOM = "AE",
    SRCVAR = "AESTDTC"
  )
)

# define time to first related AE
tt_rel_ae <- event_source(
  dataset_name = "ae",
  filter = AEREL %in% c("PROBABLE", "POSSIBLE", "REMOTE"),
  date = AESTDTC,
  set_values_to = vars(
    EVNTDESC = "RELATED ADVERSE EVENT",
    SRCDOM = "AE",
    SRCVAR = "AESTDTC"
  )
)

# derive all three time-to-event parameters
adaette <- call_derivation(
  derivation = derive_param_tte,
  variable_params = list(
    params(
      event_conditions = list(tt_ae),
      set_values_to = vars(PARAMCD = "TTAE")
    ),
    params(
      event_conditions = list(tt_ser_ae),
      set_values_to = vars(PARAMCD = "TTSERAE")
    ),
    params(
      event_conditions = list(tt_rel_ae),
      set_values_to = vars(PARAMCD = "TTRELAE")
    )
  ),
  dataset_adsl = adsl,
  source_datasets = list(adsl = adsl, ae = ae),
  censor_conditions = list(observation_end)
)
adaette %>%
  select(STUDYID, USUBJID, PARAMCD, STARTDT, ADT, CNSR, EVNTDESC, SRCDOM, SRCVAR) %>%
  arrange(USUBJID, PARAMCD) %>%
  dataset_vignette(
    display_vars = vars(USUBJID, PARAMCD, STARTDT, ADT, CNSR, EVNTDESC, SRCDOM, SRCVAR)
  )
If time-to-event parameters need to be derived for each by group of a source dataset, the by_vars parameter can be specified. Then a time-to-event parameter is derived for each by group.
Please note that CDISC requires separate parameters (PARAMCD, PARAM) for the by groups. Therefore the variables specified for the by_vars parameter are not included in the output dataset. The PARAMCD variable should be specified for the set_values_to parameter using an expression on the right-hand side which results in a unique value for each by group. If the values of the by variables should be included in the output dataset, they can be stored in PARCATn variables.
In the following example a time-to-event parameter for each preferred term in the AE dataset is derived.
View(adsl)
adsl <- tibble::tribble(
  ~USUBJID, ~TRTSDT,           ~EOSDT,
  "01",     ymd("2020-12-06"), ymd("2021-03-06"),
  "02",     ymd("2021-01-16"), ymd("2021-02-03")
) %>%
  mutate(STUDYID = "AB42")

dataset_vignette(adsl)
View(ae)
ae <- tibble::tribble(
  ~USUBJID, ~AESTDTC,           ~AESEQ, ~AEDECOD,
  "01",     "2021-01-03T10:56", 1,      "Flu",
  "01",     "2021-03-04",       2,      "Cough",
  "01",     "2021",             3,      "Flu"
) %>%
  mutate(STUDYID = "AB42")

dataset_vignette(ae)
# define time to first adverse event event
ttae <- event_source(
  dataset_name = "ae",
  date = AESTDTC,
  set_values_to = vars(
    EVNTDESC = "AE",
    SRCDOM = "AE",
    SRCVAR = "AESTDTC",
    SRCSEQ = AESEQ
  )
)

# define censoring at end of study
eos <- censor_source(
  dataset_name = "adsl",
  date = EOSDT,
  set_values_to = vars(
    EVNTDESC = "END OF STUDY",
    SRCDOM = "ADSL",
    SRCVAR = "EOSDT"
  )
)

# derive time-to-event parameter
adtte <- derive_param_tte(
  dataset_adsl = adsl,
  by_vars = vars(AEDECOD),
  start_date = TRTSDT,
  event_conditions = list(ttae),
  censor_conditions = list(eos),
  source_datasets = list(adsl = adsl, ae = ae),
  set_values_to = vars(
    PARAMCD = paste0("TTAE", as.numeric(as.factor(AEDECOD))),
    PARAM = paste("Time to First", AEDECOD, "Adverse Event"),
    PARCAT1 = "TTAE",
    PARCAT2 = AEDECOD
  )
)
dataset_vignette(
  adtte %>%
    select(
      USUBJID, STARTDT, PARAMCD, PARAM, PARCAT1, PARCAT2, ADT, CNSR,
      EVNTDESC, SRCDOM, SRCVAR, SRCSEQ
    ),
  display_vars = vars(USUBJID, STARTDT, PARAMCD, PARAM, ADT, CNSR, SRCSEQ)
)
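To see what the PARAMCD and PARAM expressions in the call above evaluate to, they can be run on a plain character vector. This is a small base R sketch using the AEDECOD values from the example ae dataset; it only illustrates the expressions and does not call {admiral}.

```r
# AEDECOD values as in the example ae dataset
AEDECOD <- c("Flu", "Cough", "Flu")

# as.factor() sorts the distinct terms, so "Cough" becomes level 1 and "Flu" level 2
PARAMCD <- paste0("TTAE", as.numeric(as.factor(AEDECOD)))
PARAM <- paste("Time to First", AEDECOD, "Adverse Event")

# One parameter code and label per distinct preferred term
print(unique(data.frame(AEDECOD, PARAMCD, PARAM)))
```

Because the number is taken from the sorted factor levels present in the data, the code assigned to a given term can change when new preferred terms appear. If stable codes are required, a fixed lookup from term to code may be a better choice.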
Derive Analysis Value (AVAL)
The analysis value (AVAL) can be derived by calling derive_vars_duration(). This example derives the time to event in days. Other units can be requested by specifying the out_unit parameter.
adtte <- adtte_bak
adsl <- adsl_bak
adtte <- derive_vars_duration(
  adtte,
  new_var = AVAL,
  start_date = STARTDT,
  end_date = ADT
)
dataset_vignette(adtte)
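For intuition, a duration in days can be computed by hand in base R. The sketch below uses hypothetical dates and assumes the common convention that both the start day and the end day are counted (end minus start plus one); consult the documentation of derive_vars_duration() for the defaults it actually applies.

```r
# Hypothetical start and analysis dates for two subjects
STARTDT <- as.Date(c("2021-01-01", "2021-02-03"))
ADT <- as.Date(c("2021-05-05", "2021-02-03"))

# Assumption: start and end day both count, so an event on the start day gives 1
AVAL <- as.numeric(ADT - STARTDT) + 1
print(AVAL)
```

Under this assumption the first subject has a duration of 125 days and the second, whose analysis date equals the start date, has a duration of 1 day.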
Derive Analysis Sequence Number (ASEQ)
The {admiral} function derive_var_obs_number() can be used to derive ASEQ:
adtte <- derive_var_obs_number(
  adtte,
  by_vars = vars(STUDYID, USUBJID),
  order = vars(PARAMCD),
  check_type = "error"
)
dataset_vignette(adtte)
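The numbering performed here can be pictured with a base R sketch on hypothetical toy data (`adtte_toy` is made up for illustration): records are sorted by subject and parameter code and then numbered within each subject. This mirrors the by_vars and order arguments used above but is not the {admiral} implementation.

```r
# Hypothetical toy records: two parameters for each of two subjects
adtte_toy <- data.frame(
  USUBJID = c("02", "01", "01", "02"),
  PARAMCD = c("PFS", "PFS", "OS", "OS")
)

# The order must be unique within subject, otherwise ASEQ would be ambiguous
stopifnot(!anyDuplicated(adtte_toy[c("USUBJID", "PARAMCD")]))

# Sort by subject and parameter code, then number records within each subject
adtte_toy <- adtte_toy[order(adtte_toy$USUBJID, adtte_toy$PARAMCD), ]
adtte_toy$ASEQ <- ave(seq_along(adtte_toy$USUBJID), adtte_toy$USUBJID, FUN = seq_along)
print(adtte_toy)
```

Each subject ends up with ASEQ 1 for OS and 2 for PFS, since "OS" sorts before "PFS".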
Variables from ADSL which are required for time-to-event analyses, e.g., treatment variables or covariates, can be added using left_join().
adtte <- left_join(
  adtte,
  select(adsl, STUDYID, USUBJID, ARMCD, ARM, ACTARMCD, ACTARM, AGE, SEX),
  by = c("STUDYID", "USUBJID")
)
dataset_vignette(adtte)