```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(admiraldev)
```
This article describes creating an `ADRS` ADaM dataset in ovarian cancer studies based on Gynecological Cancer Intergroup (GCIG) criteria.
Note that only the GCIG-specific steps are covered in this vignette. For detailed guidance on all the steps, refer to Creating ADRS (Including Non-standard Endpoints).
Carcinoma antigen-125 (CA-125) is the most commonly used biomarker in ovarian cancer. The Gynecological Cancer Intergroup (GCIG) proposed the criteria for CA-125 response and progression and specified the situations in which CA-125 criteria could be used. These guidelines are becoming more and more popular in clinical trials for ovarian cancer and are often used as one of the secondary endpoints.
```r
list_cases <- tibble::tribble(
  ~"Use case", ~"Use Recommended by GCIG", ~"Not Standard and Needs Further Validation", ~"Not Recommended by GCIG",
  "First-line trials", "CA-125 progression", "", "CA-125 response",
  "Maintenance or consolidation trials", "", "CA-125 response and progression", "",
  "Relapse trials", "CA-125 response and progression", "", ""
)

library(magrittr)

list_cases %>%
  gt::gt() %>%
  gt::cols_label_with(fn = ~ gt::md(paste0("**", .x, "**"))) %>%
  gt::tab_header(
    title = "GCIG recommendations for CA-125 criteria for response and progression in various clinical situations"
  )
```
However, the CA-125 criteria can be difficult to implement programmatically, especially when used alongside RECIST 1.1.
We aim to share our current knowledge and experience in implementing GCIG criteria for ovarian clinical trials. Additionally, we have made certain assumptions regarding how data is collected on CRFs to perform response analysis according to the GCIG criteria. We hope this vignette provides valuable guidance on ADRS programming and highlights key considerations for data collection in relation to these criteria.
For more information about the GCIG criteria, see the GCIG guidelines on response criteria in ovarian cancer.
In what follows, ULRR stands for Upper Limit of Reference Range. The CA-125 response categories for ovarian cancer are:
CA-125 Complete Response: baseline CA-125 >= 2 * ULRR, later reduced by at least 50% to normal confirmed at least 4 weeks later.
CA-125 Partial Response: baseline CA-125 >= 2 * ULRR, later reduced by at least 50% but not to normal confirmed at least 4 weeks later.
Stable Disease: CA-125 level does not meet the criteria for either partial response or progressive disease.
Progression: CA-125 >= 2 * ULRR or CA-125 >= 2 * nadir on 2 occasions at least 1 week apart.
Not Evaluable: This is when the patient's response cannot be evaluated due to various reasons such as receiving mouse antibodies or having medical/surgical interference with their peritoneum or pleura during the previous 28 days.
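As an illustration of the response thresholds above, the following minimal sketch classifies a single post-baseline CA-125 value. The function name `ca125_candidate()` is hypothetical, "normal" is assumed to mean a value at or below the ULRR, and neither the confirmation at least 4 weeks later nor the progression rule is checked, so the result is only an unconfirmed candidate category.

```r
# Hypothetical helper: unconfirmed CA-125 response candidate for one value.
# Assumptions: "normal" means value <= ULRR; confirmation and progression
# are NOT evaluated here.
ca125_candidate <- function(baseline, value, ulrr) {
  evaluable <- baseline >= 2 * ulrr   # baseline CA-125 >= 2 * ULRR
  reduced <- value <= 0.5 * baseline  # at least 50% reduction from baseline
  normal <- value <= ulrr             # assumed definition of "normal"
  ifelse(
    !evaluable, NA_character_,
    ifelse(reduced & normal, "CR", ifelse(reduced, "PR", "SD"))
  )
}

candidates <- ca125_candidate(
  baseline = c(100, 100, 100, 40),
  value = c(30, 48, 80, 10),
  ulrr = 35
)
candidates
```

Patients whose baseline is below 2 * ULRR get a missing value here, because they are not evaluable for CA-125 response.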
CA-125 measurements: `LB` SDTM domain.
Tumor assessments per RECIST 1.1: `RS`/`TR`/`TU` SDTM domains.

Note: CA-125 and RECIST 1.1 tumor assessments are not always collected at the same visit. CA-125 can be collected more frequently than tumor assessments.
For this vignette we assume that the following information is collected on the CRF:
The CA-125 response is available in the SDTM `RS` domain. Please note that we are not programmatically confirming the CA-125 response.
The CA-125 response is mapped to `CR`, `PR`, `SD`, `PD`, `NE` as shown below.
```r
list_resp <- tibble::tribble(
  ~"CA-125 response per GCIG", ~"CA-125 response mapped",
  "Response within Normal Range", "CR",
  "Response but not within Normal Range", "PR",
  "Non-Response/Non-PD", "SD",
  "PD", "PD",
  "NE", "NE"
)

knitr::kable(list_resp)
```
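If the response is collected using the GCIG wording, the mapping in the table could be applied with a named lookup vector. This is only a sketch: `resp_map` and `collected` are hypothetical names, and values outside the table are deliberately left missing rather than guessed.

```r
# Hypothetical lookup: collected GCIG wording -> mapped response code
resp_map <- c(
  "Response within Normal Range" = "CR",
  "Response but not within Normal Range" = "PR",
  "Non-Response/Non-PD" = "SD",
  "PD" = "PD",
  "NE" = "NE"
)

collected <- c("PD", "Response within Normal Range", "Non-Response/Non-PD", "Unknown")

# Values that are not in the lookup become NA
mapped <- unname(resp_map[collected])
mapped
```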
In `SUPPRS` there are records with the following `QNAM` and `QLABEL` values:
```r
list_supp <- tibble::tribble(
  ~"QNAM", ~"QLABEL", ~"QVAL", ~"Purpose", ~"Use case",
  "`CA125EFL`", "CA-125 response evaluable", "Y/N", "Indicates population evaluable for CA-125 response (baseline CA-125 >= 2 * ULRR and no mouse antibodies)", "`CA125EFL` variable",
  "`CAELEPRE`", "Elevated pre-treatment CA-125", "Y/N", "Indicates CA-125 level at baseline (Y - elevated, N - not elevated)", "Derivation of PD category (`MCRIT1`/`MCRIT1ML`/`MCRIT1MN`)",
  "`MOUSEANT`", "Received mouse antibodies", "Y", "Indicates if a prohibited therapy was received", "Derivation of `ANL02FL`",
  "`CA50RED`", ">=50% reduction from baseline", "Y", "Indicates response, but does not distinguish between CR and PR", "Not used in further derivations",
  "`CANORM2X`", "CA125 normal, lab increased >=2x ULRR", "Y", "Indicates PD category A or C", "Derivation of PD category (`MCRIT1`/`MCRIT1ML`/`MCRIT1MN`)",
  "`CNOTNORM`", "CA125 not norm, lab increased >=2x nadir", "Y", "Indicates PD category B", "Derivation of PD category (`MCRIT1`/`MCRIT1ML`/`MCRIT1MN`)"
)

knitr::kable(list_supp)
```
The above `SUPPRS` records refer only to records with `RS.RSCAT = "CA125"`.
The exception is `QNAM = "MOUSEANT"`, which appears for both `RS.RSCAT = "CA125"` and `RS.RSCAT = "RECIST 1.1 - CA125"` (for the same visits).
If the data are collected in another way and similar information is not available in the `SUPPRS` dataset, it should be derived in advance from the `LB` CA-125 measurement records. Information on whether the patient has received mouse antibodies should also be taken into account.
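As a rough sketch of such a derivation from lab data, the example below builds two of the flags from one baseline CA-125 record per subject. The input `lb_base` and its values are made up, the variable names `LBSTRESN` and `LBSTNRHI` are assumed to hold the result and the ULRR, and "elevated" is assumed to mean above the ULRR, which should be checked against the study definition.

```r
library(dplyr)
library(tibble)

# Hypothetical baseline CA-125 records, one per subject
lb_base <- tribble(
  ~USUBJID, ~LBSTRESN, ~LBSTNRHI, ~MOUSEANT,
  "01",     90,        35,        NA_character_,
  "02",     50,        35,        NA_character_,
  "03",     20,        35,        NA_character_,
  "04",     120,       35,        "Y"
)

flags <- lb_base %>%
  mutate(
    # Assumption: "elevated" means above the upper limit of the reference range
    CAELEPRE = if_else(LBSTRESN > LBSTNRHI, "Y", "N"),
    # Evaluable: baseline >= 2 * ULRR and no mouse antibodies received
    CA125EFL = if_else(LBSTRESN >= 2 * LBSTNRHI & is.na(MOUSEANT), "Y", "N")
  )
flags
```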
The examples in this vignette require the following packages.

```r
library(admiral)
library(admiralonco)
library(pharmaversesdtm)
library(pharmaverseadam)
library(metatools)
library(dplyr)
library(tibble)
```
To start, all data frames needed for the creation of `ADRS` should be read into the environment. This will be a company-specific process. Some of the data frames needed are `ADSL` and `RS`.
For example purposes, the SDTM and ADaM datasets (based on CDISC Pilot test data) included in {pharmaversesdtm} and {pharmaverseadam} are used.
```r
adsl <- pharmaverseadam::adsl
# GCIG sdtm data
rs <- pharmaversesdtm::rs_onco_ca125
supprs <- pharmaversesdtm::supprs_onco_ca125

rs <- combine_supp(rs, supprs)
rs <- convert_blanks_to_na(rs)
```

```r
dataset_vignette(
  rs,
  display_vars = exprs(USUBJID, RSTESTCD, RSCAT, RSSTRESC, VISIT, CA125EFL, CAELEPRE, CA50RED, CANORM2X, CNOTNORM, MOUSEANT)
)
```
At this step, it may be useful to join `ADSL` to your `RS` domain. Only the `ADSL` variables used for derivations are selected at this step.
```r
# select subjects from adsl such that there is one subject without RS data
rs_subjects <- unique(rs$USUBJID)
adsl_subjects <- unique(adsl$USUBJID)
adsl <- filter(
  adsl,
  USUBJID %in% union(rs_subjects, setdiff(adsl_subjects, rs_subjects)[1])
)
```

```r
adsl_vars <- exprs(RANDDT, TRTSDT)
adrs <- derive_vars_merged(
  rs,
  dataset_add = adsl,
  new_vars = adsl_vars,
  by_vars = get_admiral_option("subject_keys")
)
```
The next step is to assign parameter-level values such as `PARAMCD`, `PARAM`, `PARAMN`, `PARCAT1`, etc.
For this, a lookup can be created based on the SDTM `RSCAT`, `RSTESTCD` and `RSEVAL` values to join to the source data.
```r
param_lookup <- tribble(
  ~RSCAT, ~RSTESTCD, ~RSEVAL, ~PARAMCD, ~PARAM, ~PARAMN, ~PARCAT1, ~PARCAT1N, ~PARCAT2, ~PARCAT2N,
  # CA-125
  "CA125", "OVRLRESP", "INVESTIGATOR", "OVRCA125", "CA-125 Overall Response by Investigator", 1, "CA-125", 1, "Investigator", 1,
  # RECIST 1.1
  "RECIST 1.1", "OVRLRESP", "INVESTIGATOR", "OVRR11", "RECIST 1.1 Overall Response by Investigator", 2, "RECIST 1.1", 2, "Investigator", 1,
  # Combined
  "RECIST 1.1 - CA125", "OVRLRESP", "INVESTIGATOR", "OVRR11CA", "Combined Overall Response by Investigator", 3, "Combined", 3, "Investigator", 1
)
```
This lookup can now be joined to the source data. The resulting parameters look like this:

```r
adrs <- derive_vars_merged_lookup(
  adrs,
  dataset_add = param_lookup,
  by_vars = exprs(RSCAT, RSTESTCD, RSEVAL)
)
```

```r
dataset_vignette(
  adrs %>% arrange(!!!get_admiral_option("subject_keys"), PARAMN, RSSEQ),
  display_vars = exprs(USUBJID, VISIT, RSCAT, RSTESTCD, RSEVAL, PARAMCD, PARAM, PARCAT1, PARCAT2)
)
```
`ADT`, `ADTF`, `AVISIT` etc.

If your data collection allows for partial dates, you could apply a company-specific imputation rule at this stage when deriving `ADT`. In this example, a missing day is imputed to the last possible date.
```r
adrs <- adrs %>%
  derive_vars_dt(
    dtc = RSDTC,
    new_vars_prefix = "A",
    highest_imputation = "D",
    date_imputation = "last"
  ) %>%
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ADT)
  ) %>%
  derive_vars_dtm(
    dtc = RSDTC,
    new_vars_prefix = "A",
    highest_imputation = "D",
    date_imputation = "last",
    flag_imputation = "time"
  ) %>%
  mutate(AVISIT = VISIT)
```

```r
dataset_vignette(
  adrs,
  display_vars = exprs(USUBJID, PARAMCD, VISIT, AVISIT, RSDTC, ADT, ADTF, ADY)
)
```
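To make the "last possible date" rule concrete, here is a small base R sketch of what imputing a missing day to the end of the month means. The helper `impute_last_day()` is hypothetical and is not the {admiral} implementation; it only handles complete dates (`YYYY-MM-DD`) and dates with a missing day (`YYYY-MM`).

```r
# Hypothetical helper: impute a missing day to the last day of the month
impute_last_day <- function(dtc) {
  out <- as.Date(rep(NA_character_, length(dtc)))
  complete <- !is.na(dtc) & nchar(dtc) == 10 # "YYYY-MM-DD"
  partial <- !is.na(dtc) & nchar(dtc) == 7   # "YYYY-MM"
  out[complete] <- as.Date(dtc[complete])
  for (i in which(partial)) {
    first <- as.Date(paste0(dtc[i], "-01"))
    # last day of the month = first day of the next month minus one day
    out[i] <- seq(first, by = "month", length.out = 2)[2] - 1
  }
  out
}

imputed <- impute_last_day(c("2020-02", "2021-12", "2021-03-15"))
imputed
```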
`AVALC` and `AVAL`

Since the set of CA-125 response categories is a subset of the RECIST 1.1 response categories and the set of combined response categories overlaps with the set of RECIST 1.1 response categories, we can use the `admiralonco::aval_resp()` function to assign `AVAL` (ordered from best to worst response).
```r
adrs <- adrs %>%
  mutate(
    AVALC = RSSTRESC,
    AVAL = aval_resp(AVALC)
  )
```

```r
dataset_vignette(
  adrs,
  display_vars = exprs(USUBJID, PARAMCD, AVISIT, ADT, AVAL, AVALC)
)
```
Deriving `ANzzFL` is an opportunity to exclude any records that should not contribute to any downstream parameter derivations.

`ANL01FL`

In the below example:
only records with a non-missing `AVAL` on or after the randomization date are considered (the `filter` argument),
one record is flagged per subject (`STUDYID` and `USUBJID`), `PARAMCD` and `ADT`, so that these variables are a unique key among the flagged records,
the worst response is selected (`mode` is being set in the `admiral::derive_var_extreme_flag()`),
the ordering of the responses is defined by the `worst_resp()` function.

```r
worst_resp <- function(arg) {
  case_when(
    arg == "NE" ~ 1,
    arg == "CR" ~ 2,
    arg == "PR" ~ 3,
    arg == "SD" ~ 4,
    arg == "NON-CR/NON-PD" ~ 5,
    arg == "PD" ~ 6,
    TRUE ~ 0
  )
}

adrs <- adrs %>%
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = c(get_admiral_option("subject_keys"), exprs(PARAMCD, ADT)),
      order = exprs(worst_resp(AVALC), RSSEQ),
      new_var = ANL01FL,
      mode = "last"
    ),
    filter = !is.na(AVAL) & ADT >= RANDDT
  )
```
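The idea behind this flag can be sketched with plain {dplyr} on a toy dataset: sort the records within each subject and date by the same worst-response ranking and flag the last one. The data, the lookup `worst_rank` and the object `flagged` are hypothetical; this is an illustration of the logic, not a replacement for `derive_var_extreme_flag()`.

```r
library(dplyr)
library(tibble)

# Same ranking as worst_resp(): higher value = worse response
worst_rank <- c(NE = 1, CR = 2, PR = 3, SD = 4, "NON-CR/NON-PD" = 5, PD = 6)

# Hypothetical records: two assessments on the same date for subject 01
resp <- tribble(
  ~USUBJID, ~ADT,         ~RSSEQ, ~AVALC,
  "01",     "2021-01-10", 1,      "PR",
  "01",     "2021-01-10", 2,      "PD",
  "01",     "2021-02-10", 3,      "SD"
) %>%
  mutate(ADT = as.Date(ADT))

flagged <- resp %>%
  mutate(rank = unname(worst_rank[AVALC])) %>%
  arrange(USUBJID, ADT, rank, RSSEQ) %>%
  group_by(USUBJID, ADT) %>%
  # the last record within subject and date is the worst response
  mutate(ANL01FL = if_else(row_number() == n(), "Y", NA_character_)) %>%
  ungroup()
flagged
```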
`ANL02FL`

To restrict response data up to and including the first reported progressive disease, an `ANL02FL` flag can be created using the {admiral} function `admiral::derive_var_relative_flag()`.
According to GCIG guidelines, assessments should not be considered after the patient has received mouse antibodies, or if there has been medical and/or surgical interference with their peritoneum or pleura during the previous 28 days.
Note: In this vignette, we use the variable `MOUSEANT`, which indicates whether mouse antibodies have been received.
The user can similarly include a variable that indicates whether there has been medical and/or surgical interference with the patient's peritoneum or pleura in the past 28 days.
```r
adrs <- adrs %>%
  derive_var_relative_flag(
    by_vars = c(get_admiral_option("subject_keys"), exprs(PARAMCD)),
    order = exprs(ADT, RSSEQ),
    new_var = ANL02FL,
    condition = (AVALC == "PD" | MOUSEANT == "Y"),
    mode = "first",
    selection = "before",
    inclusive = TRUE
  )
```
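The "up to and including the first event" logic can be illustrated with a cumulative count on a toy dataset. This is a sketch under assumptions: the data are made up, a missing `MOUSEANT` is treated as "not received", and subjects without any PD or mouse antibody record have all of their records flagged.

```r
library(dplyr)
library(tibble)

# Hypothetical records: subject 01 progresses, subject 02 receives mouse antibodies
resp <- tribble(
  ~USUBJID, ~ADT,         ~AVALC, ~MOUSEANT,
  "01",     "2021-01-10", "SD",   NA_character_,
  "01",     "2021-02-10", "PD",   NA_character_,
  "01",     "2021-03-10", "SD",   NA_character_,
  "02",     "2021-01-12", "PR",   NA_character_,
  "02",     "2021-02-12", "SD",   "Y",
  "02",     "2021-03-12", "PR",   NA_character_
) %>%
  mutate(ADT = as.Date(ADT))

flagged <- resp %>%
  arrange(USUBJID, ADT) %>%
  group_by(USUBJID) %>%
  mutate(
    hit = AVALC == "PD" | (!is.na(MOUSEANT) & MOUSEANT == "Y"),
    # flag a record if no earlier record of the subject met the condition
    ANL02FL = if_else(cumsum(lag(hit, default = FALSE)) == 0, "Y", NA_character_)
  ) %>%
  ungroup()
flagged
```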
Patients can only be evaluable for a CA-125 response if they have a pre-treatment sample that is at least twice the upper limit of the reference range and taken within 2 weeks before starting the treatment. In our case, this information is collected only for CA-125 records, while a flag is needed at the patient level.
The CA-125 Response Evaluable Flag can be derived using the `derive_var_merged_exist_flag()` function.
```r
adrs <- adrs %>%
  select(-CA125EFL) %>%
  derive_var_merged_exist_flag(
    dataset_add = adrs,
    by_vars = get_admiral_option("subject_keys"),
    new_var = CA125EFL,
    condition = (CA125EFL == "Y")
  )
```

```r
dataset_vignette(
  adrs,
  display_vars = exprs(USUBJID, AVISIT, PARAMCD, AVALC, ADT, ANL01FL, ANL02FL, CA125EFL)
)
```
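Conceptually, this step spreads a record-level value to all records of the subject. A minimal {dplyr} sketch of that idea on made-up data is shown below; it assumes that subjects without any `"Y"` record should end up with a missing flag.

```r
library(dplyr)
library(tibble)

# Hypothetical records: the flag is only populated on CA-125 records
resp <- tribble(
  ~USUBJID, ~PARAMCD,   ~CA125EFL,
  "01",     "OVRCA125", "Y",
  "01",     "OVRR11",   NA_character_,
  "02",     "OVRCA125", "N",
  "02",     "OVRR11",   NA_character_
)

subject_level <- resp %>%
  group_by(USUBJID) %>%
  # "Y" on every record of a subject with at least one "Y" record
  mutate(CA125EFL = if_else(any(CA125EFL == "Y", na.rm = TRUE), "Y", NA_character_)) %>%
  ungroup()
subject_level
```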
For the next parameter derivations we consider only:
the worst assessment per date on or after randomization (`ANL01FL = "Y"`),
assessments up to and including the first PD or receipt of mouse antibodies (`ANL02FL = "Y"`),
and, for the response parameters, patients evaluable for CA-125 response (`CA125EFL = "Y"`).

```r
# used for derivation of CA-125 PD
ovr_pd <- filter(adrs, PARAMCD == "OVRCA125" & ANL01FL == "Y" & ANL02FL == "Y")

# used for derivation of CA-125 response parameters
ovr_ca125 <- filter(adrs, PARAMCD == "OVRCA125" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")

# used for derivation of unconfirmed best overall response from RECIST 1.1 and confirmed CA-125 together
ovr_ubor <- filter(adrs, PARAMCD == "OVRR11CA" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")

# used for derivation of confirmed best overall response from RECIST 1.1 and confirmed CA-125 together
ovr_r11 <- filter(adrs, PARAMCD == "OVRR11" & CA125EFL == "Y" & ANL01FL == "Y" & ANL02FL == "Y")
```
Now that we have the input records prepared above with any company-specific requirements, we can start to derive new parameter records.
The function `admiral::derive_extreme_records()` can be used to find the date of first CA-125 PD.
```r
adrs <- adrs %>%
  derive_extreme_records(
    dataset_ref = adsl,
    dataset_add = ovr_pd,
    by_vars = get_admiral_option("subject_keys"),
    filter_add = AVALC == "PD",
    order = exprs(ADT),
    mode = "first",
    keep_source_vars = exprs(everything()),
    set_values_to = exprs(
      PARAMCD = "PDCA125",
      PARAM = "CA-125 Disease Progression by Investigator",
      PARAMN = 4,
      PARCAT1 = "CA-125",
      PARCAT1N = 1,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )
```

```r
dataset_vignette(
  adrs %>% arrange(!!!get_admiral_option("subject_keys"), PARAMN, ADT),
  display_vars = exprs(USUBJID, PARAMCD, AVISIT, ADT, AVALC),
  filter = PARAMCD == "PDCA125"
)
```
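The selection of the first PD record can be sketched in plain {dplyr} as follows. The toy data and object names are hypothetical, and the sketch assumes that every subject in the reference dataset should receive a parameter record, with missing values when no PD was reported.

```r
library(dplyr)
library(tibble)

# Hypothetical subject-level and response data
adsl <- tibble(USUBJID = c("01", "02"))
resp <- tribble(
  ~USUBJID, ~ADT,         ~AVALC,
  "01",     "2021-02-01", "SD",
  "01",     "2021-03-01", "PD",
  "01",     "2021-04-01", "PD",
  "02",     "2021-02-05", "PR"
) %>%
  mutate(ADT = as.Date(ADT))

# earliest PD record per subject
first_pd <- resp %>%
  filter(AVALC == "PD") %>%
  group_by(USUBJID) %>%
  slice_min(ADT, n = 1, with_ties = FALSE) %>%
  ungroup()

# one parameter record per subject in the reference dataset
pd_param <- adsl %>%
  left_join(first_pd, by = "USUBJID") %>%
  mutate(PARAMCD = "PDCA125")
pd_param
```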
To obtain additional variables that store the progression category for participants who have progressed, we will create `MCRITy`/`MCRITyML`/`MCRITyMN` variables according to the ADaMIG guidelines.
We assume that the `SUPPRS` dataset contains `QNAM` values that uniquely classify a progression into one of the `A`, `B` or `C` categories as per GCIG criteria.
Having previously transposed and merged the `SUPPRS` dataset, we have a suitable structure for deriving the `MCRIT` variables, since all the variables needed to check the conditions are in one row.
For this purpose {admiral} provides the `derive_vars_cat()` function (see documentation for details).
```r
definition_mcrit <- exprs(
  ~PARAMCD, ~condition, ~MCRIT1ML, ~MCRIT1MN,
  "PDCA125", CAELEPRE == "Y" & CANORM2X == "Y", "Patients with elevated CA-125 before treatment and normalization of CA-125 (A)", 1,
  "PDCA125", CAELEPRE == "Y" & CNOTNORM == "Y", "Patients with elevated CA-125 before treatment, which never normalizes (B)", 2,
  "PDCA125", CAELEPRE == "N" & CANORM2X == "Y", "Patients with CA-125 in the reference range before treatment (C)", 3
)

adrs <- adrs %>%
  mutate(MCRIT1 = if_else(PARAMCD == "PDCA125", "PD Category Group", NA_character_)) %>%
  derive_vars_cat(
    definition = definition_mcrit,
    by_vars = exprs(PARAMCD)
  )
```

```r
dataset_vignette(
  adrs,
  display_vars = exprs(USUBJID, PARAMCD, AVISIT, ADT, AVALC, MCRIT1, MCRIT1ML, MCRIT1MN),
  filter = PARAMCD == "PDCA125"
)
```
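The three conditions can also be read as a simple decision rule on the flags of a single row. The helper `pd_category()` below is hypothetical and only restates the conditions of the definition above with `dplyr::case_when()`; rows that match none of the conditions get a missing category.

```r
library(dplyr)

# Hypothetical helper: PD category letter from the SUPPRS-based flags
pd_category <- function(caelepre, canorm2x, cnotnorm) {
  case_when(
    caelepre == "Y" & canorm2x == "Y" ~ "A",
    caelepre == "Y" & cnotnorm == "Y" ~ "B",
    caelepre == "N" & canorm2x == "Y" ~ "C",
    TRUE ~ NA_character_
  )
}

categories <- pd_category(
  caelepre = c("Y", "Y", "N", "N"),
  canorm2x = c("Y", NA, "Y", NA),
  cnotnorm = c(NA, "Y", NA, NA)
)
categories
```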
Derivation of the progression category may be more complex if the data are collected in a different way and the user needs to check programmatically which of the categories applies.
If the criterion for the PD category draws on multiple rows (different parameters or multiple rows for a single parameter), this will require the creation of a new parameter.
The function `admiral::derive_extreme_event()` can be used to derive the CA-125 Best Confirmed Overall Response Parameter.
Some events, such as `bor_cr` and `bor_pr`, have been defined in {admiralonco}. Missing events specific to GCIG criteria are defined below.
Note: Unlike for RECIST 1.1, `SD` does not require that the response occurs after a protocol-defined number of days.
```r
bor_sd_gcig <- event(
  description = "Define stable disease (SD) for best overall response (BOR)",
  dataset_name = "ovr",
  condition = AVALC == "SD",
  set_values_to = exprs(AVALC = "SD")
)

bor_ne_gcig <- event(
  description = "Define not evaluable (NE) for best overall response (BOR)",
  dataset_name = "ovr",
  condition = AVALC == "NE",
  set_values_to = exprs(AVALC = "NE")
)

adrs <- adrs %>%
  derive_extreme_event(
    by_vars = get_admiral_option("subject_keys"),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr, ADT),
    mode = "first",
    source_datasets = list(
      ovr = ovr_ca125,
      adsl = adsl
    ),
    events = list(
      bor_cr, bor_pr, bor_sd_gcig, bor_pd, bor_ne_gcig, no_data_missing
    ),
    set_values_to = exprs(
      PARAMCD = "CBORCA",
      PARAM = "CA-125 Best Confirmed Overall Response by Investigator",
      PARAMN = 5,
      PARCAT1 = "CA-125",
      PARCAT1N = 1,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      AVAL = aval_resp(AVALC),
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )
```

```r
dataset_vignette(
  adrs,
  display_vars = exprs(USUBJID, AVISIT, PARAMCD, AVALC, ADT),
  filter = PARAMCD == "CBORCA"
)
```
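The event-based selection can be thought of as "take the first record in priority order of the events, then by date". The following {dplyr} sketch mimics that idea on made-up data. The `priority` vector mirrors the order of the events listed above, and it is an assumption of this sketch that subjects without any record end up with `"MISSING"`.

```r
library(dplyr)
library(tibble)

# Hypothetical subject-level and response data
adsl <- tibble(USUBJID = c("01", "02", "03"))
resp <- tribble(
  ~USUBJID, ~ADT,         ~AVALC,
  "01",     "2021-01-10", "SD",
  "01",     "2021-02-10", "PR",
  "01",     "2021-03-10", "PD",
  "02",     "2021-01-12", "NE"
) %>%
  mutate(ADT = as.Date(ADT))

# priority of the events: the first matching event wins
priority <- c("CR", "PR", "SD", "PD", "NE")

bor <- resp %>%
  mutate(event_nr = match(AVALC, priority)) %>%
  filter(!is.na(event_nr)) %>%
  arrange(USUBJID, event_nr, ADT) %>%
  group_by(USUBJID) %>%
  slice(1) %>%
  ungroup()

bor_all <- adsl %>%
  left_join(bor, by = "USUBJID") %>%
  # assumption: subjects without any response record get "MISSING"
  mutate(AVALC = coalesce(AVALC, "MISSING"))
bor_all
```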
For patients who are measurable by both RECIST 1.1 and CA-125, the concept of Combined Best Overall Response is used. Under our assumptions, the RECIST 1.1 response is unconfirmed, and the combined response from the `RS` domain is based on that unconfirmed RECIST 1.1 response.
In this part of the vignette, we derive the Combined Best Unconfirmed Overall Response based on the combined response (`PARAMCD = "OVRR11CA"`) as collected on the CRF.
```r
adrs <- adrs %>%
  derive_extreme_event(
    by_vars = get_admiral_option("subject_keys"),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr, ADT),
    mode = "first",
    source_datasets = list(
      ovr = ovr_ubor,
      adsl = adsl
    ),
    events = list(
      bor_cr, bor_pr, bor_sd_gcig, bor_pd, bor_ne_gcig, no_data_missing
    ),
    set_values_to = exprs(
      PARAMCD = "BORCA11",
      PARAM = "Combined Best Unconfirmed Overall Response by Investigator",
      PARAMN = 6,
      PARCAT1 = "Combined",
      PARCAT1N = 3,
      PARCAT2 = "Investigator",
      PARCAT2N = 1,
      AVAL = aval_resp(AVALC),
      ANL01FL = "Y",
      ANL02FL = "Y"
    )
  )
```

```r
dataset_vignette(
  adrs,
  display_vars = exprs(USUBJID, AVISIT, PARAMCD, AVALC, ADT),
  filter = PARAMCD == "BORCA11"
)
```
For studies where ORR is one of the primary endpoints, best RECIST 1.1 response for CR and PR needs to be confirmed and maintained for at least 28 days. Due to the complexity of the problem, we will not address it in this vignette.
For examples on the additional endpoints, please see Creating ADRS (Including Non-standard Endpoints).