knitr::opts_chunk$set(echo = TRUE, cache = FALSE, message = FALSE, warning = FALSE)
suppressPackageStartupMessages(library(tidyverse)) suppressPackageStartupMessages(library(knitr)) suppressPackageStartupMessages(library(simaerep)) suppressPackageStartupMessages(library(haven))
Typically clinical data is stored in several SAS files in a standardized format. We need the files in which the visits and the AEs are recorded. For this demo we have selected an anonymized data set which only contains patients enrolled into the control arm. In those data sets the AE onset dates and the visit dates have been replaced with the number of days that have passed since a specific cut-off date. We can proceed in a similar way
df_ae <- haven::read_sas('adae.sas7bdat') %>% select(STUDYID, SUBJID, SITEID, AESTDY) df_ae df_vs <- haven::read_sas('advs.sas7bdat') %>% select(STUDYID, SUBJID, SITEID, ADY) df_vs
In order to assign each AE to a visit we union both event tables and sort by date.
df_ae <- df_ae %>% rename(DY = AESTDY) %>% mutate(EVENT = "AE") df_vs <- df_vs %>% rename(DY = ADY) %>% mutate(EVENT = "VS") %>% # we ignore visits that have no date filter(! is.na(DY)) %>% # we are not interested in same day visits distinct() df_aevs <- bind_rows(df_ae, df_vs) %>% # NA's get sorted towards the end thus AEs with no date get sorted towards last visit arrange(STUDYID, SITEID, SUBJID, DY) %>% group_by(STUDYID, SITEID, SUBJID) %>% mutate(AE_NO = cumsum(ifelse(EVENT == "AE", 1, 0)), VS_NO = cumsum(ifelse(EVENT == "VS", 1, 0))) %>% # we remove patients with 0 visits filter(max(VS_NO) > 0) %>% # AE's before fist visit should register to visit 1 not zero mutate(VS_NO = ifelse(VS_NO == 0, 1, VS_NO))
patient example with AE before first visit and AEs with NA in date
df_aevs %>% filter(SUBJID == "01007004") %>% knitr::kable()
Then we aggregate on visit number.
df_aevs_aggr <- df_aevs %>% group_by(STUDYID, SITEID, SUBJID, VS_NO) %>% summarise(MIN_AE_NO = min(AE_NO), MAX_AE_NO = max(AE_NO), .groups = "drop") %>% group_by(STUDYID, SITEID, SUBJID) %>% mutate(MAX_VS_PAT = max(VS_NO)) %>% ungroup() %>% # assign AEs that occur after last visit to last AE mutate( CUM_AE = ifelse( VS_NO == MAX_VS_PAT, MAX_AE_NO, MIN_AE_NO) )
Same patient example as before.
df_aevs_aggr %>% filter(SUBJID == "01007004") %>% knitr::kable()
As a control we check whether the numbers of visits and AEs of our processed data still matches the number of AEs in our original data.
stopifnot(nrow(df_aevs_aggr) == nrow(df_vs)) n_aes <- df_aevs_aggr %>% group_by(SUBJID) %>% summarize(n_aes = max(CUM_AE)) %>% pull(n_aes) %>% sum() n_aes_original <- df_ae %>% # all AEs for patients with more than 1 visit filter(SUBJID %in% df_aevs$SUBJID) %>% nrow() stopifnot(n_aes == n_aes_original)
After renaming some of the columns we can pass the aggregated data from the SAS files to simaerep
df_visit <- df_aevs_aggr %>% rename( study_id = "STUDYID", site_number = "SITEID", patnum = "SUBJID", n_ae = "CUM_AE", visit = "VS_NO" ) %>% select(study_id, site_number, patnum, n_ae, visit) df_visit aerep <- simaerep(df_visit) plot(aerep)
Left panel shows mean AE reporting per site (lightblue and darkblue lines) against mean AE reporting of the entire study (golden line). Single sites are plotted in descending order by AE under-reporting probability on the right panel in which grey lines denote cumulative AE count of single patients. Grey dots in the left panel plot indicate sites that were picked for single plotting. AE under-reporting probability of dark blue lines crossed threshold of 95%. Numbers in the upper left corner indicate the ratio of patients that have been used for the analysis against the total number of patients. Patients that have not been on the study long enough to reach the evaluation point (visit_med75, see introduction) will be ignored.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.