In this vignette we show the general autoslider.core
workflow, how you can create functions that produce study-specific outputs, and how you can integrate them into the autoslider.core
framework.
knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Of course, you need to have the autoslider.core
package installed, and you need to have data available.
In this example I use example data stored in the autoslider.core
package.
The data needs to be stored in a named list where the names should correspond to ADaM data sets.
library(autoslider.core) library(dplyr)
# hidden setup # Install and load the necessary packages library(yaml) # Create the YAML content yaml_content <- ' ITT: title: Intent to Treat Population condition: ITTFL == "Y" target: adsl type: slref SAS: title: Secondary Analysis Set condition: SASFL == "Y" target: adsl type: slref SE: title: Safety Evaluable Population condition: SAFFL== "Y" target: adsl type: slref SER: title: Serious Adverse Events condition: AESER == "Y" target: adae type: anl LBCRP: title: CRP Values condition: PARAMCD == "CRP" target: adlb type: slref LBNOBAS: title: Only Visits After Baseline condition: ABLFL != "Y" & !(AVISIT %in% c("SCREENING", "BASELINE")) target: adlb type: slref ' # Create a temporary YAML file filters <- tempfile(fileext = ".yaml") # Write the YAML content to the temporary file write(yaml_content, file = filters) # Create the specs entry specs_entry <- ' - program: t_ds_slide titles: Patient Disposition ({filter_titles("adsl")}) footnotes: "t_ds footnotes" paper: L6 suffix: ITT - program: t_dm_slide titles: Patient Demographics and Baseline Characteristics footnotes: "t_dm_slide footnote" paper: L6 suffix: ITT args: arm: "TRT01A" vars: ["SEX", "AGE", "RACE", "ETHNIC", "COUNTRY"] - program: lbt06 titles: Patient Disposition ({filter_titles("adsl")}) footnotes: "t_ds footnotes" paper: L6 suffix: ITT_LBCRP_LBNOBAS ' # Create a temporary specs entry file spec_file <- tempfile(fileext = ".yaml") # Write the specs entry to the temporary file write(specs_entry, file = spec_file) # Clean up the temporary files # file.remove(filters) # file.remove(spec_file)
The folder structure could look something like:
├── programs │ ├── run_script.R │ ├── R | | ├── helping_functions.R | | ├── output_functions.R ├── outputs ├── specs.yml ├── filters.yml
The autoslideR
workflow would be implemented in the run_script.R
file.
This workflow does not require the files in R/
.
However, if custom output-creating functions are implemented, R/
would be the place to put them.
The autoslideR
workflow has four main aspects:
specs.yml
This file contains the specifications of all outputs you would like to create.
For each output we define specific information, namely the program name, the footnotes & titles, the paper (this indicates the orientation, P for portrait and L for landscape, the number indicates the font size), the suffix and args
.
It could look something like that:
- program: t_ds_slide titles: Patient Disposition ({filter_titles("adsl")}) footnotes: 't_ds footnotes' paper: L6 suffix: ITT - program: t_dm_slide titles: Patient Demographics and Baseline Characteristics footnotes: 't_dm_slide footnote' paper: L6 suffix: ITT args: arm: "TRT01A" vars: ["SEX", "AGE", "RACE", "ETHNIC", "COUNTRY"]
The program name refers to a function that produces an output.
This could be one of the _slide
functions in autoslider.core
or a custom function.
Titles and footnotes are added once the outputs are created. We refer to that as decorating the outputs.
The suffix specifies the name of the filters that are applied to the data, before the data is funneled into the function (program).
The filters themselves are specified in the filters.yml
file.
filters.yml
In filters.yml
we specify the names of the filters used across the outputs.
Each filter has a name (e.g. FAS
), a title (Full Analysis Set
), and then the filtering condition on a target dataset.
The filter title may be appended to the output title. For the t_ds_slides
slide above all filter titles that target the adsl dataset would be included in the brackets.
We would thus expect the title to read: "Patient Disposition (Full Analysis Set)"
[what is the type?]
As you can see, we don't just have population filters, but also filters on serious adverse events.
We can thus produce SAE tables by just supplying the serious adverse events to the AE table function.
This concept generalizes also to PARAMCD
values.
ITT: title: Intent to Treat Population condition: ITTFL =='Y' target: adsl type: slref SAS: title: Secondary Analysis Set condition: SASFL == 'Y' target: adsl type: slref SE: title: Safety Evaluable Population condition: SAFFL=='Y' target: adsl type: slref SER: title: Serious Adverse Events condition: AESER == 'Y' target: adae type: anl
You can find an overview of all autoslider.core
functions here.
Note that all output-producing functions end with _slide
while the prefix (i.e. t_
, l_
, g_
) specify the type of output (i.e. table, listing, or graph respectively).
Custom functions are needed if the autoslider.core
functions do not cover the outputs you need.
More on that further down.
A typical workflow could look something like this:
# define path to the yml files spec_file <- "spec.yml" filters <- "filters.yml"
library("dplyr") # load all filters filters::load_filters(filters, overwrite = TRUE) # read data data <- list( "adsl" = eg_adsl %>% mutate( FASFL = SAFFL, # add FASFL for illustrative purpose for t_pop_slide # DISTRTFL is needed for t_ds_slide but is missing in example data DISTRTFL = sample(c("Y", "N"), size = length(TRT01A), replace = TRUE, prob = c(.1, .9)) ) %>% preprocess_t_ds(), # this preproccessing is required by one of the autoslider.core functions "adae" = eg_adae, "adtte" = eg_adtte, "adrs" = eg_adrs, "adlb" = eg_adlb ) # create outputs based on the specs and the functions outputs <- spec_file %>% read_spec() %>% # we can also filter for specific programs, if we don't want to create them all filter_spec(., program %in% c( "t_ds_slide", "t_dm_slide" )) %>% # these filtered specs are now piped into the generate_outputs function. # this function also requires the data generate_outputs(datasets = data) %>% # now we decorate based on the specs, i.e. add footnotes and titles decorate_outputs( version_label = NULL )
We can have a look at one of the outputs stored in the outputs file:
outputs$t_dm_slide_ITT
Now we can save it to a slide.
For this example I store the output in a tempfile, you would likely store it in the outputs/
folder.
# Output to slides with template and color theme outputs %>% generate_slides( outfile = tempfile(fileext = ".ppts"), template = file.path(system.file(package = "autoslider.core"), "/theme/basic.pptx"), table_format = autoslider_format )
Unless your requirements are really specific, the most efficient way to write a study function is to base it off of a template function from the TLG catalogue.
The function you would want to create should take as input a list of datasets and potentially additional arguments.
Within the function, you should not worry about filtering the data, as this should be taken care of with the filters.yml
file and the general workflow.
To work properly with autoslider.core
your function should return either:
ggplot2
object for graphsrtables
object for tablesrlistings
object for listingsThat's it!
Now let's see how this works in practice.
A function that works within the autoslideR
workflow should also work on its own.
This makes it straightforward to develop and test.
As an example, lets create a function corresponding to a TLG catalogue output.
We are going create a table based on LBT06 Laboratory Abnormalities by Visit and Baseline Status:
lbt06 <- function(datasets) { # Ensure character variables are converted to factors and empty strings and NAs are explicit missing levels. adsl <- datasets$adsl %>% tern::df_explicit_na() adlb <- datasets$adlb %>% tern::df_explicit_na() # Please note that df_explict_na has a na_level argument defaulting to "<Missing>", # Please don't change the na_level to anything other than NA, empty string or the default "<Missing>". adlb_f <- adlb %>% dplyr::filter(ABLFL != "Y") %>% dplyr::filter(!(AVISIT %in% c("SCREENING", "BASELINE"))) %>% dplyr::mutate(AVISIT = droplevels(AVISIT)) %>% formatters::var_relabel(AVISIT = "Visit") adlb_f_crp <- adlb_f %>% dplyr::filter(PARAMCD == "CRP") # Define the split function split_fun <- rtables::drop_split_levels lyt <- rtables::basic_table(show_colcounts = TRUE) %>% rtables::split_cols_by("ARM") %>% rtables::split_rows_by("AVISIT", split_fun = split_fun, label_pos = "topleft", split_label = formatters::obj_label(adlb_f_crp$AVISIT) ) %>% tern::count_abnormal_by_baseline( "ANRIND", abnormal = c(Low = "LOW", High = "HIGH"), .indent_mods = 4L ) %>% tern::append_varlabels(adlb_f_crp, "ANRIND", indent = 1L) %>% rtables::append_topleft(" Baseline Status") result <- rtables::build_table( lyt = lyt, df = adlb_f_crp, alt_counts_df = adsl ) %>% rtables::trim_rows() result }
Let's see if this works:
lbt06(data)
This works!
To increase code-reusability and to have filtering control centralised in the filters.yml
file, I would recommend to remove most filtering processes from the function, namely the following chunk:
adlb_f <- eg_adlb %>% dplyr::filter(ABLFL != "Y") %>% dplyr::filter(!(AVISIT %in% c("SCREENING", "BASELINE"))) %>% dplyr::mutate(AVISIT = droplevels(AVISIT)) %>% formatters::var_relabel(AVISIT = "Visit") adlb_f_crp <- adlb_f %>% dplyr::filter(PARAMCD == "CRP")
For this, I'll add two separate filters into the filters.yml
file; one to filter the right parameter, and one to take care of the AVISIT and ABLFL.
LBCRP: title: CRP Values condition: PARAMCD == 'CRP' target: adlb type: slref LBNOBAS: title: Only Visits After Baseline condition: ABLFL != "Y" & !(AVISIT %in% c("SCREENING", "BASELINE")) target: adlb type: slref
And the corresponding specs entry:
- program: lbt06 titles: Patient Disposition ({filter_titles("adsl")}) footnotes: 't_ds footnotes' paper: L6 suffix: FAS_LBCRP_LBNOBAS
Now we can rewrite the function. But keep in mind: We are applying a filter to the ADSL data set but create the output on ADLB. To forward the filter to ADLB, we must semi-join the ADSL to ADLB.
lbt06 <- function(datasets) { # Ensure character variables are converted to factors and empty strings and NAs are explicit missing levels. adsl <- datasets$adsl %>% tern::df_explicit_na() adlb <- datasets$adlb %>% tern::df_explicit_na() # join adsl to adlb adlb <- adlb %>% semi_join(adsl, by = "USUBJID") # Please note that df_explict_na has a na_level argument defaulting to "<Missing>", # Please don't change the na_level to anything other than NA, empty string or the default "<Missing>". adlb_f <- adlb %>% dplyr::mutate(AVISIT = droplevels(AVISIT)) %>% formatters::var_relabel(AVISIT = "Visit") # Define the split function split_fun <- rtables::drop_split_levels lyt <- rtables::basic_table(show_colcounts = TRUE) %>% rtables::split_cols_by("ARM") %>% rtables::split_rows_by("AVISIT", split_fun = split_fun, label_pos = "topleft", split_label = formatters::obj_label(adlb_f_crp$AVISIT) ) %>% tern::count_abnormal_by_baseline( "ANRIND", abnormal = c(Low = "LOW", High = "HIGH"), .indent_mods = 4L ) %>% tern::append_varlabels(adlb_f, "ANRIND", indent = 1L) %>% rtables::append_topleft(" Baseline Status") result <- rtables::build_table( lyt = lyt, df = adlb_f, alt_counts_df = adsl ) %>% rtables::trim_rows() result }
lets do a dry-run before we integrate this function into the workflow:
# filter data | this step will be performed by the workflow later on adsl <- eg_adsl adlb <- eg_adlb adlb_f <- adlb %>% dplyr::filter(ABLFL != "Y") %>% dplyr::filter(!(AVISIT %in% c("SCREENING", "BASELINE"))) adlb_f_crp <- adlb_f %>% dplyr::filter(PARAMCD == "CRP") adsl_f <- adsl %>% filter(ITTFL == "Y")
lbt06(list(adsl = adsl_f, adlb = adlb_f_crp))
Looks like it works!
You have to keep in mind that the function you created must be in the global environment when calling the create_outputs
function.
This is the case for all autoslider.core
functions, as you attach the autoslider.core
package (with your library(autoslider.core)
call), so all (exported) function of the autoslider.core
package are available.
If you store your custom function in a separate script, you would need to source that script at some point before calling the function, i.e.:
source("programs/R/output_functions.R")
Now you just have to make sure the two .yml
files are correctly specified.
Set the path to the .yml
files.
filters <- "filters.yml" spec_file <- "specs.yml"
Then load the filters and generate the outputs.
filters::load_filters(filters, overwrite = TRUE) outputs <- spec_file %>% read_spec() %>% generate_outputs(data) %>% decorate_outputs() outputs$lbt06_ITT_LBCRP_LBNOBAS
Once this works, we can finally generate the slides.
filepath <- tempfile(fileext = ".pptx") generate_slides(outputs, outfile = filepath)
Of course, you would not use a temporary file, and you might want to use a custom .pptx
template for your slides.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.