library(knitr) opts_chunk$set( echo = TRUE, results = 'asis', warning = FALSE, error = FALSE, message = FALSE, cache = FALSE, fig.width = 8, fig.height = 7, fig.path = "./figures_vignette/", fig.align = 'center')
This vignette focuses on the visualizations available in the
clinDataReview
package.
We will use example data sets from the clinUtils
package.
If you have doubts on the data format, please check first the vignette on data preprocessing available at: here.
If everything is clear on that side, let's get started!
Please note that the patient profiles and interactive visualizations are only displayed in the vignette if Pandoc is available.
hasPandoc <- rmarkdown::pandoc_available()
library(clinDataReview) library(plotly)
library(clinUtils) data(dataADaMCDISCP01) labelVars <- attr(dataADaMCDISCP01, "labelVars") varsLB <- c( "PARAM", "PARAMCD", "USUBJID", "TRTP", "ADY", "VISITNUM", "VISIT", "LBSTRESN" ) dataLB <- dataADaMCDISCP01$ADLBC[, varsLB] varsAE <- c("USUBJID", "AESOC", "AEDECOD", "ASTDY", "AENDY", "AESEV") dataAE <- dataADaMCDISCP01$ADAE[, varsAE] varsDM <- c("RFSTDTC", "USUBJID") dataDM <- dataADaMCDISCP01$ADSL[, varsDM]
The interactive visualizations of the clinical data package include
functionalities to link a plot to patient-specific report, e.g. patient
profiles created with the patientProfilesVis
package.
Such patient profiles can be created via a config file, with a dedicated
template report available in the clinDataReview
package.
A simple patient profile report for each subject in the example dataset is created below.
Please note that the patient profiles are created and included in
the interactive visualizations only during an interactive session
(via interactive()
) .
# please change to the working directory # (a temporary directory is used for the vignette) dir <- tempdir() # create a directory to store the patient profiles: patientProfilesDir <- file.path(dir, "patientProfiles") dir.create(patientProfilesDir) # get examples of parameters for the report configDir <- system.file("skeleton", "config", package = "clinDataReview") params <- getParamsFromConfig( configDir = configDir, configFile = "config-patientProfiles.yml" ) # create patient profile with only one panel for the demo params$patientProfilesParams <- params$patientProfilesParams[1] # use dataset from the clinUtils package params$pathDataFolder <- system.file("extdata", "cdiscpilot01", "SDTM", package = "clinUtils") # store patient profile in this folder: params$patientProfilePath <- patientProfilesDir # create patient profiles pathTemplatePkg <- clinDataReview::getPathTemplate(params$template) pathTemplateUser <- file.path(dir, basename(pathTemplatePkg)) tmp <- file.copy(from = pathTemplatePkg, to = pathTemplateUser) report <- rmarkdown::render( input = pathTemplateUser, envir = new.env() ) # clean-up unlink(pathTemplateUser);unlink(report)
Please refer to the vignette about reporting for more details on how to set up a config file and use template reports available in the package.
You can directly skip to reporting vignette, which is available here or run in your console the command below.
vignette("clinDataReview-reporting", "clinDataReview")
All the visualizations available in the package are interactive.
Visualization of individual profiles is available via the function
scatterplotClinData
.
To facilitate the exploration of the data, the underlying data
behind each visualization can be included as a table as well
below the plot by setting the parameter table
to TRUE.
Please note that this functionality is not demonstrated in this document to ensure a lightweight vignette in the package.
Subject-specific report (e.g. patient profiles) can be linked to each
profile (pathVar
parameter).
If the user clicks on the 'P' key while hovering on the plot, or click on the specific subject in the attached table, the specific patient profile is opened in a new window in the browser.
labParam <- "ALT" dataPlot <- subset(dataLB, PARAMCD == labParam) visitLab <- with(dataPlot, tapply(ADY, VISIT, median)) names(visitLab) <- sub("-", "\n", names(visitLab)) # link to patient profiles if(interactive()) dataPlot$patientProfilePath <- paste0( "patientProfiles/subjectProfile-", sub("/", "-", dataPlot$USUBJID), ".pdf" ) scatterplotClinData( data = dataPlot, xVar = "ADY", yVar = "LBSTRESN", aesPointVar = list(color = "TRTP", fill = "TRTP"), aesLineVar = list(group = "USUBJID", color = "TRTP"), linePars = list(size = 0.5, alpha = 0.7), hoverVars = c("USUBJID", "VISIT", "ADY", "LBSTRESN", "TRTP"), labelVars = labelVars, xPars = list(breaks = visitLab, labels = names(visitLab)), title = paste("Actual value of", getLabelParamcd( paramcd = labParam, data = dataLB, paramcdVar = "PARAMCD", paramVar = "PARAM" ) ), # include link to patient profiles: pathVar = if(interactive()) "patientProfilePath", subtitle = paste( "Profile plot by subject", "Points are positioned at relative day", "Visits are positioned based on median relative day across subjects.", sep = "\n" ), verbose = TRUE, table = FALSE )
# format data long -> wide format (one column per lab param) dataPlot <- subset(dataLB, PARAMCD %in% c("ALT", "ALB")) dataPlot <- stats::aggregate( LBSTRESN ~ USUBJID + VISIT + VISITNUM + PARAMCD, data = dataPlot, FUN = mean ) dataPlotWide <- stats::reshape( data = dataPlot, timevar = "PARAMCD", idvar = c("USUBJID", "VISIT", "VISITNUM"), direction = "wide" ) colnames(dataPlotWide) <- sub("^LBSTRESN.", "", colnames(dataPlotWide)) # link to patient profiles if(interactive()) dataPlotWide$patientProfilePath <- paste0( "patientProfiles/subjectProfile-", sub("/", "-", dataPlotWide$USUBJID), ".pdf" ) # scatterplot per visit scatterplotClinData( data = dataPlotWide, xVar = "ALT", yVar = "ALB", xLab = getLabelParamcd( paramcd = "ALT", data = dataLB, paramcdVar = "PARAMCD", paramVar = "PARAM" ), yLab = getLabelParamcd( paramcd = "ALB", data = dataLB, paramcdVar = "PARAMCD", paramVar = "PARAM" ), aesPointVar = list(color = "USUBJID", fill = "USUBJID"), facetPars = list(facets = ~ VISIT), labelVars = labelVars, pathVar = if(interactive()) "patientProfilePath", table = FALSE )
dataALT <- subset(dataLB, PARAMCD == "ALT") dataBILI <- subset(dataLB, PARAMCD == "BILI") byVar <- c("USUBJID", "VISIT") dataPlot <- merge( x = dataALT, y = dataBILI[, c(byVar, "LBSTRESN")], by = c("USUBJID", "VISIT"), suffixes = c(".ALT", ".BILI"), all = TRUE ) labelVars[paste0("LBSTRESN.", c("ALT", "BILI"))] <- paste( "Actual value of", getLabelParamcd( paramcd = c("ALT", "BILI"), data = dataLB, paramcdVar = "PARAMCD", paramVar = "PARAM" ) ) # link to patient profiles if(interactive()) dataPlot$patientProfilePath <- paste0( "patientProfiles/subjectProfile-", sub("/", "-", dataPlot$USUBJID), ".pdf" ) # scatterplot per visit scatterplotClinData( data = dataPlot, xVar = "LBSTRESN.ALT", yVar = "LBSTRESN.BILI", xLab = getLabelParamcd( paramcd = "ALT", data = dataLB, paramcdVar = "PARAMCD", paramVar = "PARAM" ), yLab = getLabelParamcd( paramcd = "BILI", data = dataLB, paramcdVar = "PARAMCD", paramVar = "PARAM" ), aesPointVar = list(color = "VISIT", fill = "VISIT"), xTrans = "log10", yTrans = "log10", hoverVars = "USUBJID", labelVars = labelVars, table = FALSE, pathVar = if(interactive()) "patientProfilePath" )
Time-intervals are displayed with the timeProfileIntervalPlot
function:
# link to patient profiles if(interactive()) dataAE$patientProfilePath <- paste0( "patientProfiles/subjectProfile-", sub("/", "-", dataAE$USUBJID), ".pdf" ) timeProfileIntervalPlot( data = dataAE, paramVar = "USUBJID", # time-variables timeStartVar = "ASTDY", timeEndVar = "ASTDY", colorVar = "AESEV", hoverVars = c("USUBJID", "AEDECOD", "ASTDY", "AENDY", "AESEV"), labelVars = labelVars, table = FALSE, pathVar = if(interactive()) "patientProfilePath", tableVars = c("USUBJID", "AEDECOD", "ASTDY", "AENDY", "AESEV") )
By default, empty intervals are represented if the start/end time variables are missing. Missing start/end time can be imputed, or different symbols can be used to represent such cases:
# create variable to indicate status of start/end date dataAE$AESTFLG <- ifelse(is.na(dataAE$ASTDY), "Missing start", "Complete") dataAE$AEENFLG <- ifelse(is.na(dataAE$AENDY), "Missing end", "Complete") shapePalette <- c( `Missing start` = "triangle-left", `Complete` = "square-open", `Missing end` = "triangle-right" ) # 'simple'-imputation: # if start is missing, 'Missing' symbol displayed at end interval dataAE$AESTDYIMP <- with(dataAE, ifelse(is.na(ASTDY), AENDY, ASTDY)) # if end is missing, 'Missing' symbol displayed at start interval dataAE$AEENDYIMP <- with(dataAE, ifelse(is.na(AENDY), ASTDY, AENDY)) timeProfileIntervalPlot( data = dataAE, paramVar = "USUBJID", # time-variables timeStartVar = "AESTDYIMP", timeStartLab = "Start day", timeEndVar = "AEENDYIMP", timeEndLab = "End day", # shape variables timeStartShapeVar = "AESTFLG", timeStartShapeLab = "Status of start date", timeEndShapeVar = "AEENFLG", timeEndShapeLab = "Status of end date", shapePalette = shapePalette, hoverVars = c("USUBJID", "AEDECOD", "AESEV", "ASTDY", "AESTFLG", "AENDY", "AEENFLG"), labelVars = labelVars, table = FALSE, tableVars = c("USUBJID", "AEDECOD", "AESEV", "ASTDY", "AESTFLG", "AENDY", "AEENFLG"), pathVar = if(interactive()) "patientProfilePath" )
Summary statistics can also be visualized with the package, via different types of visualizations: sunburst, treemap and barplot.
These functions take as input a table of summary statistics, especially counts.
Such table can e.g. computed with the inTextSummaryTable
R package (see
corresponding package vignette for more information).
Subject-specific report (e.g. patient profiles) can be linked to each profile
(pathVar
parameter). If the user clicks on the 'P' key while hovering on
the plot, a zip file containing the reports for all corresponding patients is
downloaded. If the attached table is display, each row can be extended to
display the links of the respective patient profile reports.
In this example, counts of adverse events are extracted for each
r labelVars["AESOC"]
and r labelVars["AEDECOD"]
.
Besides the counts of the number of subjects, the paths to the patient profile report for each subgroup are extracted and combined.
# total counts: Safety Analysis Set (patients with start date for the first treatment) dataTotal <- subset(dataDM, RFSTDTC != "") ## patient profiles report if(interactive()){ # add path in data dataAE$patientProfilePath <- paste0( "patientProfiles/subjectProfile-", sub("/", "-", dataAE$USUBJID), ".pdf" ) # add link in data (for attached table) dataAE$patientProfileLink <- with(dataAE, paste0( '<a href="', patientProfilePath, '" target="_blank">', USUBJID, '</a>' ) ) # Specify extra summarizations besides the standard stats # When the data is summarized, # the patient profile path are summarized # as well across patients # (the paths should be collapsed with: ', ') statsExtraPP <- list( statPatientProfilePath = function(data) toString(sort(unique(data$patientProfilePath))), statPatientProfileLink = function(data) toString(sort(unique(data$patientProfileLink))) ) } # get counts (records, subjects, % subjects) + stats with subjects profiles path statsPP <- c( inTextSummaryTable::getStats(type = "count"), if(interactive()) list( patientProfilePath = quote(statPatientProfilePath), patientProfileLink = quote(statPatientProfileLink) ) ) dataAE$AESEV <- factor( dataAE$AESEV, levels = c("MILD", "MODERATE", "SEVERE") ) dataAE$AESEVN <- as.numeric(dataAE$AESEV) # compute adverse event table tableAE <- inTextSummaryTable::computeSummaryStatisticsTable( data = dataAE, rowVar = c("AESOC", "AEDECOD"), dataTotal = dataTotal, labelVars = labelVars, # The total across the variable used for the nodes # should be specified rowVarTotalInclude = c("AESOC", "AEDECOD"), rowOrder = "total", # statistics of interest # include columns with patients stats = statsPP, # add extra 'statistic': concatenate subject IDs statsExtra = if(interactive()) statsExtraPP ) knitr::kable(head(tableAE), caption = paste("Extract of the Adverse Event summary table", "used for the sunburst and barplot visualization" ) )
The sunburstClinData
function visualizes the counts of hierarchical data
in nested circles.
The different groups are visualized from the biggest class (root node) in the center of the visualization to the smallest sub-groups (leaves) on the outside of the circles.
The size of the different segments is relative the respective counts.
dataSunburst <- tableAE dataSunburst$m <- as.numeric(dataSunburst$m) sunburstClinData( data = dataSunburst, vars = c("AESOC", "AEDECOD"), valueVar = "m", valueLab = "Number of adverse events", pathVar = if(interactive()) "patientProfileLink", pathLab = clinUtils::getLabelVar(var = "USUBJID", labelVars = labelVars), table = FALSE, labelVars = labelVars )
A treemap visualizes the counts of the hierarchical data in nested rectangles. The area of each rectangle is proportional to the counts of the respective group.
Note, that a treemap can also be colored accordingly to a meaningful variable. For instance, if we show adverse events, we might color the plot by severity. This can be achieved with the colorVar parameter.
dataTreemap <- tableAE dataTreemap$m <- as.numeric(dataTreemap$m) treemapClinData( data = dataTreemap, vars = c("AESOC", "AEDECOD"), valueVar = "m", valueLab = "Number of adverse events", pathVar = if(interactive()) "patientProfileLink", pathLab = clinUtils::getLabelVar(var = "USUBJID", labelVars = labelVars), table = FALSE, labelVars = labelVars )
A barplot visualizes the counts for one single variable in a specific order.
dataPlot <- subset(tableAE, AEDECOD != "Total") dataPlot$n <- as.numeric(dataPlot$n) # create plot barplotClinData( data = dataPlot, xVar = "AEDECOD", yVar = "n", yLab = "Number of patients with adverse events", textVar = "n", labelVars = labelVars, pathVar = if(interactive()) "patientProfileLink", pathLab = clinUtils::getLabelVar(var = "USUBJID", labelVars = labelVars), table = FALSE )
dataVSDIABP <- subset(dataADaMCDISCP01$ADVS, PARAMCD == "DIABP" & ANL01FL == "Y" & AVISIT %in% c("Baseline", "Week 2", "Week 4", "Week 6", "Week 8") ) # add link to patient profiles report # add path in data if(interactive()){ dataVSDIABP$patientProfilePath <- paste0( "patientProfiles/subjectProfile-", sub("/", "-", dataVSDIABP$USUBJID), ".pdf" ) # add link in data (for attached table) dataVSDIABP$patientProfileLink <- with(dataVSDIABP, paste0( '<a href="', patientProfilePath, '" target="_blank">', USUBJID, '</a>' ) ) }
# Specify extra summarizations besides the standard stats # When the data is summarized, # the patient profile path are summarized # as well across patients # (the paths should be collapsed with: ', ') if(interactive()) statsExtraPP <- list( statPatientProfilePath = function(data) toString(sort(unique(data$patientProfilePath))), statPatientProfileLink = function(data) toString(sort(unique(data$patientProfileLink))) ) # get default counts + stats with subjects profiles path statsPP <- c( inTextSummaryTable::getStats(x = dataVSDIABP$AVAL, type = "summary"), if(interactive()) list( patientProfilePath = quote(statPatientProfilePath), patientProfileLink = quote(statPatientProfileLink) ) ) # compute summary table of actual value summaryTableCont <- inTextSummaryTable::computeSummaryStatisticsTable( data = dataVSDIABP, rowVar = c("AVISIT", "ATPT"), var = "AVAL", labelVars = labelVars, # statistics of interest # for DT output, include columns with patients stats = statsPP, # add extra 'statistic': concatenate subject IDs statsExtra = if(interactive()) statsExtraPP ) knitr::kable(head(summaryTableCont, 1))
dataPlot <- subset(summaryTableCont, !isTotal) errorbarClinData( data = dataPlot, xVar = "AVISIT", colorVar = "ATPT", # use non-rounded statistics for the plot yVar = "statMean", yErrorVar = "statSE", # display rounded stats in the hover + n: number of subjects hoverVars = c("AVISIT", "ATPT", "n", "Mean", "SE"), yLab = "Mean", yErrorLab = "Standard Error", title = "Diastolic Blood Pressure summary profile by actual visit and and analysis timepoint", # include lines connecting the error bars mode = "markers+lines", table = FALSE, labelVars = labelVars, pathVar = if(interactive()) "patientProfileLink", pathLab = clinUtils::getLabelVar(var = "USUBJID", labelVars = labelVars) )
A boxplot visualizes the distribution of a continuous variable of interest versus specific categorical variables.
This visualization doesn't rely on pre-computed statistics, so the continuous variable of interest is directly passed to the functionality.
dataPlot <- subset(dataADaMCDISCP01$ADVS, PARAMCD == "DIABP" & ANL01FL == "Y" & AVISIT %in% c("Baseline", "Week 2", "Week 4", "Week 6", "Week 8") ) # link to patient profiles if(interactive()) dataPlot$patientProfilePath <- paste0( "patientProfiles/subjectProfile-", sub("/", "-", dataPlot$USUBJID), ".pdf" ) boxplotClinData( data = dataPlot, xVar = "AVISIT", yVar = "AVAL", colorVar = "TRTA", facetVar = "ATPT", title = "Diastolic Blood Pressure distribution by actual visit and analysis timepoint", yLab = "Actual value of the Diastolic Blood Pressure parameter (mmHg)", labelVars = labelVars, pathVar = if(interactive()) "patientProfilePath", pathLab = clinUtils::getLabelVar(var = "USUBJID", labelVars = labelVars), table = FALSE )
To include multiple clinical data visualizations (with or without attached
table) in a loop (in the same Rmarkdown chunk), the list of visualizations
should be passed to the knitPrintListObjects
function of the clinUtils
package.
# consider only restricted set of lab parameters dataPlot <- subset(dataLB, PARAMCD %in% c("SODIUM", "K")) # link to patient profiles if(interactive()) dataPlot$patientProfilePath <- paste0( "patientProfiles/subjectProfile-", sub("/", "-", dataPlot$USUBJID), ".pdf" ) # 1) create plot+table for each laboratory parameter: library(plyr) # for ddply plotsLab <- dlply(dataPlot, "PARAMCD", function(dataLBParam){ paramcd <- unique(dataLBParam$PARAMCD) scatterplotClinData( data = dataLBParam, xVar = "ADY", yVar = "LBSTRESN", aesPointVar = list(color = "TRTP", fill = "TRTP"), aesLineVar = list(group = "USUBJID", color = "TRTP"), labelVars = labelVars, title = paste("Actual value of", getLabelParamcd( paramcd = paramcd, data = dataLBParam, paramcdVar = "PARAMCD", paramVar = "PARAM" ) ), # include link to patient profiles: pathVar = if(interactive()) "patientProfilePath", table = FALSE, # important: each plot should have an unique ID! # for unique relationship of interactivity between plot <-> table id = paste("labProfileLoop", paramcd, sep = "-") ) }) # include this output in the report: listLabels <- getLabelParamcd( paramcd = names(plotsLab), data = dataLB, paramcdVar = "PARAMCD", paramVar = "PARAM" ) clinUtils::knitPrintListObjects( xList = plotsLab, titles = listLabels, titleLevel = 4 )
A watermark can be included in any of the visualization of the package, from a specified file, via the watermark
parameter.
In this example, an exploratory watermark is added to the adverse events barplot.
# create a file with a 'EXPLORATORY' watermark file <- tempfile(pattern = "watermark", fileext = ".png") getWatermark(file = file) # create plot barplotClinData( data = subset(tableAE, AEDECOD != "Total"), xVar = "AEDECOD", yVar = "n", yLab = "Number of patients with adverse events", textVar = "n", labelVars = labelVars, # include the watermark watermark = file, table = FALSE )
Palette for the colors and shapes associated with specific variables can be set
for all clinical data visualizations at once by setting the
clinDataReview.colors
and clinDataReview.shapes
options at the start
of the R session.
Please see the clinUtils
package for the default colors and shapes.
# display default palettes colorsDefault <- getOption("clinDataReview.colors") str(colorsDefault) shapesDefault <- getOption("clinDataReview.shapes") str(shapesDefault)
timeProfileIntervalPlot( data = dataAE, paramVar = "USUBJID", # time-variables timeStartVar = "ASTDY", timeEndVar = "AENDY", colorVar = "AESEV", timeStartShapeVar = "AESTFLG", timeEndShapeVar = "AEENFLG", labelVars = labelVars )
The palettes can be set for all visualizations, e.g. at the start of the R session, with:
# change palettes for the entire R session options(clinDataReview.colors = c("gold", "pink", "cyan")) options(clinDataReview.shapes = clinShapes)
In case the palette contains less elements than available in the data, these are replicated.
timeProfileIntervalPlot( data = dataAE, paramVar = "USUBJID", # time-variables timeStartVar = "ASTDY", timeEndVar = "AENDY", colorVar = "AESEV", timeStartShapeVar = "AESTFLG", timeEndShapeVar = "AEENFLG", labelVars = labelVars )
Palettes are reset to the default patient profiles palettes at the start of a new R session, or by setting:
# change palettes for the entire R session options(clinDataReview.colors = colorsDefault) options(clinDataReview.shapes = shapesDefault)
print(sessionInfo())
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.