library(knitr) tmpdir <- tempdir() knitr::opts_chunk$set( message = FALSE, # stop document execution if error (not the default) error = FALSE, # stop-on-error fig.align = "center", fig.path = file.path(tmpdir, "./figures_vignette/"), echo = TRUE )
This package clinUtils
contains functionalities
to facilitate the analysis of clinical datasets in R.
library(clinUtils) # packages required for the examples in the vignette library(ggplot2) library(pander) library(htmltools) library(plyr)
Please note that the interactive outputs (table/visualization) in this vignette are only included if Pandoc is available (pre-requisite for R Markdown).
The haven
R package enables to load
SAS datasets in sas7bdat
and xpt
formats.
The function loadDataADaMSDTM
is a wrapper to import multiple
SAS datasets in sas7bdat
or xpt
format at once.
It returns a list of dataset (once by domain).
The variable labels are combined across datasets and available in
a dedicated attribute.
pathExampleDatasets <- list.files( path = system.file("extdata", "cdiscpilot01", "SDTM", package = "clinUtils"), pattern = "*.xpt", full.names = TRUE ) data <- loadDataADaMSDTM(files = pathExampleDatasets) # A list is returned, each separated file is accessible via data$[fileName] pander(head(data$DM, 2)) pander(head(data$LB, 2)) pander(head(data$AE, 2)) # Access labels for all variables labelVars <- attr(data, "labelVars") head(labelVars) # Access label for a particular variable: labelVars["USUBJID"]
To demonstrate the functionalities of this package, a subset of the datasets from the CDISC Pilot 01 study is available in the package as dataset.
# load data data(dataADaMCDISCP01) dataADaM <- dataADaMCDISCP01 names(dataADaM) pander(head(dataADaM$ADSL, 2)) pander(head(dataADaM$ADLBC, 2)) pander(head(dataADaM$ADAE, 2)) # and variable labels labelVarsADaM <- attr(dataADaM, "labelVars") head(labelVarsADaM)
# load data data(dataSDTMCDISCP01) dataSDTM <- dataSDTMCDISCP01 names(dataSDTM) pander(head(dataSDTM$DM, 2)) pander(head(dataSDTM$LB, 2)) pander(head(dataSDTM$AE, 2)) # and variable labels labelVarsSDTM <- attr(dataSDTM, "labelVars") head(labelVarsSDTM)
Typical data stored in SAS contains the label for each variable available in the dataset.
The function getLabelVar
extracts the label for the specified variable(s),
ensuring that the variable code is used if no label is specified.
# variable label is extracted from 'labelVars' getLabelVar(var = "AEDECOD", labelVars = labelVars)
This function also supports extraction of variable labels
from the 'label' attribute of the column, as in the tibble as
returned by the read_sas
/read_xpt
functions from the
haven
package.
In typical basic CDISC dataset (a.k.a BDS: Basic Data Structure),
as laboratory, vital signs datasets, variables
are available to store parameter and parameter code
(PARAM
/PARAMCD
in ADaM).
The function getLabelParamcd
get the label from specific parameter(s) code.
Variables with parameter name and code are based by default on
standard ADaM CDISC parameter variables (PARAM
/PARAMCD
).
# For ADaM dataset getLabelParamcd(paramcd = "CHOL", data = dataADaM$ADLB) getLabelParamcd(paramcd = "BILI", data = dataADaM$ADLB) # For SDTM dataset getLabelParamcd(paramcd = "CHOL", data = dataSDTM$LB, paramcdVar = "LBTESTCD", paramVar = "LBTEST") getLabelParamcd(paramcd = "BILI", data = dataSDTM$LB, paramcdVar = "LBTESTCD", paramVar = "LBTEST")
Palette for typical CDISC-variable(s) are available in the package.
Meaningful colors and symbols for a Normal Reference Range Indicator
CDISC variable (-NRIND
) are extracted via the colorPaletteNRIND
and shapePaletteNRIND
respectively:
print(colorPaletteNRIND) print(shapePaletteNRIND)
plot( x = seq_along(colorPaletteNRIND), col = colorPaletteNRIND, bg = colorPaletteNRIND, pch = shapePaletteNRIND ) text( x = seq_along(colorPaletteNRIND), labels = names(colorPaletteNRIND), pos = 3 ) title("Palette for CDISC normal reference range indicator")
The getPaletteCDISC
function extracts such palette for a specified
variable.
This retains only the categories available in the variable, and ensures that extra symbols are extracted in case non standard categories are available in the data.
dataPlot <- subset(dataSDTM$LB, LBTEST == "Leukocytes") colorPalette <- getPaletteCDISC(x = dataPlot$LBNRIND, var = "NRIND", type = "color") print(colorPalette) shapePalette <- getPaletteCDISC(x = dataPlot$LBNRIND, var = "NRIND", type = "shape") print(shapePalette) # visualize profile over time gg <- ggplot(data = dataPlot) + geom_point(aes(x = LBDY, y = LBSTRESN, color = LBNRIND, fill = LBNRIND, shape = LBNRIND)) + ggtitle("Evolution of Leukocytes actual value over time") print(gg) # use 'standard' symbols/colors # ('limits' is only required if the categories are not already ordered in LBNRIND) gg + scale_color_manual(values = colorPalette, limits = names(colorPalette)) + scale_fill_manual(values = colorPalette, limits = names(colorPalette)) + scale_shape_manual(values = shapePalette, limits = names(colorPalette))
Default palettes for visualizations are included in the package.
Palettes can be extracted based on a variable, or number of elements.
These packages are included for consistency across the entire suite of R packages.
dataPlot <- subset(dataADaM$ADLB, PARAMCD == "CHOL") # extract palettes colorPalette <- getColorPalette(x = dataPlot$USUBJID) shapePalette <- getShapePalette(x = dataPlot$USUBJID) linetypePalette <- getLinetypePalette(x = dataPlot$USUBJID) # create the plot ggplot(data = dataPlot, aes(x = ADY, y = CHG, color = USUBJID)) + geom_point(aes(shape = USUBJID)) + geom_line(aes(linetype = USUBJID, group = USUBJID)) + scale_color_manual(values = colorPalette) + scale_shape_manual(values = shapePalette) + scale_linetype_manual(values = linetypePalette) + labs(x = "Relative day", y = "Change from baseline", title = "Profile plot of cholesterol change from baseline")
Custom palettes can be specified via the palette
parameter.
In R, numbers are by default rounded to the even digit for rounding off a 5.
Numbers can be rounded based on the 'rounding up' strategy for
rounding off a 5 with roundHalfUp
& roundHalfUpTextFormat
.
This is useful when statistics created with the SAS
software
should be reproduced in R.
# round up roundHalfUp(c(0.45, 0.55), 1) # versus R default: round(c(0.45, 0.55), 1)
The getClinDT
function is an utility function, based on the
the DT package, with sensitive
default settings and extra common functionalities of interest
for data in clinical trials, as listing or summary tables of
descriptive statistics.
There are built-in functionalities to expand row variable(s) to display patient-specific information or include bar visualization.
dataTEAE <- subset(dataADaM$ADAE, SAFFL == "Y" & TRTEMFL == "Y") # set column names to labels labelVarsTEAE <- getLabelVar( var = colnames(dataTEAE), labelVars = labelVarsADaM ) colnamesTEAE <- setNames(names(labelVarsTEAE), labelVarsTEAE) dataTEAE <- dataTEAE[order(dataTEAE$AESOC), ]
getClinDT( dataTEAE, colnames = colnamesTEAE, rowGroupVar = c("AESOC"), barVar = "AGE", barRange = c(0, 100), caption = "Listing of treatment-emergent adverse events on the safety analysis set" )
In clinical trials, datasets are typically delivered in successive batches, depending on the patient recruitment and the different milestones of the study.
The changes between successive data deliveries can be compared with the compareTables
function.
# Build example dataset with treatment-emergent adverse events # of multiple batches varsListing <- c("USUBJID", "AESOC", "AEDECOD", "ASTDT", "AESEV", "AEOUT") dataTEAEListing <- dataTEAE[, varsListing] # simulate removal of observations in new batch dataTEAENew <- dataTEAE[-sample.int(n = nrow(dataTEAEListing), size = 3), ] # simulate addition of observations in new batch dataTEAEOld <- dataTEAE[-sample.int(n = nrow(dataTEAEListing), size = 3), ] # simulate change of observations dataTEAEOld[seq_len(2), "AESEV"] <- "SEVERE" refVars <- c("USUBJID", "AESOC", "AEDECOD", "ASTDT") tableComparison <- compareTables( newData = dataTEAENew, oldData = dataTEAEOld, referenceVars = refVars, changeableVars = setdiff(colnames(dataTEAEListing), refVars), # parameters passed to datatable colnames = setNames(names(labelVarsADaM), labelVarsADaM) )
The new, old datasets with change information, or the difference
between datasets are extracted.
See documentation of outputType
parameter for further details.
The table below highlight the differences between the datasets in an interactive table.
tableComparison$`table-comparison-interactive`
Dedicated functions: knitPrintListPlots
and knitPrintListObjects
are available to include a list
of plots/objects in a Rmarkdown
document,
enabling to specify any (knitr) chunk
options and title header for each object.
This function inserts each object in a separated code chunk, such as independent option (as figure dimensions) can be specified for each object.
Please note that this function should be used within
a chunk having the option: results = 'asis'
.
ggplot2
)For example, by default in knitr the options to specify figure dimensions
should be the same for all plots generated from the same chunk (fig.height
/fig.width
).
Plots can be included with different specified dimension with: knitPrintListPlots
.
dataLB <- subset(dataSDTM$LB, LBTESTCD %in% c("ALB", "ALT", "CHOL", "HCT", "KETONES", "PH") ) dataLB$ACTARM <- dataSDTM$DM$ACTARM[match(dataLB$USUBJID, dataSDTM$DM$USUBJID)] # create plots: listPlotsLB <- plyr::dlply(dataLB, "LBCAT", function(data) ggplot(data = data) + geom_histogram(aes(fill = LBNRIND, x = ACTARM), stat = "count", position = "dodge") + facet_wrap(~LBTEST) + theme(axis.text.x = element_text(angle = -45, hjust = 0)) ) # n2mfrow: extract default dimensions for a specified number of plots figDim <- plyr::dlply(dataLB, "LBCAT", function(data) n2mfrow(length(unique(data$LBTESTCD))) ) knitPrintListPlots( plotsList = listPlotsLB, generalLabel = "lab-hist-static", type = "ggplot2", # set caption for each figure fig.cap = paste("Barplot of", tolower(names(listPlotsLB)), "measurements"), # specify different dimensions fig.width = sapply(figDim, "[[", 1) * 2 + 1, # 3 in for each plot + 1 in for the legend fig.height = sapply(figDim, "[[", 2) * 2 + 2, # 3 in for each plot + 2 for x-axis labels # include title before each visualization titles = simpleCap(tolower(names(listPlotsLB))), titleLevel = 4 )
plotly
)A list of interactive figures is created with the plotly
package:
library(plotly) listPlotsInteractiveLB <- sapply(listPlotsLB, function(ggplot) ggplotly(ggplot) %>% partial_bundle() , simplify = FALSE)
htmltools::tagList
A list of interactive figures can be included within a
Rmarkdown document with the tagList
function of the htmltools
package.
tagListArgs <- mapply(list, # section header lapply(names(listPlotsInteractiveLB), htmltools::h4), # interactive plots listPlotsInteractiveLB, SIMPLIFY = FALSE ) tagListArgs <- unlist(tagListArgs, recursive = FALSE) do.call(htmltools::tagList, tagListArgs)
The function knitPrintListPlots
with the type
set to 'plotly' enables to include additionally e.g. a caption or a title.
knitPrintListPlots( plotsList = listPlotsInteractiveLB, generalLabel = "lab-hist-interactive", type = "plotly", # include title before each visualization titles = simpleCap(tolower(names(listPlotsInteractiveLB))), titleLevel = 5 )
flextable
)The flextable
package enables to create highly customizable tables for Word/PowerPoint
format (among others).
The function knitPrintListObjects
enables to include
a list of flextable
objects within a single chunk.
Please note that the following chunk option should be used: results = 'asis'
.
library(flextable) listFtLB <- plyr::dlply(dataLB, "LBCAT", function(dataParcat){ flextable::flextable(data = head(dataParcat)) }) knitPrintListObjects( xList = listFtLB, generalLabel = "lab-listing-ft", titles = simpleCap(tolower(names(listFtLB))), # different alignment for each table ft.align = c("center", "right", "left"), titleLevel = 4 )
Please note that the table alignment option is more visible in a Word output format.
pander(sessionInfo())
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.