```{css echo=FALSE} / Define a margin before h1 element / h1 { margin-top: 3em; } h2 { margin-top: 2em; } h3 { margin-top: 1em; }
```r knitr::opts_chunk$set(warning=FALSE, message=TRUE, comment="#>", dpi=300, out.width="80%")
Before starting, load the ggplot2
and ggcertara
packages, and set the default ggplot
theme to theme_certara()
:
suppressPackageStartupMessages({ library(ggplot2) library(ggcertara) library(kableExtra) #used for the vignette not for ggcertara demonstration }) theme_set(theme_certara())
The ggcertara
package and theme_certara
theme are meant to provide a standardized look to typical gof
plots employed by pharmacometricians. The syntax is based on ggplot2
. The ggcertara
also utilizes the patchwork
package and syntax. These will be touched upon later in this vignette.
The gof()
function uses standard column names in lower case. The following column names are recognized by the functions in this package:
Not all of these need to be present, only those that are needed for the disired plots (for instance, if no plots involving iwres
are requested, then it does not need to be present). The column names must be in lower case.
The gof_read_data()
function simplifies the data loading process by:
mdv
column and, if present, filtering the rows to keep only those with mdv = 0
For this vignette we will be using some sample data that is provided with the package. The data is contained in a file named nmtable.csv
, that was actually generated by NONMEM using a $TABLE
statement (i.e., $TABLE FILE=nmtable.csv FORMAT=,1PE15.8 ...
). It's contents look like this:
knitr::kable(head(nmtable))
In a normal project workflow, the location of the appropriate file would be clear. For the vignette, the file can be located as follows:
file <- system.file(package="ggcertara", "sample-data/nmtable.csv")
Now we load the file:
dat <- gof_read_data(file)
Examining the data, we can see that the names are lower case and the rows have been filtered such that mdv = 0
:
head(dat) table(dat$mdv)
The basic functionality is accessed through the gof()
function:
gof(dat) #Note: output aspect ratio is set to 3:2, in this case width=9, height=6
By default, the gof()
function produces a figure with 6 panels arranged in
a 2-by-3 grid.
dftemp <- data.frame(Column_1=c('DV vs. IPRED', 'DV vs. PRED'), Column_2=c('CWRES vs. TIME','CWRES vs. PRED'), Column_3=c('CWRES vs. TAD', '|IWRES| vs. IPRED')) knitr::kable(dftemp,align = 'c') %>% column_spec(1, width = "15em", background = "white", include_thead = TRUE) %>% column_spec(2, width = "15em", background = "white", include_thead = TRUE) %>% column_spec(3, width = "15em", background = "white", include_thead = TRUE)
Note that all axes are nicely labelled (though they lack units at this point). With this simple function call, we have produced a report ready figure!
The gof()
function can easily produce a variety of panels which can be referenced as either numbered panels, named functions, or as strings.
dftemp <- data.frame(Number=c('1','2','3','4','5','6','7','8','9','10','11, -3','12, -4','13, -5','14, -6','15, -7','16, -8','17, -9','18, -10'), 'Function'=c('gof_cwres_histogram()','gof_cwres_qqplot()','gof_dv_vs_ipred()','gof_dv_vs_pred()','gof_cwres_vs_pred()','gof_cwres_vs_time()','gof_cwres_vs_tad()','gof_absiwres_vs_ipred()','gof_absiwres_vs_time()','gof_absiwres_vs_tad()','gof_dv_vs_ipred(log_xy=T)','gof_dv_vs_pred(log_xy=T)','gof_cwres_vs_pred(log_x=T)','gof_cwres_vs_time(log_x=T)','gof_cwres_vs_tad(log_x=T)','gof_absiwres_vs_ipred(log_x=T)','gof_absiwres_vs_time(log_x=T)','gof_absiwres_vs_tad(log_x=T)'), Name=c('cwres_histogram','cwres_qqplot','dv_vs_ipred','dv_vs_pred','cwres_vs_pred','cwres_vs_time','cwres_vs_tad','absiwres_vs_ipred','absiwres_vs_time','absiwres_vs_tad','dv_vs_ipred_log','dv_vs_pred_log','cwres_vs_pred_log','cwres_vs_time_log','cwres_vs_tad_log','absiwres_vs_ipred_log','absiwres_vs_time_log','absiwres_vs_tad_log'), Description=c('Histogram of CWRES','QQ-plot of CWRES','DV vs. IPRED','DV vs. PRED','CWRES vs. PRED','CWRES vs. TIME','CWRES vs. TAD','|IWRES| vs. IPRED','|IWRES| vs. TIME','|IWRES| vs. TAD','DV vs. IPRED log scale','DV vs. PRED log scale','CWRES vs. PRED log scale','CWRES vs. TIME log scale','CWRES vs. TAD log scale','|IWRES| vs. IPRED log scale','|IWRES| vs. TIME log scale','|IWRES| vs. TAD log scale')) knitr::kable(dftemp,align = 'l') %>% column_spec(1, width = "5em", background = "white", include_thead = TRUE) %>% column_spec(2, width = "15em", background = "white", include_thead = TRUE) %>% column_spec(3, width = "15em", background = "white", include_thead = TRUE) %>% column_spec(4, width = "15em", background = "white", include_thead = TRUE)
Note that the standard 6 plots in the gof()
function correspond to the numbered panels 3, 6, 7, 4, 5, 8 from this list.
panels
option within gof()
Using the panels
argument, select panels can be selected using either the respective panel numbers or names (as strings). For example, to prduce both a histogram and QQ-plot of CWRES, select panels 1 and 2 or callthem by name cwres_histogram
and cwres_qqplot
:
gof(dat, panels=1:2) gof(dat, panels=c('cwres_histogram', 'cwres_qqplot'))
For log-scale plots (panels 11-18 above), we can call their panel number or add a -
to their linear scale counterpart. For example, to obtain a plot of DV vs. IPRED on log-log scale, we can set panel=11
or panel=-3
:
gof(dat, panels=c(11,-3))
We can use the layout
option to specify the number of rows and columns in a multi-panel output.
gof(dat, panels=c(3, -3, 4, -4, 5, 6), layout = c(3, 2))
We can also call specific panels using the corresponding named function. Log variants of the plot can be obtained by specifying either log_x=T
or log_xy=T
depending on the function.
gof_dv_vs_ipred(dat) gof_dv_vs_ipred(dat, log_xy=T)
gof_list
objectIndividual panels can also be with the gof_list()
function. This returns a numbered list of the 6 standard panels by default, or all panels if all=T
is specified:
p <- gof_list(dat, all=T) p
Individual panels can be extracted from the list:
p[[5]]
gof_list
can also be used to build a new list of plots, by specifying empty=T
:
p <- gof_list(empty = T) p[[1]] = gof_dv_vs_ipred(dat) p[[2]] = gof_dv_vs_pred(dat)
A composite plot can be obtained from a list using the gof_layout()
function:
gof_layout(p)
There are several ways to customize the plot axis labels/units. When a recognized variable name is used in a plot (e.g. ipred
, cwres
, etc.), the functions will find a corresponding axis label in a list object which by default corresponds to a global option:
dftemp <- data.frame(Option=c('gof.label.dv','gof.label.pred','gof.label.ipred','gof.label.cwres','gof.label.iwres','gof.label.time','gof.label.tad','gof.units.dv','gof.units.time','gof.units.tad'), Default=c('Observed Value','Population Prediction','Individual Prediction','Conditional Weighted Residual','Individual Weighted Residual','Time','Time After Dose','NA','NA','NA')) knitr::kable(dftemp,align = 'l',) %>% kable_styling(full_width = F) %>% column_spec(1, width = "10em", background = "white", include_thead = TRUE) %>% column_spec(2, width = "15em", background = "white", include_thead = TRUE)
Every function has a labels
argument to override the default, should it be necessary. Also note when updating labels/units, it is not required to specify all labels/units, only the labels/units that are being updated need be specified.
We can update labels and units at the global level by calling the desired option within the options
function. Units for concentration (dv, pred, ipred), time, and tad can also be set using global options and will be appended to the label:
options(gof.label.cwres = 'CWRES', gof.label.dv = 'MyDrug', gof.units.dv="ng/mL", gof.units.time="h", gof.units.tad="h") gof(dat)
We can update a default label for a single plot by calling the label
option as follows.
gof(dat, labels = gof_labels(dv='Obs'))
Finally, the usual labs()
function can be used to set or modify the labels of any plot. In this case, units do not get applied and must be explicitly included in the labels.
gof(dat, 6) + labs(x="Something silly", y="Why not? (light years)")
Now let's restore our defaults before moving on...
# Restore defaults options(gof.label.dv = "Observed Value", gof.label.pred = "Population Prediction", gof.label.ipred = "Individual Prediction", gof.label.cwres = "Conditional Weighted Residual", gof.label.iwres = "Individual Weighted Residual", gof.label.time = "Time", gof.label.tad = "Time After Dose", gof.units.dv=NA, gof.units.time=NA, gof.units.tad=NA )
To highlight outliers using a special symbol (and color), you first define
a column in the dataset which is a factor with 2 levels (be mindful of the
labels, as these will appear in the legend and also note the use of unicode
language for special symbols). The sample data comes from a simulation, and
doesn't have any outliers, but for demonstration purposes we will pretend that
|CWRES| > 2.5
is a criterion for defining outliers:
dat$outlier <- factor(abs(dat$cwres) > 2.5, labels=c("|CWRES| \U{2264} 2.5", "|CWRES| > 2.5"))
Then, use the highlight
option, like this:
gof_cwres_vs_pred(dat, highlight=outlier) + scale_y_continuous(breaks = c(-2.5,0,2.5))
This feature should be used carefully. Note that highlighting values also populates a legend at the bottom, left corner of the plot area. It works for multiple panels too:
gof(dat, highlight=outlier)
Faceting can be applied with facet_wrap
or facet_grid
as with any ggplot
object. For this example, we will augment our data with some covariate information, specifically sex and study information. In the same folder as the dataset nmtable.csv
is a file pkpop.csv
that contains covariate information on PK population.
file <- system.file(package="ggcertara", "sample-data/pkpop.csv") pkpop <- read.csv(file) head(pkpop)
Note that there is one row for each id
in the dataset. We need to link this
to our goodness-of-fit data by matching on the id
column. There are many ways
to do this (merge()
, left_join()
, ...), but a simple way is to use the
basic match()
function:
i <- match(dat$id, pkpop$id) dat$sex <- pkpop$sex[i] dat$study <- pkpop$study[i]
We have now added sex and study to our dataset, but in order to get nice, informative labels, we should convert these to factor
:
dat$sex = factor(dat$sex, levels=0:1, labels=c('Female', 'Male')) dat$study = factor(dat$study, levels=1:3, labels=c('Study A', 'Study B', 'Study C'))
Now, we can produce a plot of DV vs. IPRED faceted by sex as follows:
gof(dat, panels = 3) + facet_wrap(~sex)
Changes that will affect all plots can be made with the update_geom_defaults()
function. For example, to change the alpha level used for points in all panels:
update_geom_defaults("point_c", list(alpha=0.05)) # update the global options gof(dat) update_geom_defaults("point_c", list(alpha=0.4)) # reset to default alpha
Note that a special custom geom, named geom_point_c()
, is defined and used in
this package, so that other ggplot
code will not be affected.
For a single panel, you can just modify the ggplot
object. For example,
let's change the color and symbol of points in a DV vs. PRED plot to be
different by study (which we already added to our dataset earlier):
update_geom_defaults("point_c", list(alpha=0.5, size=2.5)) # make points bigger and more opaque gof(dat, panels=4) + aes(color=study, shape=study) + labs(color=NULL, shape=NULL) + scale_color_certara()
For a patchwork
object (multiple panels), you can do the same thing by
replacing the +
operator with a &
:
gof(dat, panels=3:4) & aes(color=study, shape=study) & labs(color=NULL, shape=NULL) & scale_color_certara() update_geom_defaults("point_c", list(alpha=0.4, size=1.5)) # restore defaults
patchwork
is smart enough to produce a single legend for both panels (note:
the theme_certara()
theme must be set globally, as it was at the beginning of
this script, for this to display properly).
As with any ggplot object, geoms can be added. Remember to use the &
operator instead of +
when creating multiple panels (this notation comes from the patchwork
package).
For examples, we can add a dashed line with corresponding label showing the LLOQ to our DV vs. IPRED and PRED plots as follows:
gof(dat, panels = -(3:4)) & geom_hline(yintercept=1.2, linetype="dashed") & geom_text(data=data.frame(x=Inf, y=1.2), aes(x=x, y=y), label="LLOQ = 1.2 ng/mL", hjust=1.1, vjust=-0.5, size=3)
As another example, we can add a blue linear regression line in addition to the red LOESS smooth to our panels:
gof(dat, panels=8:9) & geom_smooth(method="lm", color=certara_pal()[1], se=F, size=1)
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.