The VPC is the most widely used diagnostic tool in pharmacometrics (see e.g. here), and is commonly created using PsN and Xpose, using NONMEM as the simulation engine. The aim of the current R library is to provide an improved tool that is:

Installation

library(devtools)
install_github("ronkeizer/vpc")
library(vpc)
    eval_all <- TRUE
    # library("devtools")
    # install_github("ronkeizer/vpc")

How to use

Functions:

The main arguments to these function are the sim and obs arguments, which specify a simulation dataset and an observation dataset. All other arguments can be used to customize data parsing and visual appearance of the VPC such as stratification and binning. All four functions will return a ggplot2 object.

Examples

Load the library and get the observation data and simulation data. In the first example, we'll use a dataset that's available in R by default (Theophylline) and generate the simulation dataset in R.

    library(dplyr)

    ## Load the theophylline PK dataset
    obs <- Theoph
    colnames(obs) <- c("id", "wt", "dose", "time", "dv")
    obs <- obs %>%
            dplyr::group_by(id) %>%  
            mutate(sex = round(runif(1))) # randomly assign a "sex" covariate
    sim <- sim_data(obs, # the design of the dataset
                    model = function(x) { # the model
                      pk_oral_1cmt (t = x$time, dose=x$dose * x$wt, ka = x$ka, ke = x$ke, cl = x$cl * x$wt)
                    },
                    error = list(additive = 0.1),
                    theta = c(2.774, 0.0718, .0361),                 # parameter values
                    omega_mat = c(0.08854,                           # specified as lower triangle by default;
                                  0.02421, 0.02241,                  # note: assumed that every theta has iiv, set to 0 if no iiv.
                                  0.008069, 0.008639, 0.02862),      
                    par_names = c("ka", "ke", "cl"),                 # link the parameters in the model to the thetas/omegas
                    n = 500)

However, instead we could use observation and simulation data from NONMEM, e.g. (not run):

obs <- read_table_nm("sdtab1")   # an output table with at least ID, TIME, DV
sim <- read_table_nm("simtab1")  # a simulation file with at least ID, TIME, DV

The read_table_nm() function comes with the vpc library and is a fast way to read in output data created from the $TABLE record in NONMEM, including tables with multiple subproblems.

Next, the VPC can simply be created using:

vpc (sim = sim, obs = obs)

Stratified by SEX:

vpc (sim = sim, obs = obs, stratify = c("sex"), facet = "rows")

The vpc functions in this packages allow stratification by 2 variables (horizontal and vertical faceting). If more strata are desired, a new combined variable should be defined that defines the factors.

Note: The output from the vpc() functions are ggplot2-objects, and they can easily be extended and themed using the well-known functions from the ggplot2 package. One cautionary note however is that you should refrain from applying any additional faceting functions after the VPC has been created. In other words, faceting should be done using the vpc() functions (stratify=... argument). Otherwise you will end up with funky (in the best case) or misleading (in the worst case) plots.

Pred-correction, and plotting of data:

vpc (sim = sim, obs = obs, pred_corr = TRUE, show=list(obs_dv = TRUE, obs_ci = TRUE))

With more explicit use of options, and saving the object:

vpc(sim = sim,
           obs = obs,                                   # supply simulation and observation dataframes
           obs_cols = list(
             dv = "dv",                               # these column names are the default,
             idv = "time"),                            #   update these if different.
           sim_cols = list(
             dv = "sdv",
             idv = "time"),
           bins = c(0, 2, 4, 6, 8, 10, 16, 25),             # specify bin separators manually
           stratify = c("sex"),                         # multiple stratifications possible, just supply as vector
           pi = c(0.05, 0.95),                          # prediction interval simulated data to show
           ci = c(0.05, 0.95),                          # confidence intervals to show
           pred_corr = FALSE,                           # perform prediction-correction?
           show = list(obs_dv = TRUE),                              # plot observations?
           facet = "rows",                              # wrap stratifications, or as "row" or "column"
           ylab = "Concentration",
            xlab = "Time (hrs)")

Note: If you imported the data from NONMEM, the VPC function will automatically detect column names from NONMEM, such as ID, TIME, DV. If you simulated data in R or got the data from a different software, you will probably have to change the variable names for the dependent and independent variable, and the individual index.

Using the show argument, you can instruct the vpc function what plot elements to show. The defaults are:

  obs_dv = FALSE        # the observed data
  obs_ci = TRUE         # the confidence interval of the observed data
  obs_median = TRUE     # the median of the observed data
  sim_median = FALSE    # the median of the simulated data
  sim_median_ci = TRUE  # the confidence interval around the median of the simulated data
  pi = FALSE            # the prediction interval quantiles
  pi_ci = TRUE          # the confidence interval around the prediction interval quantiles
  pi_as_area = FALSE    # show the PI as area instead of two lines
  bin_sep = TRUE        # show the bin separaters (as geom_rug)

For more information about these elements of the VPC, and e.g. the difference between confidence intervals and prediction intervals, and why it is usually important to show both, please have a look at e.g. this tutorial about VPCs.

Additionally, colors, fills, transparency (alpha), and linetypes and sizes can be changed easily using the vpc_theme argument and function. More general plot theming can be accomplished by supplying a ggplot2 theme to the ggplot_theme argument.

vpc(sim, obs,
    vpc_theme = new_vpc_theme (list(
       sim_pi_fill = "#aa6666", sim_pi_alpha = 0.15,
       sim_median_fill = "#66aa66", sim_median_alpha = 0.3,
       obs_ci_color = "red", obs_ci_linetype = 'solid',
       bin_separators_color = NA)),
    ggplot_theme = theme_empty
)

Binning

The vpc functions currently provide three options for binning (bins= argument):

Automatic determination of optimal bin number (as described e.g. by Lavielle et al. and Sonehag et al.) will be provided soon.

Censored data

The vpc_cens() function can be used to create a VPC for the probability of left- or right-censoring such as in the case of data ULOQ. There is no need to add a variable to the dataset to flag the censored cases, the function only requires the lloq or uloq. The example below artificially induces an LLOQ of 5 for the above model / dataset, and generates a VPC for the probability of left-censoring.

vpc_cens(sim, obs, lloq = 5)

Categorical data

VPCs can also be made for categorical data. These show the probability of observing a specific discrete outcome (or count), and plot this versus the simulated probability, shown as a confidence interval.

if(file.exists(paste0(system.file(package="vpc"), "/extdata/sdtab45"))) {
  obs_cat <- read_table_nm(file=paste0(system.file(package="vpc"), "/extdata/sdtab45"))
  sim_cat <- read_table_nm(file=paste0(system.file(package="vpc"), "/extdata/simtab45"))
  vpc_cat (sim = sim_cat, obs = obs_cat,
           obs_cols = list(dv = "SMXH"), sim_cols = list(dv = "SMXH"))
}

Time-to-event data

Similar to the VPC for continuous data, the VPC for TTE data requires simulated data. In general, there are two distinct approach to simulate survival data:

An example for time-to-event data is shown below. The datasets are supplied with the vpc library.

data(rtte_obs_nm)
data(rtte_sim_nm)

Treat RTTE as TTE, no stratification:

vpc_tte(sim = rtte_sim_nm,          
        obs = rtte_obs_nm,
        rtte = FALSE,
        sim_cols = list(dv = "dv", idv="t"),
        obs_cols = list(idv = "dt"))

Stratified for covariate and study arm, and binned and smooth:

vpc_tte(sim = rtte_sim_nm,
        obs = rtte_obs_nm,
        stratify = c("sex","drug"),
        rtte = FALSE,
        bins = "kmeans",
        n_bins = 20,
        smooth = TRUE,
        sim_cols = list(dv = "dv", idv="t"),
        obs_cols = list(idv = "dt"))

Stratified for event number (RTTE) and study arm:

vpc_tte(sim = rtte_sim_nm,
        obs = rtte_obs_nm,
#        show = list(obs_cens = FALSE),
        rtte = TRUE, rtte_calc_diff = TRUE, events = c(1:3),
        stratify=c("drug"),
        bins = "time",
        n_bins = 20,
        smooth = TRUE,
        sim_cols = list(dv = "dv", idv="t"),
        obs_cols = list(idv = "t"), verbose=TRUE)

Kaplan-Meier Mean Covariate plots (KMMC)

vpc_tte(sim = rtte_sim_nm,
        obs = rtte_obs_nm,
        rtte = FALSE,
        bins=c(10, 20, 40, 60, 80, 100), # binning is recommended with KMMC
        kmmc = "sex",
        sim_cols = list(dv = "dv", idv="t"),
        obs_cols = list(idv = "dt"))


ronkeizer/vpc documentation built on May 11, 2023, 11:09 p.m.