README.md

Overview

The SimcypConsultancy R package provides tools for analyzing and reporting PBPK data from the Simcyp Simulator. Our goals in making this package are to:

  1. Increase accuracy and efficiency overall
  2. Easily and accurately make informative tables from Simulator output
  3. Create scientifically informative, professional-quality graphs
  4. Perform additional analyses of PBPK data

Important: The SimcypConsultancy R package is separate from the Simcyp R package and serves a different purpose. The SimcypConsultancy package does not interact with the simulator and instead is focused on report-writing tasks.

Installation

You can install the SimcypConsultancy package from GitHub like this:

devtools::install_github(repo = "shirewoman2/Consultancy", 
                         upgrade = "never")

Current version: 3.8.9

NOTE: The SimcypConsultancy package requires that you have tidyverse loaded.

A great place to start for getting help:

launch_package_index()

How to use this document

You do not need to read this document from start to finish to follow it. Instead, please do skip around to only the parts that interest you.

TABLE OF CONTENTS

We’re using example files in this document, but you generally can substitute whatever Simcyp Simulator files you’re working with instead of the examples. For help on any functions, type a question mark followed by the name of the function. For example, here’s how to get help on the ct_plot function:

?ct_plot

Let’s dive into some examples!

Check how simulations were set up, e.g., for QCing

As we mentioned, one of our goals was to increase accuracy. For this reason, several functions in the SimcypConsultancy package are set up for checking or reporting data in an automated fashion.

extractExpDetails example 1: Check on how all the simulations in a given folder were set up

First, we’ll extract data about how the simulations were set up from the simulation Excel results files and from the workspaces themselves.

Details <- extractExpDetails_mult(sim_data_files = NA, 
                                  exp_details = "all")

Now that we’ve got that information pulled into memory, we can save the data in an Excel file to allow for easy comparisons between simulations. Say you want to know some details about how the experiment was set up such as the start and end times of the simulation, what the substrates and inhibitors were and the dosing regimens. You can use the function “annotateDetails” to do this sort of task. Here’s how you’d get the first few rows with information on the trial design:

annotateDetails(Details, simulator_section = "Trial Design") %>%
   head() %>% formatTable_Simcyp(fontsize = 8)

extractExpDetails example 2: Narrow down the data to focus on one aspect of ADME, one particular compound, etc.

How about some information on what parameters were used for setting up absorption but only for the substrate? We’re again only going to show the first few rows of this, which is what “head() %>%” does in the code below.

annotateDetails(Details, 
                simulator_section = "Absorption",
                compoundID = "substrate") %>% 
   head() %>% formatTable_Simcyp(fontsize = 8)

To see all the possible experimental details, please see the file All possible experimental details to use with extractExpDetails.csv (Bold text is a link) or type this into the console:

 view(ExpDetailDefinitions)

For more options and examples, please see the help files for extractExpDetails, extractExpDetails_mult, and annotateDetails and please see Checking-simulation-experimental-details.docx (bold text is a link)

Return to TOC

Timesavers for setting up workspaces

If someone else set up some workspaces that they would like you to run but those workspaces included observed data XML files on OneDrive, the Simulator won’t be able to find that file unless you change the path to include your user name instead of the original person’s user name. Here’s how to quickly change all the paths for all the workspaces in a folder so that the Simulator can find the observed data files with your name in the path:

make_xml_path_mine()

That will permanently change the workspaces, so please make sure that’s what you want.

If you have run some simulations using the workflow tool in the Simulator, it will include a date/time stamp on the output file name. To remove those quickly, try this to remove that stamp from all the files in your current folder:

remove_file_timestamp()

The R Working Group has a few other tricks up our sleeves for changing workspace parameters quickly, but they do require some care since they permanently change workspace files and since only some parameters work well when changed in an automated fashion. If you find yourself needing to change something like the dose, the inhibition parameters, the induction parameters, etc. for a bunch of workspaces and don’t want to spend hours tediously opening, changing, saving, and closing them all manually, please talk to Laura Shireman.

Return to TOC

Make concentration-time plots for one simulation at a time

The function “ct_plot” will allow you to automatically graph almost any concentration-time data present in Simulator output Excel files. While this function has many options, the default settings will generally create decent graphs that comply with Consultancy Team report templates.

ct_plot example 1: Substrate data showing the arithmetic means for each simulated trial

Let’s make a graph of substrate concentration-time data in plasma, including some observed data, and let’s save the output as “My conc-time graph 1.png”.

First, extract the data from the Simulator output file and also from either the observed-data XML overlay file or the Excel PE template file. We’re using the function “extractConcTime” to extract the data and the function “ct_plot” to make the graphs.

LMV <- extractConcTime(sim_data_file = "letermovir MD.xlsx", 
                       obs_data_file = "Observed data files/letermovir obs data.xlsx")

ct_plot(ct_dataframe = LMV, 
        figure_type = "trial means", 
        save_graph = "My conc-time graph example 1.png")

ct_plot example 2: Zoom in on the last two doses

Let’s zoom in on the last two doses – doses 7 and 8 here – and change the figure type to get percentiles instead of trial means. We’ll save this file, too, and this time, let’s save it as a Word file so that we can get figure heading and caption data already filled in and ready for copying and pasting into a report. If you supply the output from the function “extractExpDetails_mult” here, you’ll get a more informative figure heading and caption. Any text that you’ll want to edit when you paste this into a report will be in bold.

ct_plot(ct_dataframe = LMV, 
        time_range = "doses 7 to 8", 
        figure_type = "percentiles", 
        existing_exp_details = Details, 
        save_graph = "My conc-time graph example 2.docx")

ct_plot example 3: Substrate data with vs. without a perpetrator drug

Next, let’s make a graph of substrate concentration-time data in plasma with and without the presence of a compound in the Inhibitor 1 position in the simulation. First, just like with the other graph, we’ll extract the data from our simulation Excel results file and then we’ll actually make the graph.

Let’s change the figure type to show only the mean predicted concentrations. Let’s include a legend, label it for the fact that our perpetrator molecule is an inducer and not an inhibitor (the default), and let’s only make the semi-log plot here. We’ll save the file so that it is 3 inches high and 5 inches wide. (All of these options are described in more detail in the example files in the folder “Concentration-time plots 1 - one sim at a time”.

MDZ <- extractConcTime(sim_data_file = "QD MDZ QD RIF.xlsx", 
                       obs_data_file = "Observed data files/QD MDZ QD RIF fake observed data.xlsx", 
                       existing_exp_details = Details)

ct_plot(ct_dataframe = MDZ, time_range = "last dose", 
        figure_type = "means only", 
        linear_or_log = "log",
        legend_position = "right",
        legend_label = "Inducer",
        save_graph = "My conc-time graph example 3.docx", 
        fig_height = 3, fig_width = 5)

For more options and examples, please see the help file for ct_plot (type ?ct_plot into the console) and see “Concentration-time-plot-examples-1.docx”.

Return to TOC

Make concentration-time plots comparing multiple data sets

You can also make concentration-time plots with multiple data sets overlaid for easy comparisons.

ct_plot_overlay example: Compare substrate concentrations in multiple tissues

First, let’s get some data. We’re using a variation on the function “extractConcTime” called “extractConcTime_mult” that allows us to pull data for multiple files, tissues, and compounds all at once.

CT <- extractConcTime_mult(
   sim_data_files = c(
      "Example simulator output SD MDZ plus BID RTV.xlsx"), 
   tissues = c("plasma", "blood", "unbound plasma"), 
   ct_dataframe = "CT",
   compoundsToExtract = c("substrate", "inhibitor 1"), 
   existing_exp_details = Details)

Next, graph those data, coloring the lines by whether the perpetrator was present and breaking up the graph into small multiples by the tissue and by what compound we’re plotting. Let’s give a sense of the variability of the data by showing transparent bands for the 5th to 95th percentiles. We’ll save this graph as “My overlaid conc-time graph example 1.docx”. Since we’re saving this as a Word file, we’ll again get some figure heading and caption text filled in.

ct_plot_overlay(ct_dataframe = CT, colorBy_column = Inhibitor,
                facet1_column = Tissue, 
                facet2_column = Compound, 
                figure_type = "percentile ribbon",
                color_set = "Set 1",
                save_graph = "My overlaid conc-time graph example 1.docx", 
                fig_width = 6, fig_height = 6)

For more options and examples, please the help file for ct_plot_overlay (type ?ct_plot_overlay into the console) and see “Concentration-time-plot-examples-3.docx”.

Return to TOC

Make graphs of enzyme abundance

You can apply most of the settings from the “ct_plot” function to another function, “enz_plot”, which will automatically make enzyme abundance plots for you. Just as with the ct_plot function, we’re first going to extract the data using a separate function, this one called “extractEnzAbund”.

For this graph, let’s include a legend, color the lines according to whether the perpetrator is present using the colors we want, note in the legend that the perpetrator is an inducer rather than the default inhibitor, and let’s save the output, too.

CYP3A4_liver <- extractEnzAbund(sim_data_file = "Example simulator output - MD MDZ MD EFV.xlsx", 
                                existing_exp_details = Details)

enz_plot(CYP3A4_liver,
         figure_type = "percentile ribbon", 
         line_type = "solid", 
         legend_position = "right", legend_label = "Inducer", 
         line_color = c("dodgerblue3", "darkorchid4"), 
         save_graph = "Enzyme abundance graph example.png", 
         fig_height = 3.5, fig_width = 6)

For more options and examples, please the help file for enz_plot (type ?enz_plot into the console) and see “Enzyme-abundance-plot-examples.docx”.

Return to TOC

Make nice-looking PK summary tables automatically

One of the tasks we would like to automate and make less prone to copy/paste errors is creating tables for reports that summarize simulated PK parameters. The function pk_table does that.

pk_table example 1: Create a table of standard PK parameters for a simulation of a substrate alone

For this, we’ll use a file from a hands-on workshop demonstration with letermovir.

pk_table(
   sim_data_file = "letermovir MD.xlsx", 
   existing_exp_details = Details) %>%
   formatTable_Simcyp()

If you save the output to a Word file, which is what we recommend, you will automatically get a table heading and caption, which will include information about which compound, tissue, and simulation was used to obtain the data, so the table will only include the columns “Statistic”, “AUCtau” and “Cmax”.

pk_table example 2: Adjust which summary stats are included and how they’re formatted

The default setting is to list geometric means and CVs, but you can see arithmetic instead by specifying that with “mean_type”. Additionally, let’s see the range of trial means (includeTrialMeans = TRUE) but not the CVs (includeCV = FALSE).

pk_table(
   sim_data_file = "letermovir MD.xlsx",
   mean_type = "arithmetic",
   includeTrialMeans = TRUE,
   includeCV = FALSE, 
   existing_exp_details = Details, 
   checkDataSource = FALSE) %>% formatTable_Simcyp()

pk_table example 3: Create a table of standard PK parameters for a DDI simulation

If a perpetrator molecule was included, the table will automatically pull parameters sensible for that scenario. This time, let’s additionally ask to check the data source to check where in the simulation Excel results file the data are from. First, here’s the table:

MyTable <- pk_table(
   sim_data_file = "QD MDZ QD RIF.xlsx", 
   existing_exp_details = Details, 
   checkDataSource = TRUE)

MyTable$Table %>% formatTable_Simcyp()

Now, to see where those parameters came from, let’s look at the other item that got included in our output when we said “checkDataSource = TRUE”.

MyTable$QC %>% formatTable_Simcyp(fontsize = 8)

This is a data.frame listing the files, tabs, columns, and rows where the data were found for the purposes of QCing.

For more options and examples, please see the help files for pk_table and see “PK-tables.docx”.

Return to TOC

Compare the results of two simulations

If you are interested in comparing the results from two simulations, e.g., comparing fasted vs. fed or healthy subjects vs. subjects with hepatic impairment, please see the instruction file Calculating-PK-ratios.docx

This section under construction. We’ll add examples here soon.

Return to TOC

Forest plots

There are several options for setting up forest plots; here is one example using example data included in the package: BufForestData_20mg.

forest_plot(forest_dataframe = BufForestData_20mg,
            y_axis_labels = Inhibitor1,
            y_axis_title = "Perpetrator",
            graph_title = "Effects of various DDI perpetrator\ndrugs on bufuralol PK", 
            show_numbers_on_right = TRUE, 
            rel_widths = c(3, 1), 
            save_graph = "Bufuralol forest plot.png", 
            fig_height = 6, fig_width = 7.25)

For more information, please see the instruction file Examples-for-making-forest-plots.docx

Return to TOC

Checking simulation fidelity with simulated vs. observed graphs

One quick way to check how well your model is capturing observed data is to visually compare simulated PK values to observed. The function so_graph will use input from making PK tables (from the function pk_table) to create these graphs, including making Guest-style graphs when the PK parameter in question is a ratio for a drug-drug interaction.

so_graph(PKtable = SOdata, 
         axis_title_y = "Predicted", 
         axis_title_x = "Observed")

For more information, please see the instruction file Examples-for-simulated-vs-observed-graphs.docx

Return to TOC

Fit induction data

The SimcypConsultancy package function inductFit can fit four possible induction models to in vitro induction data:

Equation 1. Emax model

$$ \text{fold induction} = 1 + \frac{E_{max} \times I}{EC_{50} + I} $$

where I is the concentration of the inducer, Emax is the maximum change in fold induction, and EC50 is the concentration of the inducer that elicits half the maximum fold-change in induction.

Use model = "Emax" in the inductFit call to fit this model to your data.

Equation 2. Emax slope model

$$ \text{fold induction} = 1 + E_{max} \times \frac{I^\gamma}{EC_{50}^\gamma + I^\gamma} $$

where I is the concentration of the inducer, Emax is the maximum change in fold induction, EC50 is the concentration of the inducer that elicits half the maximum fold-change in induction, and gamma is the Hill equation coefficient describing the slope.

Syntax for function: model = "EmaxSlope"

Equation 3. Slope model

$$ \text{fold induction} = 1 + I \times n $$

where I is the concentration of the inducer and n is the slope.

Syntax for function: model = "slope"

Equation 4. Sigmoidal three-parameter model

$$ \text{fold induction} = \frac{Ind_{max}}{1 + e^{ \frac{IndC_{50}-I}{\gamma}}} $$

where I is the concentration of the inducer, Indmax is the maximum fold induction, IndC50 is the concentration of the inducer that elicits half the maximum fold induction, and gamma is the Hill equation coefficient describing the slope.

Syntax for function: model = "Sig3Param"

Important note: The sigmoidal three-parameter model is the ONLY model that determines Indmax rather than Emax. The Simcyp Simulator uses Indmax, so please note that Indmax = Emax + 1

inductFit example 1: Fit 4 models to induction data using the default settings

Let’s fit some induction data to each of 4 possible models: Indmax, Indmax slope, slope, or sigmoid 3 parameter. For details on what these are, please see the instructions fitting induction data.

For now, we’ll call on some pre-existing dummy data we made up for the SimcypConsultancy package called IndData.

We’ll make an object “MyIndFits” to store the output and then look at that output one piece at a time.

MyIndFits <- inductFit(IndData, 
                       conc_column = Concentration_uM, 
                       fold_change_column = FoldInduction, 
                       model = "all", 
                       include_fit_stats = FALSE, 
                       rounding = "significant 3")

Now that we have run the function, let’s see the graphs.

MyIndFits$Graph

You can adjust the look of the graphs; please see the help file or the example Word document for examples.

Let’s see the fitted parameters:

MyIndFits$Fit %>% formatTable_Simcyp()

Return to TOC

Compare demographics

If you’re interested in seeing how the demographics of a population vary – either for comparing multiple simulated populations or a simulated population to an observed population – there are a couple of data visualization tools in the package to help you do that. We’ll focus here on the one for comparing multiple simulated populations, and we’ll compare healthy subjects to subjects with varying degrees of liver disease.

First, we’ll extract demographic data from our simulations. All of these included the “Demographic Data” tab in the results, which is where R is looking for demographic information. If you don’t include that in your results, this will not work.

Demog <- extractDemog(sim_data_files = c("mdz-5mg-sd-hv.xlsx", 
                                         "mdz-5mg-sd-cpa.xlsx", 
                                         "mdz-5mg-sd-cpb.xlsx", 
                                         "mdz-5mg-sd-cpc.xlsx"), 
                      demog_dataframe = Demog)

Here’s an example of the kind of visual comparisons we can make, focusing on just two parameters: a boxplot comparing liver weights and a scatter plot of height vs. overall body weight.

demog_plot_sim(demog_dataframe = Demog, 
               colorBy_column = File, 
               color_labels = c("mdz-5mg-sd-hv.xlsx" = "healthy", 
                                "mdz-5mg-sd-cpa.xlsx" = "Child-Pugh A",
                                "mdz-5mg-sd-cpb.xlsx" = "Child-Pugh B",
                                "mdz-5mg-sd-cpc.xlsx" = "Child-Pugh C"), 
               color_set = "greens", 
               demog_parameters = c("LiverWt_g", 
                                    "Height vs weight"), 
               variability_display = "boxplot", 
               ncol = 1, 
               save_graph = "Demographics - hepatic impairment.png", 
               fig_height = 6, fig_width = 5)

Note: We’re working on the legend but having some trouble with it repeating things at the moment.

For more information and examples, please see Examples-for-checking-subject-demographics.docx

Return to TOC

Create a table of model inputs

Under construction. See the help file for make_Simcyp_inputs_table.

Return to TOC

Trial-means plots

Under construction. See the help file for trial_means_plot and see the instruction file “Trial-means plots.docx”.

Example:

ObservedCmax <- data.frame(PKparameter = "Cmax_dose1", 
                           ObsValue = c(20, 22.5, 16), 
                           ObsVariability = c("10 to 29", "18-25", "12-30"), 
                           Study = c("Kolev et al., 2025", 
                                     "Dinh et al., 2023", 
                                     "Thakur et al., 2020"))

trial_means_plot(sim_data_file = "mdz-5mg-sd.xlsx", 
                 color_option = "by study", 
                 color_set = "viridis", 
                 point_shape = 16, 
                 y_axis_limits_lin = c(0, 60), 
                 observed_PK = ObservedCmax,
                 legend_position = "bottom",
                 save_graph = "Cmax trial means plot.png", 
                 fig_height = 5, fig_width = 6)

Return to TOC

Make a simulation directory

Under construction. See the help file for make_simulation_directory.

Example:

make_simulation_directory(existing_exp_details = Details, 
                          save_table = "simulation directory.xlsx")

Return to TOC

Automatically extract and graph sensitivity analysis data

sensitivity_plot example 1: Make an SA graph showing the effect of your independent variable on Cmax

We’ll supply the Excel file with the SA data to the argument “SA_file”, we’ll tell R which dependent variable we want to see (Cmax), give the graph a title (the “\n” bit is how we can insert a manual carriage return), and then specify whether to save the graph by supplying a file name to “save_graph”.

sensitivity_plot(SA_file = "SA on fa - mdz 5mg sd.xlsx",
                 dependent_variable = "Cmax", 
                 graph_title = "Sensitivity analysis:\nfa effect on Cmax", 
                 save_graph = "SA graph.png")

Other dependent variables are also possible; please see the help file for all options.

sensitivity_plot example 2: Make an SA graph with CL instead and also set the x-axis title manually

Here’s the same sensitivity analysis file but looking at oral clearance and setting something manually for the independent variable label on the x axis.

sensitivity_plot(SA_file = "SA on fa - mdz 5mg sd.xlsx",
                 dependent_variable = "CLpo", 
                 ind_var_label = "fa for the substrate")

sensitivity_plot example 3: Make an SA graph of plasma concentrations

And here is an example of plotting plasma concentrations from a sensitivity-analysis file:

sensitivity_plot(SA_file = "SA on fa - mdz 5mg sd.xlsx",
                 dependent_variable = "plasma", 
                 linear_or_log = "both vertical", 
                 time_range = c(0, 12))

Please see the help file for more information by typing ?sensitivity_plot into the console.

Return to TOC

A list of what SimcypConsultancy functions do

Below are descriptions of all the functions in the package, and the functions most users will want to use most frequently are in bold and listed first in each section.

Get help on using the SimcypConsultancy package

Get information about your simulations

PK tables and PK calculations

Make typical PK tables

Make PK comparisons not natively available in the Simulator outputs

Helper functions for PK tables

Calculate PK

Other tables you can make

Concentration-time plots

Make concentration-time plots or variations thereof

Helper functions for concentration-time plots

Make observed concentration-time plots

Observed concentration-time plot helper functions

Forest plots

Fit induction models to in vitro data

Evaluating model fidelity

Examining variability between trials

Comparing demographics

Other data visualizations that can help you analyze your data

Helper functions for graphing

Interact with the Simulator or with workspaces

Read or change workspaces

NB: Most functions that actually change workspaces are only available in the beta version of the package.

Helper functions for manipulating typical SimcypConsultancy objects

Miscellany

Formatting and saving tables

Helper functions that call on the Simcyp package

Getting data from a pdf table

Examining DDIs

Tidbits for making quick calculations

Internal functions you’ll probably never need to use but listed just in case you’ve seen them and are curious about what they do

Final notes

If you encounter a bug, if you would like help using any of these functions, or if there’s an additional feature you would like to have, please contact a member of the R Working Group.



shirewoman2/Consultancy documentation built on Feb. 18, 2025, 10 p.m.