ct_plot_mult: Make graphs for multiple Simulator output files at once

View source: R/ct_plot_mult.R

Description

This function was designed for making nicely arranged concentration-time graphs from several Simcyp Simulator output files all together or for making multiple files – one for each Simulator file – all at once. Behind the scenes, it uses the function ct_plot to make these graphs, so it will automatically break up your supplied concentration-time data into datasets by a) file, b) compound ID, c) tissue, and d) tissue subtype for any ADAM- or brain-model data.

If you have more than one dataset per file, the graph titles won't necessarily be clear, so please pay attention to what data you're including. If you get unexpected or unclear output, try using ct_plot_overlay to graph your data; it might work better for what you want to show. For detailed instructions and examples, please see the SharePoint file "Simcyp PBPKConsult R Files - Simcyp PBPKConsult R Files/SimcypConsultancy function examples and instructions/Concentration-time plots 2 - multiple plots at once/Concentration-time-plot-examples-2.docx". (Sorry, we are unable to include a link to it here.)

A note on the order of the graphs: This function arranges graphs first by file, then by compound ID, then by tissue, and then by tissue subtype, and all sorting is alphabetical. However, since sorting alphabetically might not be the optimal graph arrangement for your scenario, you can specify the order of the graphs using either the graph_titles argument or, if you're comfortable with setting factors in R, by making any of File, CompoundID, Tissue, and Tissue_subtype factor rather than character data and setting the levels how you wish. If you're unfamiliar with setting factor levels in R and setting graph_titles isn't achieving what you want, please ask a member of the R Working Group for assistance.
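
As a minimal sketch of the factor approach, here is how the File column could be given an explicit order. The toy data.frame below is made up for illustration and only stands in for a real ct_dataframe; the file names are borrowed from the Examples section.

```r
# Toy stand-in for ct_dataframe; only the File column matters here.
ct_toy <- data.frame(
  File = c("mdz-5mg-sd-fa1.xlsx", "mdz-5mg-sd-fa0_4.xlsx",
           "mdz-5mg-sd-fa0_8.xlsx", "mdz-5mg-sd-fa0_6.xlsx"),
  stringsAsFactors = FALSE
)

# Alphabetical order, which is what you would get by default
sort(unique(ct_toy$File))

# Set the order explicitly by making File a factor with chosen levels
ct_toy$File <- factor(ct_toy$File,
                      levels = c("mdz-5mg-sd-fa1.xlsx",
                                 "mdz-5mg-sd-fa0_8.xlsx",
                                 "mdz-5mg-sd-fa0_6.xlsx",
                                 "mdz-5mg-sd-fa0_4.xlsx"))
levels(ct_toy$File)
```

The same pattern should apply to CompoundID, Tissue, and Tissue_subtype if you want to control their order as well.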

Usage

ct_plot_mult(
  ct_dataframe,
  obs_to_sim_assignment = NA,
  graph_arrangement = "all together",
  qc_graph = FALSE,
  existing_exp_details = NA,
  figure_type = "percentiles",
  mean_type = "arithmetic",
  linear_or_log = "semi-log",
  time_range = NA,
  x_axis_interval = NA,
  x_axis_label = NA,
  pad_x_axis = TRUE,
  pad_y_axis = TRUE,
  y_axis_limits_lin = NA,
  y_axis_limits_log = NA,
  y_axis_label = NA,
  conc_units_to_use = NA,
  hline_position = NA,
  hline_style = "red dotted",
  vline_position = NA,
  vline_style = "red dotted",
  legend_position = "none",
  legend_orientation = NA,
  legend_label = NA,
  graph_titles = "none",
  graph_title_size = 14,
  graph_labels = TRUE,
  prettify_compound_names = TRUE,
  name_clinical_study = NA,
  report_progress = FALSE,
  save_graph = NA,
  file_suffix = NA,
  fig_height = 8,
  fig_width = 8,
  ...
)

Arguments

ct_dataframe

the data.frame with multiple sets of concentration-time data

obs_to_sim_assignment

optionally specify which observed files should be compared to which simulator files. If left as NA and what you supplied for ct_dataframe doesn't already specify which observed data go with which simulated file, this will assume that all observed data go with all simulated data. To specify, use a named character vector like this: obs_to_sim_assignment = c("obs data 1.xlsx" = "mdz-5mg-qd.xlsx", "obs data 2.xlsx" = "mdz-5mg-qd-cancer.xlsx"). If one observed file needs to match more than one simulated file but not all the simulated files, you can do that by separating the simulated files with commas inside a single quoted string, e.g., obs_to_sim_assignment = c("obs data 1.xlsx" = "mdz-5mg-qd.xlsx, mdz-5mg-qd-fa08.xlsx", "obs data 2.xlsx" = "mdz-5mg-qd-cancer.xlsx, mdz-5mg-qd-cancer-fa08.xlsx"). Pay close attention to the position of commas and quotes there!
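
Because the quotes and commas are easy to get wrong, it can help to build the vector first and inspect it before passing it along. The sketch below only checks the shape of your input; the splitting step is an illustration, not the package's own parsing code.

```r
# Each observed file is matched to two simulated files, which are
# comma-separated inside a single quoted string.
obs_to_sim <- c(
  "obs data 1.xlsx" = "mdz-5mg-qd.xlsx, mdz-5mg-qd-fa08.xlsx",
  "obs data 2.xlsx" = "mdz-5mg-qd-cancer.xlsx, mdz-5mg-qd-cancer-fa08.xlsx"
)

# Split each value on commas to see the observed-to-simulated pairs
# that this input implies.
pairs <- lapply(obs_to_sim, function(x) trimws(strsplit(x, ",")[[1]]))
str(pairs)
```

If the list shows one long file name where you expected two, a quote is probably in the wrong place.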

graph_arrangement

set how to arrange the graphs. Options are

"all together"

(default) for all graphs being nicely arranged and aligned together,

"separate files"

to make one output file per simulator file, or

"numrows x numcols"

where you replace "numrows" with the number of rows of graphs you'd like and "numcols" with the number of columns. The result will be a single, nicely arranged graph. For example, "2 x 4" will make a set of graphs with 2 rows and 4 columns. This is the same as the option "all together" except with more control.

If you choose "separate files", each Simulator output file will have its own graph file, named to match the Simulator output file name, and you don't need to specify anything for save_graph. (In fact, anything you specify for save_graph will be ignored.)
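
If you know how many graphs you will have and how many columns you want, the "numrows x numcols" string can be built rather than typed. The helper make_arrangement below is hypothetical and not part of the package; it is only a sketch of the arithmetic.

```r
# Hypothetical helper (not part of the package): given the number of graphs
# and the number of columns you want, build a "numrows x numcols" string
# with enough rows to hold every graph.
make_arrangement <- function(n_graphs, n_cols) {
  n_rows <- ceiling(n_graphs / n_cols)
  paste(n_rows, "x", n_cols)
}

make_arrangement(8, 4)
make_arrangement(5, 2)
```

The result could then be supplied as graph_arrangement = make_arrangement(8, 4).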

qc_graph

TRUE or FALSE (default) on whether to create a second copy of the graphical file(s) where the left panel shows the original graphs and the right panel shows information about the file used to get the data and the trial design. This works MUCH faster when you have already used extractExpDetails_mult to get information about how your simulation or simulations were set up and supply that object to the argument existing_exp_details.

existing_exp_details

output from extractExpDetails or extractExpDetails_mult to be used with qc_graph

figure_type

type of figure to plot. Options are:

"trial means"

plots an opaque line for the mean data, lighter lines for the mean of each trial of simulated data, and open circles for the observed data. If a perpetrator was present, lighter dashed lines indicate the mean of each trial of simulated data in the presence of the perpetrator.

"percentiles"

(default) plots an opaque line for the mean data, lighter lines for the 5th and 95th percentiles of the simulated data, and open circles for the observed data. If a perpetrator was present, the default is dashed lines for the data in the presence of the perpetrator.

"percentile ribbon"

plots an opaque line for the mean data, transparent shading for the 5th to 95th percentiles of the simulated data, and open circles for the observed data. If a perpetrator was present, the default is to show the data without the perpetrator in blue and the data in the presence of the perpetrator in red. NOTE: There is a known bug within RStudio that can cause filled semi-transparent areas like you get with the "percentile ribbon" figure type to NOT get graphed for certain versions of RStudio. To get around this, within RStudio, go to Tools -> Global Options -> General -> Graphics and then set "Graphics device: backend" to "AGG". Honestly, this is a better option for higher-quality graphics anyway!

"means only"

plots a black line for the mean data and, if a perpetrator was modeled, a dashed line for the concentration-time data with Inhibitor 1.

"Freddy"

Freddy's favorite style of plot with trial means in light gray, the overall mean in thicker black, the 5th and 95th percentiles in dashed lines, and the observed data in semi-transparent purple-blue points. Graphs with a perpetrator present lose the trial means, and the percentiles switch to solid, gray lines.

mean_type

graph "arithmetic" (default) or "geometric" means or "median" for median concentrations

linear_or_log

the type of graph to be returned. Options: "semi-log" (default), "linear", "both vertical" (graphs are stacked vertically), or "both horizontal" (graphs are side by side).

time_range

time range to show relative to the start of the simulation. Options:

NA

(default) entire time range of data

a start time and end time in hours

only data in that time range, e.g. c(24, 48). Note that there are no quotes around numeric data.

"first dose"

only the time range of the first dose

"last dose"

only the time range of the last dose

"penultimate dose"

only the time range of the 2nd-to-last dose, which can be useful for BID data where the end of the simulation extended past the dosing interval or data when the substrate was dosed BID and the perpetrator was dosed QD

a specific dose number with "dose" or "doses" as the prefix

the time range encompassing the requested doses, e.g., time_range = "dose 3" for the 3rd dose or time_range = "doses 1 to 4" for doses 1 to 4

"all obs" or "all observed" if you feel like spelling it out

Time range will be limited to only times when observed data are present.

"last dose to last observed" or "last obs" for short

Time range will be limited to the start of the last dose until the last observed data point.

"last dose to end" or "last to end" for short

Time range will be limited to the start of the last dose until the end of the simulation.
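
To see how a numeric time_range relates to the dose-based shortcuts, the sketch below works out the window for a given dose number. It assumes the first dose is given at time 0 and that dosing is at a regular interval; dose_window is a hypothetical helper, not a function in the package.

```r
# Hypothetical helper: start and end time (h) of a given dose number,
# assuming the first dose is at t = 0 and the dosing interval is regular.
dose_window <- function(dose_num, interval_h) {
  c((dose_num - 1) * interval_h, dose_num * interval_h)
}

dose_window(3, 24)  # 3rd dose of a QD regimen
dose_window(3, 12)  # 3rd dose of a BID regimen
```

Under those assumptions, time_range = dose_window(3, 24) would be expected to cover about the same window as time_range = "dose 3" for a QD regimen. For irregular regimens, the dose-based options are the safer choice.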

x_axis_interval

optionally set the x-axis major tick-mark interval. Acceptable input: any number or leave as NA to accept default values, which are generally reasonable guesses as to aesthetically pleasing and PK-relevant intervals.

x_axis_label

optionally supply a character vector or an expression to use for the x axis label

pad_x_axis

optionally add a smidge of padding to the x axis (default is TRUE, which includes some generally reasonable padding). If changed to FALSE, the y axis will be placed right at the beginning of your time range and all data will end exactly at the end of the time range specified. If you want a specific amount of x-axis padding, set this to a number; the default is c(0.02, 0.04), which adds 2% more space to the left side and 4% more to the right side of the x axis. If you only specify one number, we'll assume that's the percent you want added to the left side.

pad_y_axis

optionally add a smidge of padding to the y axis (default is TRUE, which includes some generally reasonable padding). As with pad_x_axis, if changed to FALSE, the x axis will be placed right at the bottom of your data, possibly cutting a point in half. If you want a specific amount of y-axis padding, set this to a number; the default is c(0.02, 0), which adds 2% more space to the bottom and nothing to the top of the y axis. If you only specify one number, we'll assume that's the percent you want added to the bottom.
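
To get a feel for how much space the padding values add, here is a small worked example. It assumes the padding is a fraction of the axis range, which is how the percentages above read; pad_limits is a hypothetical helper for illustration only.

```r
# Hypothetical helper: expand axis limits by a fraction of the axis range.
# pad[1] is added below the lower limit and pad[2] above the upper limit.
pad_limits <- function(lims, pad = c(0.02, 0.04)) {
  span <- diff(lims)
  c(lims[1] - pad[1] * span, lims[2] + pad[2] * span)
}

# x axis from 0 to 24 h with the default x padding of c(0.02, 0.04)
pad_limits(c(0, 24))

# y axis from 0 to 100 with the default y padding of c(0.02, 0)
pad_limits(c(0, 100), pad = c(0.02, 0))
```

With a 0 to 24 h time range, 2% on the left is about half an hour of extra space and 4% on the right is about one hour.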

y_axis_limits_lin

optionally set the Y axis limits for the linear plot, e.g., c(10, 1000). If left as the default NA, the Y axis limits for the linear plot will be automatically selected.

y_axis_limits_log

optionally set the Y axis limits for the semi-log plot, e.g., c(10, 1000). Values will be rounded down and up, respectively, to a round number. If left as the default NA, the Y axis limits for the semi-log plot will be automatically selected.

y_axis_label

optionally supply a character vector or an expression to use for the y axis label

conc_units_to_use

concentration units to use for graphs. If left as NA, the concentration units in the source data will be used. Acceptable options are "mg/L", "mg/mL", "µg/L" (or "ug/L"), "µg/mL" (or "ug/mL"), "ng/L", "ng/mL", "µM" (or "uM"), or "nM". If you want to use a molar concentration and your source data were in mass per volume units or vice versa, you'll need to provide something for the argument existing_exp_details.
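
Converting between mass-per-volume and molar units requires the molecular weight of the compound, which is presumably why existing_exp_details is needed in that case. The arithmetic looks like this; the function name and the molecular weight used are for illustration only.

```r
# Convert ng/mL to nM given a molecular weight in g/mol.
# ng/mL is the same as ug/L; ug/L divided by g/mol gives umol/L (uM);
# multiplying by 1000 converts uM to nM.
ng_per_mL_to_nM <- function(conc_ng_per_mL, mw_g_per_mol) {
  conc_ng_per_mL / mw_g_per_mol * 1000
}

mw_example <- 325.77  # approximate MW of midazolam, for illustration only
ng_per_mL_to_nM(10, mw_example)
```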

hline_position

numerical position(s) of any horizontal lines to add to the graph. The default is NA to have no lines, and good syntax if you do want lines would be, for example, hline_position = 10 to have a horizontal line at 10 ng/mL (or whatever your concentration units are) or hline_position = c(10, 100, 1000) to have horizontal lines at each of those y values. Examples of where this might be useful would be to indicate a toxicity threshold, a target Cmin, or the lower limit of quantification for the assay used to generate the concentration-time data.

hline_style

the line color and type to use for any horizontal lines that you add to the graph with hline_position. Default is "red dotted", but any combination of 1) a color in R and 2) a named linetype is acceptable. Examples: "red dotted", "blue dashed", or "#FFBE33 longdash". To see all the possible linetypes, type ggpubr::show_line_types() into the console.

vline_position

numerical position(s) of any vertical lines to add to the graph. The default is NA to have no lines, and good syntax if you do want lines would be, for example, vline_position = 12 to have a vertical line at 12 h or vline_position = seq(from = 0, to = 168, by = 24) to have vertical lines every 24 hours for one week. Examples of where this might be useful would be indicating dosing times or the time at which some other drug was started or stopped.

vline_style

the line color and type to use for any vertical lines that you add to the graph with vline_position. Default is "red dotted", but any combination of 1) a color in R and 2) a named linetype is acceptable. Examples: "red dotted", "blue dashed", or "#FFBE33 longdash". To see all the possible linetypes, type ggpubr::show_line_types() into the console.
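
Before passing line positions and styles to the function, you can check them in the console. The sketch below shows what the seq() call produces and pulls a style string apart into its color and linetype. Splitting on the space is only an illustration of the "color linetype" format, not the package's own code.

```r
# Positions for a vertical line every 24 h for one week
vlines <- seq(from = 0, to = 168, by = 24)
vlines

# A style string is a color followed by a named linetype
style <- "#FFBE33 longdash"
parts <- strsplit(style, " ")[[1]]
line_color <- parts[1]
line_type <- parts[2]

# col2rgb() throws an error if R does not recognize the color
grDevices::col2rgb(line_color)
```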

legend_position

Specify where you want the legend to be. Options are "left", "right", "bottom", "top", or "none" (default) if you don't want one at all. If you include the legend but then some graphs do have a legend and some graphs do not (e.g., some have perpetrators and some do not so there's nothing to put in a legend), the alignment between sets of graphs will be a bit off.

legend_orientation

optionally specify how the legend entries should be oriented. Options are "vertical" or "horizontal", and, if left as NA, the legend entries will be "vertical" when the legend is on the left or right and "horizontal" when it's on the top or bottom.

legend_label

optionally indicate on the legend whether the perpetrator is an inhibitor, inducer, activator, or suppressor. Input will be used as the label in the legend for the line style and the shape. If left as the default NA when a legend is included and a perpetrator is present, the label in the legend will be "Inhibitor".

graph_titles

optionally specify titles to be used in the graphs and specify the order in which the files are graphed or use "none" (default) to have no titles on your graphs. Input should be a named character vector of the files in the order you would like and what you want to use for the title. The file name must perfectly match the file name listed in ct_dataframe or it won't be used. An example of how this might be specified: graph_titles = c("My file 1.xlsx" = "Healthy volunteers", "My file 2.xlsx" = "Mild hepatic impairment"). If you get an order that you didn't think you specified, please double check that you have specified the file names exactly as they appear in ct_dataframe. CAVEAT: If you have more than one dataset per file, this is trickier. However, you can specify titles using the name of the simulator output file, the compound ID, the tissue, and then the ADAM-model subsection (use "none" if that doesn't apply here), each separated with a ".". An example: graph_titles = c("my sim file.xlsx.substrate.plasma.none" = "Midazolam", "my sim file.xlsx.inhibitor 1.plasma.none" = "Ketoconazole"). Please see the "Examples" section for an example with the dataset MDZ_Keto.
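
When there is more than one dataset per file, typing the dot-separated names by hand is error prone. One option is to paste them together from the pieces, as sketched below. The small data.frame here is a made-up stand-in listing the datasets you expect; the column names follow the ones mentioned in the Description.

```r
# Made-up listing of the datasets to be graphed
datasets <- data.frame(
  File = "mdz-qd-keto-qd.xlsx",
  CompoundID = c("substrate", "inhibitor 1"),
  Tissue = "plasma",
  Tissue_subtype = "none",
  stringsAsFactors = FALSE
)

# Build "file.compoundID.tissue.subtype" names
title_names <- paste(datasets$File, datasets$CompoundID,
                     datasets$Tissue, datasets$Tissue_subtype, sep = ".")

# Attach the titles you want, in the order you want the graphs
my_titles <- setNames(c("Midazolam in plasma", "Ketoconazole in plasma"),
                      title_names)
my_titles
```

The resulting named vector could then be supplied as graph_titles = my_titles.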

graph_title_size

the font size for the graph title if it's included; default is 14. This also determines the font size of the graph labels.

graph_labels

TRUE (default) or FALSE for whether to include labels (A, B, C, etc.) for each of the small graphs.

prettify_compound_names

TRUE (default), FALSE or a character vector: This is asking whether to make compound names prettier in legend entries and in any Word output files. This was designed for simulations where the substrate and any metabolites, perpetrators, or perpetrator metabolites are among the standard options for the simulator, and leaving prettify_compound_names = TRUE will make the name of those compounds something more human readable. For example, "SV-Rifampicin-MD" will become "rifampicin", and "Sim-Midazolam" will become "midazolam". Setting this to FALSE will leave the compound names as is. For an approach with more control over what the compound names will look like in legends and Word output, set each compound to the exact name you want with a named character vector. For example, prettify_compound_names = c("Sim-Ketoconazole-400 mg QD" = "ketoconazole", "Wks-Drug ABC-low_ka" = "Drug ABC") will make those compounds "ketoconazole" and "Drug ABC" in a legend.
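
A named character vector works like a lookup table: the names are the compound names as they appear in your data, and the values are what you want shown. A quick way to check your vector before using it is to index it with the original names, as in this sketch (the lookup step is only an illustration, not the package's own code):

```r
pretty_names <- c("Sim-Ketoconazole-400 mg QD" = "ketoconazole",
                  "Wks-Drug ABC-low_ka" = "Drug ABC")

# Compound names as they might appear in the data
compounds <- c("Wks-Drug ABC-low_ka", "Sim-Ketoconazole-400 mg QD")

# Look up the replacement for each compound; an NA here would mean a
# name in the data has no match in pretty_names
unname(pretty_names[compounds])
```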

name_clinical_study

optionally specify the name(s) of the clinical study or studies for any observed data. This only affects the caption of the graph. For example, specifying name_clinical_study = "101, fed cohort" will result in a figure caption that reads in part "clinical study 101, fed cohort". If you have more than one study, that's fine; we'll take care of stringing them together appropriately. Just list them as a character vector, e.g., name_clinical_study = c("101", "102", "103") will become "clinical studies 101, 102, and 103."

report_progress

TRUE or FALSE (default) for whether to show a progress message on creating and saving graphs

save_graph

optionally save the output graph by supplying a file name in quotes here, e.g., "My conc time graph.png" or "My conc time graph.docx". If you leave off ".png" or ".docx", it will be saved as a png file, but if you specify a different graphical file extension, it will be saved as that file format. Acceptable graphical file extensions are "eps", "ps", "jpeg", "jpg", "tiff", "png", "bmp", or "svg". Do not include any slashes, dollar signs, or periods in the file name. Leaving this as NA means the file will not be automatically saved to disk, except when graph_arrangement = "separate files", when anything you specify here will be ignored and it will be saved by file name anyway.
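
If file names are generated in a script, a quick check for the disallowed characters can save a failed run. The helper ok_file_name below is hypothetical and assumes the period before the file extension is the only one allowed, as in the examples above.

```r
# Hypothetical check: TRUE when the part of the file name before the
# extension contains no slashes, backslashes, dollar signs, or periods.
ok_file_name <- function(file_name) {
  base <- tools::file_path_sans_ext(file_name)
  !grepl("[/\\\\$.]", base)
}

ok_file_name("My conc time graph.png")
ok_file_name("My conc. time graph.png")
ok_file_name("graphs/My conc time graph.png")
```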

file_suffix

optionally add a file suffix to explain what each graph is. For example, you might run this function once with figure_type = "means only" and once with figure_type = "percentiles", so you could set file_suffix to, e.g., "means only" for the former and "percentiles" for the latter, and the graph file names would be something like "abc-5mg-sd - means only.png" and "abc-5mg-sd - percentiles.png". This is only used when graph_arrangement = "separate files".
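
The naming pattern described above can be reproduced in a couple of lines, which is handy if you need to predict the output file names in a script. This is only a sketch of the pattern shown in the example; the package's actual naming code may differ in details.

```r
sim_file <- "abc-5mg-sd.xlsx"
suffixes <- c("means only", "percentiles")

# Drop the ".xlsx" extension, then add " - suffix.png"
graph_files <- paste0(sub("\\.xlsx$", "", sim_file), " - ", suffixes, ".png")
graph_files
```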

fig_height

figure height in inches; default is 8

fig_width

figure width in inches; default is 8

...

arguments that pass through to ct_plot

Value

a set of arranged ggplot2 graphs and/or saved files of those graphs

Examples


data(MDZct)
ct_plot_mult(ct_dataframe = MDZct)

ct_plot_mult(ct_dataframe = MDZct,
   graph_titles = c("mdz-5mg-sd-fa1.xlsx" = "fa = 1",
                    "mdz-5mg-sd-fa0_8.xlsx" = "fa = 0.8",
                    "mdz-5mg-sd-fa0_6.xlsx" = "fa = 0.6",
                    "mdz-5mg-sd-fa0_4.xlsx" = "fa = 0.4"))


# Graph titles when you have the tricky situation of more than one
# dataset per file
ct_plot_mult(
    ct_dataframe = MDZ_Keto,
    graph_titles = c("mdz-qd-keto-qd.xlsx.substrate.plasma.none" = "Midazolam in plasma",
                     "mdz-qd-keto-qd.xlsx.substrate.blood.none" = "Midazolam in blood",
                     "mdz-qd-keto-qd.xlsx.inhibitor 1.plasma.none" = "Ketoconazole in plasma",
                     "mdz-qd-keto-qd.xlsx.inhibitor 1.blood.none" = "Ketoconazole in blood"))


shirewoman2/Consultancy documentation built on Feb. 18, 2025, 10 p.m.