R/enz_plot_overlay.R

#' Overlay multiple data sets onto a single enzyme-abundance graph
#'
#' \code{enz_plot_overlay} is meant to be used in conjunction with
#' \code{\link{extractConcTime_mult}} to create single graphs with overlaid
#' enzyme-abundance data for multiple tissues, enzymes, or Simcyp Simulator
#' output files for easy comparisons.
#'
#' @param sim_enz_dataframe the input enzyme-abundance data generated by running
#'   the function \code{\link{extractConcTime_mult}} or
#'   \code{\link{extractEnzAbund}}. Not quoted.
#' @param mean_type plot "arithmetic" (default) or "geometric" mean enzyme
#'   abundances or "median" enzyme abundances as the main (thickest or only)
#'   line for each data set. If this aggregate measure is not available in the
#'   simulator output, you'll receive a warning message and we'll plot one that
#'   \emph{is} available.
#' @param figure_type the type of figure to plot. \describe{
#'
#'   \item{"means only"}{(default) show only the mean, geometric mean, or median
#'   (whatever you chose for "mean_type")}
#'
#'   \item{"percentiles"}{plots an opaque line for the mean data and lighter
#'   lines for the 5th and 95th percentiles of the simulated data}
#'
#'   \item{"percentile ribbon"}{show an opaque line for the mean data and
#'   transparent shading for the 5th to 95th percentiles. \strong{NOTE: There is
#'   a known bug within RStudio that can cause filled semi-transparent areas
#'   like you get with the "percentile ribbon" figure type to NOT get graphed
#'   for certain versions of RStudio.} To get around this, within RStudio, go to
#'   Tools --> Global Options --> General --> Graphics --> And then set
#'   "Graphics device: backend" to "AGG". Honestly, this is a better option for
#'   higher-quality graphics anyway!}
#'
#'   \item{"trial means"}{plots an opaque line for the mean data, lighter lines
#'   for the mean of each trial of simulated data, and open circles for the
#'   observed data. If a perpetrator was present, lighter dashed lines indicate
#'   the mean of each trial of simulated data in the presence of the perpetrator.}}
#'
#' @param linear_or_log the type of graph to be returned. Options: \describe{
#'   \item{"semi-log"}{y axis is log transformed}
#'
#'   \item{"linear"}{no axis transformation; this is the default}
#'
#'   \item{"both vertical"}{both the linear and the semi-log graphs will be
#'   returned, and graphs are stacked vertically}
#'
#'   \item{"both horizontal"}{both the linear and the semi-log graphs will be
#'   returned, and graphs are placed side by side}}
#' @param colorBy_column (optional) the column in \code{sim_enz_dataframe} that
#'   should be used for determining the color of the lines and/or points.
#'   This should be unquoted, e.g., \code{colorBy_column = Tissue}.
#' @param color_labels optionally specify a character vector for how you'd like
#'   the labels for whatever you choose for \code{colorBy_column} to show up in
#'   the legend. For example, use \code{c("file 1.xlsx" = "fa 0.5", "file
#'   2.xlsx" = "fa 0.2")} to indicate that "file 1.xlsx" is for an fa of 0.5 and
#'   "file 2.xlsx" is for an fa of 0.2. The order in the legend will match the
#'   order designated here.
#' @param legend_label_color optionally indicate on the legend something
#'   explanatory about what the colors represent. For example, if
#'   \code{colorBy_column = File} and \code{legend_label_color = "Simulations
#'   with various fa values"}, that will make the label above the file names in
#'   the legend more explanatory than just "File". The default is to use
#'   whatever the column name is for \code{colorBy_column}. If you don't want a
#'   label for this legend item, set this to "none".
#' @param color_set the set of colors to use. Options: \describe{
#'
#'   \item{"default"}{a set of colors from Cynthia Brewer et al. from Penn State
#'   that are friendly to those with red-green colorblindness. The first three
#'   colors are green, orange, and purple. This can also be referred to as
#'   "Brewer set 2". If there are only two unique values in the colorBy_column,
#'   then Brewer set 1 will be used since red and blue are still easily
#'   distinguishable but also more aesthetically pleasing than green and
#'   orange.}
#'
#'   \item{"Brewer set 1"}{colors selected from the Brewer palette "set 1". The
#'   first three colors are red, blue, and green.}
#'
#'   \item{"ggplot2 default"}{the default set of colors used in ggplot2 graphs
#'   (ggplot2 is an R package for graphing.)}
#'
#'   \item{"rainbow"}{colors selected from a rainbow palette. The default
#'   palette is limited to something like 6 colors, so if you have more than
#'   that, that's when this palette is most useful. It's \emph{not} very useful
#'   when you only need a couple of colors.}
#'
#'   \item{"blue-green"}{a set of blues fading into greens. This palette can be
#'   especially useful if you are comparing a systematic change in some
#'   continuous variable -- for example, increasing dose or predicting how a
#'   change in intrinsic solubility will affect concentration-time profiles --
#'   because the direction of the trend will be clear.}
#'
#'   \item{"blues"}{a set of blues fading from sky to navy. Like
#'   "blue-green", this palette can be especially useful if you are comparing a
#'   systematic change in some continuous variable.}
#'
#'   \item{"greens"}{a set of greens fading from chartreuse to forest. Like
#'   "blue-green", this palette can be especially useful if you are comparing a
#'   systematic change in some continuous variable.}
#'
#'   \item{"purples"}{a set of purples fading from lavender to aubergine. Like
#'   "blue-green", this palette can be especially useful if you are comparing a
#'   systematic change in some continuous variable.}
#'
#'   \item{"Tableau"}{uses the standard Tableau palette; requires the "ggthemes"
#'   package}
#'
#'   \item{"viridis"}{from the eponymous package by Simon Garnier and ranges
#'   colors from purple to blue to green to yellow in a manner that is
#'   "printer-friendly, perceptually uniform and easy to read by those with
#'   colorblindness", according to the package author}
#'
#'   \item{a character vector of colors}{If you'd prefer to set all the colors
#'   yourself to \emph{exactly} the colors you want, you can specify those
#'   colors here. An example of how the syntax should look: \code{color_set =
#'   c("dodgerblue3", "purple", "#D8212D")} or, if you want to specify exactly
#'   which item in \code{colorBy_column} gets which color, you can supply a
#'   named vector. For example, if you're coloring the lines by the compound ID,
#'   you could do this: \code{color_set = c("substrate" = "dodgerblue3",
#'   "inhibitor 1" = "purple", "primary metabolite 1" = "#D8212D")}. If you'd
#'   like help creating a specific gradation of colors, please talk to a member
#'   of the R Working Group about how to do that using
#'   \link{colorRampPalette}.}}
#'
#' @param linetype_column the column in \code{sim_enz_dataframe} that should be
#'   used for determining the line types. For example, if \code{linetype_column}
#'   is set to \code{Inhibitor}, then the default is to show a solid line for no
#'   inhibitor being present and then a dashed line when the inhibitor \emph{is}
#'   present. You can set which types of lines to use with the argument
#'   \code{linetypes}.
#' @param linetypes the line types to use. Default is \code{c("solid",
#'   "dashed")}.
#'   You'll need one line type for each possible value in the column you
#'   specified for \code{linetype_column}. If you get a graph you didn't expect
#'   as far as line types go, try checking what all the possible values are for
#'   the column you specified for \code{linetype_column}. You can do this by
#'   checking, e.g., \code{unique(CT$Inhibitor)} if your sim_enz_dataframe was
#'   named "CT" and the column you set for \code{linetype_column} was
#'   "Inhibitor". To see possible line types by name, please enter
#'   \code{ggpubr::show_line_types()} into the console.
#' @param line_width optionally specify how thick to make the lines. Acceptable
#'   input is a number; the default is 1 for most lines and 0.8 for some, to
#'   give you an idea of where to start.
#' @param legend_label_linetype optionally indicate on the legend something
#'   explanatory about what the line types represent. For example, if
#'   \code{linetype_column = Inhibitor} and \code{legend_label_linetype =
#'   "Inhibitor present"}, that will make the label in the legend above, e.g.,
#'   "none", and whatever perpetrator was present more explanatory than just
#'   "Inhibitor". The default is to use whatever the column name is for
#'   \code{linetype_column}. If you don't want a label for this legend item, set
#'   this to "none".
#' @param facet1_column optionally break up the graph into small multiples; this
#'   specifies the first of up to two columns to break up the data by, and the
#'   designated column name should be unquoted, e.g., \code{facet1_column =
#'   Tissue}. If \code{floating_facet_scale} is FALSE and you haven't specified
#'   \code{facet_ncol} or \code{facet_nrow}, then \code{facet1_column} will
#'   designate the rows of the output graphs.
#' @param facet2_column optionally break up the graph into small multiples; this
#'   specifies the second of up to two columns to break up the data by, and the
#'   designated column name should be unquoted, e.g., \code{facet2_column =
#'   CompoundID}. If \code{floating_facet_scale} is FALSE and you haven't
#'   specified \code{facet_ncol} or \code{facet_nrow}, then
#'   \code{facet2_column} will designate the columns of the output graphs.
#' @param facet1_title optionally specify a title to describe facet 1. This is
#'   ignored if \code{floating_facet_scale} is TRUE or if you have specified
#'   \code{facet_ncol} or \code{facet_nrow}.
#' @param facet2_title optionally specify a title to describe facet 2. This is
#'   ignored if \code{floating_facet_scale} is TRUE or if you have specified
#'   \code{facet_ncol} or \code{facet_nrow}.
#' @param facet_ncol optionally specify the number of columns of facetted graphs
#'   you would like to have. This only applies when you have specified a column
#'   for \code{facet1_column} and/or \code{facet2_column}.
#' @param facet_nrow optionally specify the number of rows of facetted graphs
#'   you would like to have. This only applies when you have specified a column
#'   for \code{facet1_column} and/or \code{facet2_column}.
#' @param floating_facet_scale TRUE, FALSE (default), "x", "y", or "xy" for
#'   whether to allow the axes for each facet of a multi-facetted graph to scale
#'   freely to best fit whatever data are present. Default is FALSE, which means
#'   that all data will be on the same scale for easy comparison. However, this
#'   could mean that some graphs have lines that are hard to see, so you can set
#'   this to TRUE to allow the axes to shrink or expand according to what data
#'   are present for that facet. If this is set to "x", "y", or "xy", then the
#'   scale will only float along that axis. Play around with this to see what we
#'   mean.
#'
#'   Floating axes come with a trade-off for the looks of the graphs, though:
#'   Setting this to TRUE does mean that your x axis won't automatically have
#'   pretty breaks that are sensible for times in hours and that you can't
#'   specify intervals or limits for either the x or the y axis.
#'
#'   If you're a ggplot2 user, here's what's going on under the hood: If you set
#'   \code{floating_facet_scale = FALSE}, the default, then ct_plot_overlay will
#'   use facet_grid to break up your graphs and set \code{facet1_column} to the
#'   rows and \code{facet2_column} to the columns. If you set
#'   \code{floating_facet_scale = TRUE}, then ct_plot_overlay will use
#'   facet_wrap to break up your data.
#' @param facet_spacing Optionally set the spacing between facets. If left as
#'   NA, a best-guess as to a reasonable amount of space will be used. Units are
#'   "lines", so try, e.g. \code{facet_spacing = 2}. (Reminder: Numeric data
#'   should not be in quotes.)
#' @param time_range time range to display. Options: \describe{
#'
#'   \item{NA}{entire time range of data; default}
#'
#'   \item{a start time and end time in hours}{only data in that time range,
#'   e.g. \code{c(24, 48)}. Note that there are no quotes around numeric data.}
#'
#'   \item{"first dose"}{only the time range of the first dose}
#'
#'   \item{"last dose"}{only the time range of the last dose}
#'
#'   \item{"penultimate dose"}{only the time range of the 2nd-to-last dose,
#'   which can be useful for BID data where the end of the simulation extended
#'   past the dosing interval or data when the substrate was dosed BID and the
#'   perpetrator was dosed QD}
#'
#'   \item{a specific dose number with "dose" or "doses" as the prefix}{the time
#'   range encompassing the requested doses, e.g., \code{time_range = "dose 3"}
#'   for the 3rd dose or \code{time_range = "doses 1 to 4"} for doses 1 to 4}
#'
#'   \item{"all obs" or "all observed" if you feel like spelling it out}{Time
#'   range will be limited to only times when observed data are present.}
#'
#'   \item{"last dose to last observed" or "last obs" for short}{Time range will
#'   be limited to the start of the last dose until the last observed data
#'   point.} }
#'
#' @param x_axis_interval set the x-axis major tick-mark interval. Acceptable
#'   input: any number or leave as NA to accept default values, which are
#'   generally reasonable guesses as to aesthetically pleasing and PK-relevant
#'   intervals.
#' @param x_axis_label optionally supply a character vector or an expression to
#'   use for the x axis label
#' @param pad_x_axis optionally add a smidge of padding to the x axis (default
#'   is TRUE, which includes some generally reasonable padding). If changed to
#'   FALSE, the y axis will be placed right at the beginning of your time range
#'   and all data will end \emph{exactly} at the end of the time range
#'   specified. If you want a \emph{specific} amount of x-axis padding, set this
#'   to a number; the default is \code{c(0.02, 0.04)}, which adds 2\% more space
#'   to the left side and 4\% more space to the right side of the x axis. If you
#'   only specify one number, padding is added to the left side.
#' @param pad_y_axis optionally add a smidge of padding to the y axis (default
#'   is TRUE, which includes some generally reasonable padding). As with
#'   \code{pad_x_axis}, if changed to FALSE, the x axis will be placed right at
#'   the bottom of your data, possibly cutting a point in half. If you want a
#'   \emph{specific} amount of y-axis padding, set this to a number; the default
#'   is \code{c(0.02, 0)}, which adds 2\% more space to the bottom and nothing
#'   to the top of the y axis. If you only specify one number, padding is added
#'   to the bottom.
#' @param y_axis_limits_lin Optionally set the Y axis limits for the linear
#'   plot, e.g., \code{c(10, 1000)}. If left as NA, the Y axis limits for the
#'   linear plot will be automatically selected. This only applies when you have
#'   requested a linear plot with \code{linear_or_log}.
#' @param y_axis_limits_log Optionally set the Y axis limits for the semi-log
#'   plot, e.g., \code{c(10, 1000)}. Values will be rounded down and up,
#'   respectively, to the nearest order of magnitude. If left as NA, the Y axis
#'   limits for the semi-log plot will be automatically selected. This only
#'   applies when you have requested a semi-log plot with \code{linear_or_log}.
#' @param y_axis_interval set the y-axis major tick-mark interval. Acceptable
#'   input: any number or leave as NA to accept default values, which are
#'   generally reasonable guesses as to aesthetically pleasing intervals.
#' @param y_axis_label optionally supply a character vector or an expression to
#'   use for the y axis label
#' @param hline_position numerical position(s) of any horizontal lines to add to
#'   the graph. The default is NA to have no lines, and good syntax if you
#'   \emph{do} want lines would be, for example, \code{hline_position = 100} to
#'   have a horizontal line at 100 percent of the baseline enzyme abundance or
#'   \code{hline_position = c(50, 100, 200)} to have horizontal lines at each of
#'   those y values.
#' @param hline_style the line color and type to use for any horizontal lines
#'   that you add to the graph with \code{hline_position}. Default is "red
#'   dotted", but any combination of 1) a color in R and 2) a named linetype is
#'   acceptable. Examples: "red dotted", "blue dashed", or "#FFBE33 longdash".
#'   To see all the possible linetypes, type \code{ggpubr::show_line_types()}
#'   into the console.
#' @param vline_position numerical position(s) of any vertical lines to add to
#'   the graph. The default is NA to have no lines, and good syntax if you
#'   \emph{do} want lines would be, for example, \code{vline_position = 12} to
#'   have a vertical line at 12 h or \code{vline_position = seq(from = 0, to =
#'   168, by = 24)} to have vertical lines every 24 hours for one week.
#'   Examples of where this might be useful would be indicating dosing times or
#'   the time at which some other drug was started or stopped.
#' @param vline_style the line color and type to use for any vertical lines that
#'   you add to the graph with \code{vline_position}. Default is "red dotted",
#'   but any combination of 1) a color in R and 2) a named linetype is
#'   acceptable. Examples: "red dotted", "blue dashed", or "#FFBE33 longdash".
#'   To see all the possible linetypes, type \code{ggpubr::show_line_types()}
#'   into the console.
#' @param graph_labels TRUE or FALSE for whether to include labels (A, B, C,
#'   etc.) for each of the small graphs. (Not applicable if only outputting
#'   linear or only semi-log graphs.)
#' @param graph_title optionally specify a title that will be centered across
#'   your graph or set of graphs
#' @param graph_title_size the font size for the graph title if it's included;
#'   default is 14
#' @param legend_position Specify where you want the legend to be. Options are
#'   "left", "right" (default in most scenarios), "bottom", "top", or "none" if
#'   you don't want one at all.
#' @param prettify_compound_names TRUE (default), FALSE or a character vector:
#'   This is asking whether to make compound names prettier in legend entries
#'   and in any Word output files. This was designed for simulations where the
#'   substrate and any metabolites, perpetrators, or perpetrator metabolites are
#'   among the standard options for the simulator, and leaving
#'   \code{prettify_compound_names = TRUE} will make the name of those compounds
#'   something more human readable. For example, "SV-Rifampicin-MD" will become
#'   "rifampicin", and "Sim-Midazolam" will become "midazolam". Setting this to
#'   FALSE will leave the compound names as is. For an approach with more
#'   control over what the compound names will look like in legends and Word
#'   output, set each compound to the exact name you want with a named
#'   character vector. For example, \code{prettify_compound_names =
#'   c("Sim-Ketoconazole-400 mg QD" = "ketoconazole", "Wks-Drug ABC-low_ka" =
#'   "Drug ABC")} will make those compounds "ketoconazole" and "Drug ABC"
#'   in a legend or in a figure caption.
#' @param assume_unique TRUE (default) or FALSE for whether to assume that the
#'   data contain no replicates and thus skip checking for them. Replicates
#'   mess things up and will likely cause this function to crash. Why would you
#'   want to skip this check? Because it can take a LOOOOOOONG time if you have
#'   a lot of points. If you're sure your data are unique, leave this as TRUE
#'   and save a fair amount of processing time to make your graph. If you're
#'   not sure what we're talking about here or if you get error messages that
#'   aren't terribly clear (which generally means that R wrote them and not
#'   your friendly SimcypConsultancy package authors), try setting this to
#'   FALSE.
#' @param existing_exp_details optionally supply previously extracted
#'   experimental details for the simulations. This is passed along to
#'   \code{ct_plot_overlay} and is used when generating caption text (see
#'   \code{return_caption}). Default is NA.
#' @param return_caption TRUE or FALSE (default) for whether to return any
#'   caption text to use with the graph. This works best if you supply something
#'   for the argument \code{existing_exp_details}. If set to TRUE, you'll get as
#'   output a list of the graph, the figure heading, and the figure caption.
#' @param save_graph optionally save the output graph by supplying a file name
#'   in quotes here, e.g., "My conc time graph.png" or "My conc time graph.docx".
#'   The nice thing about saving to Word is that the figure title and caption
#'   text will be partly filled in automatically, although you should check that
#'   the text makes sense in light of your exact graph. If you leave off ".png"
#'   or ".docx", it will be saved as a png file, but if you specify a different
#'   graphical file extension, it will be saved as that file format. Acceptable
#'   graphical file extensions are "eps", "ps", "jpeg", "jpg", "tiff", "png",
#'   "bmp", or "svg". Do not include any slashes, dollar signs, or periods in
#'   the file name. Leaving this as NA means the file will not be automatically
#'   saved to disk.
#' @param fig_height figure height in inches; default is 6
#' @param fig_width figure width in inches; default is 5
#'
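#' @return a ggplot2 graph or, if \code{return_caption = TRUE}, a list of the
#'   graph, the figure heading, and the figure caption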
#' @export
#'
#' @examples
#' enz_plot_overlay(sim_enz_dataframe = bind_rows(CYP3A4_gut, CYP3A4_liver),
#'                  colorBy_column = Tissue, linetype_column = Inhibitor)
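#'
#' # The calls below are a sketch of how some of the options documented above
#' # could be combined. They assume the same example data as the first call
#' # and that those data include the columns and percentile data needed.
#' enz_plot_overlay(sim_enz_dataframe = bind_rows(CYP3A4_gut, CYP3A4_liver),
#'                  colorBy_column = Tissue, linetype_column = Inhibitor,
#'                  figure_type = "percentile ribbon",
#'                  linear_or_log = "both vertical")
#'
#' # One facet per tissue, with a reference line at 100 percent of baseline
#' enz_plot_overlay(sim_enz_dataframe = bind_rows(CYP3A4_gut, CYP3A4_liver),
#'                  colorBy_column = Inhibitor, facet1_column = Tissue,
#'                  hline_position = 100, hline_style = "blue dashed")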
#'
#'
#'
#' 
enz_plot_overlay <- function(sim_enz_dataframe,
                             mean_type = "arithmetic",
                             figure_type = "means only", 
                             linear_or_log = "linear",
                             colorBy_column,
                             color_labels = NA, 
                             legend_label_color = NA,
                             color_set = "default",
                             linetype_column,
                             linetypes = c("solid", "dashed"),
                             line_width = NA,
                             legend_label_linetype = NA,
                             facet1_column,
                             facet1_title = NA,
                             facet2_column, 
                             facet2_title = NA,
                             facet_ncol = NA, 
                             facet_nrow = NA,
                             floating_facet_scale = FALSE,
                             facet_spacing = NA,
                             time_range = NA, 
                             x_axis_interval = NA,
                             x_axis_label = NA,
                             pad_x_axis = TRUE,
                             pad_y_axis = TRUE,
                             y_axis_limits_lin = NA,
                             y_axis_limits_log = NA, 
                             y_axis_interval = NA,
                             y_axis_label = NA,
                             hline_position = NA, 
                             hline_style = "red dotted", 
                             vline_position = NA, 
                             vline_style = "red dotted",
                             graph_labels = TRUE,
                             graph_title = NA,
                             graph_title_size = 14, 
                             prettify_compound_names = TRUE,
                             legend_position = NA,
                             existing_exp_details = NA,
                             return_caption = FALSE, 
                             save_graph = NA,
                             fig_height = 6,
                             fig_width = 5, 
                             assume_unique = TRUE){
   
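   # NB: This is a pass-through function for ct_plot_overlay. Note that
   # assume_unique is accepted as an argument here but is not included in the
   # call to ct_plot_overlay below.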
   
   # Error catching ---------------------------------------------------------
   # Check whether tidyverse is loaded
   if(!("package:tidyverse" %in% search())){
      stop("The SimcypConsultancy R package requires the package tidyverse to be loaded, and it doesn't appear to be loaded yet. Please run\nlibrary(tidyverse)\n    ...and then try again.", 
           call. = FALSE)
   }
   
   # Main function ----------------------------------------------------------
   
   facet1_column <- rlang::enquo(facet1_column)
   facet2_column <- rlang::enquo(facet2_column)
   colorBy_column <- rlang::enquo(colorBy_column)
   linetype_column <- rlang::enquo(linetype_column)
   
   Out <- ct_plot_overlay(ct_dataframe = sim_enz_dataframe,
                          # NSE trouble: not enquo alone, not quo, not
                          # substitute, but enquo plus !! here
                          colorBy_column = !!colorBy_column,
                          linetype_column = !!linetype_column,
                          facet1_column = !!facet1_column,
                          facet1_title = facet1_title, 
                          facet2_column = !!facet2_column,
                          facet2_title = facet2_title, 
                          obs_to_sim_assignment = NA,
                          mean_type = mean_type,
                          figure_type = figure_type, 
                          linear_or_log = linear_or_log,
                          color_labels = color_labels, 
                          legend_label_color = legend_label_color,
                          color_set = color_set,
                          obs_shape = NA,
                          obs_color = NA,
                          obs_size = NA,
                          obs_fill_trans = NA, 
                          obs_line_trans = NA, 
                          linetypes = linetypes,
                          line_width = line_width,
                          legend_label_linetype = legend_label_linetype,
                          facet_ncol = facet_ncol, 
                          facet_nrow = facet_nrow,
                          floating_facet_scale = floating_facet_scale,
                          facet_spacing = facet_spacing,
                          time_range = time_range, 
                          x_axis_interval = x_axis_interval,
                          x_axis_label = x_axis_label,
                          pad_x_axis = pad_x_axis,
                          pad_y_axis = pad_y_axis,
                          y_axis_limits_lin = y_axis_limits_lin,
                          y_axis_limits_log = y_axis_limits_log, 
                          y_axis_interval = y_axis_interval,
                          y_axis_label = y_axis_label,
                          hline_position = hline_position, 
                          hline_style = hline_style, 
                          vline_position = vline_position, 
                          vline_style = vline_style,
                          graph_labels = graph_labels,
                          graph_title = graph_title,
                          graph_title_size = graph_title_size, 
                          legend_position = legend_position,
                          prettify_compound_names = prettify_compound_names,
                          existing_exp_details = existing_exp_details, 
                          return_caption = return_caption, 
                          save_graph = save_graph,
                          fig_height = fig_height,
                          fig_width = fig_width)
   
   return(Out)
   
}
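
# Illustrative sketch only; not part of the package API. The comment inside
# enz_plot_overlay notes that forwarding an unquoted column name to
# ct_plot_overlay needs rlang::enquo plus !!. The two functions below are
# hypothetical stand-ins (sketch_inner for the function being called,
# sketch_outer for the pass-through wrapper) that show that pattern on its
# own, assuming the rlang package is installed. What sketch_inner does with
# the column is made up for the demonstration.
sketch_inner <- function(df, group_column){
   # Capture the column name the caller supplied as a quosure
   group_column <- rlang::enquo(group_column)
   
   # Convert the captured expression to a string and return the unique values
   # in that column
   ColName <- rlang::as_label(group_column)
   unique(as.character(df[[ColName]]))
}

sketch_outer <- function(df, group_column){
   # Capture here as well, then unquote with !! so that sketch_inner sees the
   # caller's original column name rather than the symbol "group_column"
   group_column <- rlang::enquo(group_column)
   sketch_inner(df = df, group_column = !!group_column)
}

# Possible usage, with a made-up data.frame:
# sketch_outer(data.frame(Tissue = c("gut", "liver", "gut")), Tissue)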
shirewoman2/Consultancy documentation built on Feb. 18, 2025, 10 p.m.