knitr::opts_chunk$set( collapse = TRUE, message = FALSE, warning = FALSE, comment = "#>", fig.path = "README_supp/README-" )
Capture the spirit of your
ggplot2::ggplot() stores the information needed to build the graph as a
grob, but that's what the computer needs to know about in order to build the graph. As humans, we're more interested in what commands were issued in order to build the graph. For good reproducibility, the calls need to be applied to the relevant data. While this is somewhat available by deconstructing the
grob, it's not the simplest approach.
Here is one option that solves that problem.
ggghost stores the data used in a
ggplot() call, and collects
ggplot2 commands (usually separated by
+) as they are applied, in effect lazily collecting the calls. Once the object is requested, the
You can install
ggghost from CRAN with:
or the development version from github with:
# install.packages("devtools") devtools::install_github("jonocarroll/ggghost")
%g<% to initiate storage of the
ggplot2 calls then add to the call with each logical call on a new line (@hrbrmstr style)
tmpdata <- data.frame(x = 1:100, y = rnorm(100)) head(tmpdata)
library(ggplot2) library(ggghost) z %g<% ggplot(tmpdata, aes(x, y)) z <- z + geom_point(col = "steelblue") z <- z + theme_bw() z <- z + labs(title = "My cool ggplot") z <- z + labs(x = "x axis", y = "y axis") z <- z + geom_smooth()
This invisibly stores the
ggplot2 calls in a list which can be reviewed either with the list of calls
or the concatenated call
summary(z, combine = TRUE)
The plot can be generated using a
which re-evaluates the list of calls and applies them to the saved data, meaning that the plot remains reproducible even if the data source is changed/destroyed.
The call list can be subset, removing parts of the call
Plot features can be removed by name, a task that would otherwise have involved re-generating the entire plot
z2 <- z + geom_line(col = "coral") z2 - geom_point()
Calls are removed based on matching to the regex
\\(.*$ (from the first bracket to the end of the call), so arguments are irrelevant.
The object still generates all the
grob info, it's just stored as calls rather than a completed image.
str(print(z)) #> [... truncated ...]
grob info is still produced, normal
ggplot2 operators can be applied after the
xvals <- seq(0,2*pi,0.1) tmpdata_new <- data.frame(x = xvals, y = sin(xvals)) print(z - geom_smooth()) %+% tmpdata_new
ggplot2 calls still work as normal if you want to avoid storing the calls.
ggplot(tmpdata) + geom_point(aes(x,y), col = "red")
Since the object is a list, we can stepwise show the process of building up the plot as a (re-)animation
A supplementary data object (e.g. for use in a
scale_* call) can be added to the
myColors <- c("alpha" = "red", "beta" = "blue", "gamma" = "green") supp_data(z) <- myColors
These will be recovered along with the primary data.
For full reproducibility, the entire structure can be saved to an object for re-loading at a later point. This may not have made much sense for a
ggplot2 object, but now both the original data and the calls to generate the plot are saved. Should the environment that generated the plot be destroyed, all is not lost.
saveRDS(z, file = "README_supp/mycoolplot.rds") rm(z) rm(tmpdata) rm(myColors) exists("z") exists("tmpdata") exists("myColors")
ggghost object back to the session, both the relevant data and plot-generating calls can be re-executed.
z <- readRDS("README_supp/mycoolplot.rds") str(z) recover_data(z, supp = TRUE) head(tmpdata) myColors z
We now have a proper reproducible graphic.
ggplot2call, not piped in to it. Pipelines such as
z %g<% tmpdata %>% ggplot()won't work... yet.
ggplot(data = x)call. If you require supplementary data for some
geomthen you need manage storage/consistency of that.~~ (fixed)
labscalls, an argument must be present. It doesn't need to be the actual one (all will be removed) but it must evaluate in scope.
TRUEwill do fine.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.