knitr::opts_chunk$set( collapse = TRUE, comment = "#>")
To get started, install plotscaper
with:
#| eval: false devtools::install_github("bartonicek/plotscaper")
Next, open up RStudio and run the following code:
#| fig-width: 7 #| fig-height: 5 library(plotscaper) names(airquality) <- c("ozone", "solar radiation", "wind", "temperature", "month", "day") create_schema(airquality) |> add_scatterplot(c("solar radiation", "ozone")) |> add_barplot(c("day", "ozone"), list(reducer = "max")) |> add_histogram(c("wind")) |> add_pcoords(names(airquality)[1:4]) |> render()
Try clicking and dragging to select some points in the scatterplot. You should see the corresponding cases get highlighted within the barplot!
There are many other ways interacting with plotscaper
figures, including:
Importantly, many of these interactions can either be done manually, by interacting with the figure (client-side), or programmatically, by calling functions from inside a running R session ("server-side").
To take full advantage of plotscaper
, you need to understand two core functions: create_schema
and render
. We'll explore this on the example of the penguins
data set [@horst2020].
The create_schema
function initializes a schema - a sort of a recipe which we can use to define the figure. Like in other data visualization packages, we build up the schema step-by-step, by calling functions that append additional information to it:
library(palmerpenguins) penguins <- na.omit(penguins) # missing data is not supported yet, unfortunately names(penguins) <- names(penguins) |> gsub("(_mm|_g)", "", x = _) schema <- create_schema(penguins) |> add_scatterplot(c("body_mass", "flipper_length")) |> add_barplot(c("species")) |> add_fluctplot(c("species", "sex")) |> add_histogram(c("bill_length"))
The schema then is really just a list of messages:
schema
Typically, you'll use the schema to add more plots. However, you can also do other things such as select cases or set axis limits:
schema <- schema |> assign_cases(which(penguins$species == "Adelie")) |> set_scale("plot1", "x", min = 0) |> set_scale("plot1", "size", max = 5) schema
In fact, most of the things you can do with the schema you can also do interactively, and vice versa. We will see that in the next section.
When it finally comes a time to turn the recipe into an actual interactive figure or scene, we can do so by calling the render
function:
scene <- schema |> render() scene
The render
function takes the schema and turns it into a htmlwidgets
widget that we can interact with in RStudio viewer or embed in RMarkdown documents. However, it can also do a little bit more than that.
If you're following along in RStudio, try this:
#| eval: false scene |> select_cases(1:10)
You should see some cases of the data get highlighted. Importantly, notice that the figure does not get re-rendered!
Whereas the create_schema
merely initializes a recipe which is just a regular R object (a list
), the render
function renders the recipe into an htmlwidgets
widget, and, if inside a running R session, it also launches an httpuv
server for direct communication between the R session and the figure.
We can use (mostly) the same functions to modify schema (created by call to create_schema
) and scene (created by call to render
). The difference is that while calling functions with a schema as the first argument merely appends the corresponding message to the recipe, calling functions with scene immediately sends a message to the figure via the server.
That is, while the following code:
#| eval: false # NOT RUN scene <- schema |> select_cases(20:30) |> render() scene scene
causes two full re-renders, the following code:
#| eval: false # NOT RUN scene <- schema |> render() scene |> select_cases(20:30) scene |> select_cases(20:30)
causes only one render (if inside a running R session).
In other words, plotscaper
functions such as add_*
, set_selected
, and set_scale
behave differently based on whether we call them on scene or schema:
The second method only works if we are in an interactive R session because we need a server to communicate with the figure.
This is why running the following code (while the document you're reading is being knitted) throws an error:
#| error: true interactive() scene |> select_cases(1:10)
When we knit an RMarkdown document, we generate a static HTML file. By default, we cannot communicate with this file since it's just a big blob of HTML, CSS, and JavaScript. To only way to change the file is to rewrite it. Thus, in RMarkdown, we can really only really write the schema and render
- we can't do anything with the figure once it's been rendered.
In contrast, inside an interactive R session, we can launch a server that will listen and respond to messages from the R session and send them to the figure (and also send messages from the figure back to the R session - this is done via WebSockets).
This also means there are some functions that it only make sense to use inside a running R session. For example, the following functions query the selection status of the figure:
#| eval: false scene |> selected_cases() scene |> assigned_cases()
That means that you can, for example, render a figure, select some cases of the data using a mouse, and then call scene |> selected_cases()
to get the row indices of those cases. This doesn't really make sense when writing a schema - the only way to select cases is via an explicit call to select_cases
, so we would be querying cases that we have already specified before.
Likewise, there are functions such as pop_plot
and remove_plot
which can be used to remove plots from scene. These don't really make sense to use while writing a schema - if you know that you don't want a specific plot, you can just delete the line which adds it to the recipe. However, they do make sense when interacting with a scene inside a running R session. Perhaps you found some interesting trend in your data and want to see if it holds in other plots, but you're running out of space in the viewer - you can pop_plot
to remove the last plot and add_*
to add a new plot, all the while keeping the rest of the state of the figure intact!
Sometimes, when rendering multiple figures quickly, the figure may fail to connect to the server or the server may failed to be launched altogether. Often, the cause is that the default port number is already taken. I will try to fix this bug in the future, however, for the time being, you can always fix it by running either of the following functions:
#| eval: false start_server(random_port = TRUE) # Starts a server on a new random port
or:
#| eval: false httpuv::stopAllServers() # Stops all servers, now you should be able to relaunch the server
Since the schema is lazy, we can use it to generate figures programmatically. For example, here's how we could create an interactive scatterplot matrix (SPLOM) of the penguins data set:
schema <- create_schema(penguins) keys <- names(penguins)[4:6] # Loop through combinations of columns for (i in seq_along(keys)) { for (j in seq_along(keys)) { # Add a scatterplot if row & column no.'s are different if (i != j) schema <- schema |> add_scatterplot(c(keys[i], keys[j])) # Add a histogram if row & column no.'s match else schema <- schema |> add_histogram(c(keys[i])) } } # Options to make the plots fit better within the available space opts <- list(size = 5, axis_title_size = 0.75, axis_label_size = 0.5) schema |> render(options = opts)
Just to re-iterate the point from the previous section, we could also do this interactively, by writing out the calls to add the plots ourselves:
#| eval: false scene <- create_schema(penguins) |> render(opts) scene |> add_histogram(c("bill_depth")) scene |> add_scatterplot(c("bill_depth", "flipper_length")) scene |> add_scatterplot(c("bill_depth", "body_mass")) ...
However, having to do this for all nine plots might quickly get tedious.
As such, there are different reasons why you might want to do something using a scene, a schema, or some combination of both. If you're writing an RMarkdown document, you don't really have a choice - you can't do anything to a scene once it's rendered.
Inside an interactive R session, you have more options. You can decide you first want to create a highly customized schema and only then start interacting with the figure live. Or you can just immediately fire off scene <- create_schema(data) |> render()
and do everything interactively.
Each approach has some advantages and some disadvantages. With the schema way, you can always re-create most of the state, so if you mess up and need to go back, you can just print scene
and you're good to go. With the interactive scene way, you're more flexible and you have the immediate feedback in seeing how the figure changes in front of you, however, it may be more difficult to recover some state if there are many intermediate steps.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.