#| label = "setup", #| include = FALSE knitr::opts_chunk$set( collapse = TRUE, comment = "#>", message = FALSE, warning = FALSE )
You can cite this package/vignette as:
#| label = "citation", #| echo = FALSE, #| comment = "" citation("ggstatsplot")
combine_plots
The full power of {ggstatsplot}
can be leveraged with a functional programming
package like {purrr}
that replaces for
loops
with code that is both more succinct and easier to read and, therefore, {purrr}
should be preferrred 😻.
In such cases, {ggstatsplot}
contains a helper function combine_plots
to
combine multiple plots, which can be useful for combining a list of plots
produced with {purrr}
. This is a wrapper around patchwork::wrap_plots
and lets
you combine multiple plots and add a combination of title, caption, and
annotation texts with suitable defaults.
Note Before: If you have just one grouping variable and you'd like a plot
for each factor
of this variable the grouped_
variants
(https://indrajeetpatil.github.io/ggstatsplot/reference/index.html) of all
{ggstatsplot}
functions will allow you do to this. They specifically use the
combine_plots
function under the covers.
dplyr::group_map
The easiest way to run the same {ggstatsplot}
operation across multiple grouping
variables is by using dplyr::group_map
functions and then - of course - one
would like to combine these plots in a single plot.
#| label = "ggscatterstats_groupmap", #| fig.height = 6, #| fig.width = 10 # libraries library(tidyverse) # creating a list of plots p_list <- mtcars %>% dplyr::filter(cyl != 4) %>% dplyr::group_by(cyl) %>% dplyr::group_map(.f = ~ ggbetweenstats(data = ., x = am, y = wt)) # combining plots combine_plots( plotlist = p_list, annotation.args = list(tag_levels = list(as.vector(rlang::set_names(levels(as.factor(mtcars$cyl)))))) )
{purrr}
The full power of {ggstatsplot}
can be leveraged with a functional programming
package like {purrr}
which can replace many for
loops, is more succinct, and easier to read. Consider {purrr}
as your first
choice for combining multiple plots.
An example using the iris
dataset is provided below. Imagine that we want to
separately plot the linear relationship between sepal length and sepal width for
each of the three species but combine them into one consistent plot with common
labeling and as one plot. Rather than call ggscatterstats
three times and
gluing the results or using patchwork
directly, we'll create a tibble
called
plots
using purrr::map
then feed that to combine_plots
to get our combined
plot.
#| label = "ggscatterstats_purrr", #| fig.height = 12, #| fig.width = 8 # creating a list column with `{ggstatsplot}` plots plots <- iris %>% dplyr::mutate(Species2 = Species) %>% # just creates a copy of this variable dplyr::group_by(Species) %>% tidyr::nest() %>% # a nested data frame with list column called `data` dplyr::mutate( plot = data %>% purrr::map( .x = ., .f = ~ ggscatterstats( data = ., x = Sepal.Length, y = Sepal.Width, title = glue::glue("Species: {.$Species2} (n = {length(.$Sepal.Length)})") ) ) ) # display the new object plots # creating a grid with patchwork combine_plots( plotlist = plots$plot, plotgrid.args = list(nrow = 3, ncol = 1), annotation.args = list( title = "Relationship between sepal length and width for each Iris species", caption = expression( paste( italic("Note"), ": Iris flower dataset was collected by Edgar Anderson.", sep = "" ) ) ) )
plyr
Another popular package for handling big datasets is plyr
, which allows us to
repeatedly apply a common function on smaller pieces and then combine the
results into a larger whole.
In this example we'll start with the gapminder
dataset. We're interested in
the linear relationship between Gross Domestic Product (per capita) and life
expectancy in the year 2007, for all the continents except Oceania. We'll use
dplyr
to filter to the right rows then use plyr
to repeat the
ggscatterstats
function across each of the 4 continents remaining. The result
is of that is a list of plots called plots
. We then feed plots
to the
combine_plots
function to merge them into one plot. We will call attention to
the countries which have very low life expectancy (< 45 years) by labeling those
countries when they occur.
#| label = "ggscatterstats_plyr", #| fig.height = 12, #| fig.width = 12 library(plyr) library(gapminder) # let's have a look at the structure of the data dplyr::glimpse(gapminder::gapminder) # creating a list of plots plots <- plyr::dlply( dplyr::filter(gapminder::gapminder, year == 2007, continent != "Oceania"), .variables = .(continent), .fun = function(data) { ggscatterstats( data = data, x = gdpPercap, y = lifeExp, xfill = "#0072B2", yfill = "#009E73", label.var = "country", label.expression = "lifeExp < 45", title = glue::glue("Continent: {data$continent}") ) + ggplot2::scale_x_continuous(labels = scales::comma) } ) # combining individual plots combine_plots( plotlist = plots, annotation.args = list(title = "Relationship between GDP (per capita) and life expectancy"), plotgrid.args = list(nrow = 2) )
The combine_plots
function can also be useful for adding additional textual
information that can not be added by making a single call to a {ggstatsplot}
function via the title, subtitle, or caption options. For this example let's
assume we want to assess the relationship between a movie's rating and its
budget from the Internet Movie Database using polynomial regression.
ggcoefstats
will do most of the work, including the title, subtitle, and
caption. But we want to add at the bottom an annotation to show the formula we
are using for our regression. combine_plots
allows us to add sub.text =
to
accomplish that task as shown in the resulting plot.
#| label = "ggbetweenstats_subtext", #| fig.height = 8, #| fig.width = 8 combine_plots( plotlist = list(ggcoefstats( x = stats::lm( formula = rating ~ stats::poly(budget, degree = 3), data = dplyr::sample_frac(movies_long, size = 0.2), na.action = na.omit ), exclude.intercept = FALSE, title = "Relationship between movie budget and IMDB rating", subtitle = "Source: Internet Movie Database", ggtheme = ggplot2::theme_gray(), stats.label.color = c("#CC79A7", "darkgreen", "#0072B2", "red") ) + # modifying the plot outside of ggstatsplot using ggplot2 functions ggplot2::scale_y_discrete( labels = c( "Intercept (c)", "1st degree (b1)", "2nd degree (b2)", "3rd degree (b3)" ) ) + ggplot2::labs(y = "term (polynomial regression)")), # adding additional text element to the plot since title, subtitle, caption are all already occupied annotation.args = list( caption = expression( paste( "linear model: ", bolditalic(y), " ~ ", italic(c) + italic(b)[1] * bolditalic(x) + italic(b)[2] * bolditalic(x)^ 2 + italic(b)[3] * bolditalic(x)^3, sep = "" ) ) ) )
If you find any bugs or have any suggestions/remarks, please file an issue on GitHub
:
https://github.com/IndrajeetPatil/ggstatsplot/issues
For details, see- https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/session_info.html
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.