In hadley/ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics

```{=html}

```r
library(ggplot2)
knitr::opts_chunk$set(
  fig.dpi = 300, 
  collapse = TRUE, 
  comment = "#>",
  fig.asp = 0.618,
  fig.width = 6,
  out.width = "80%")

Panes

What is the difference between `facet_wrap()` and `facet_grid()`?

The simplest answer is that you should use facet_wrap() when faceting by a single variable and facet_grid() when faceting by two variables and want to create a grid of panes.

See example

facet_wrap() is most commonly used to facet by a plot by a single categorical variable.

#| fig.alt = "A histogram showing the city miles per gallon distribution for 
#|  three types of drive train, each in their own panel in a 1-row, 3-column 
#|  layout."
ggplot(mpg, aes(x = cty)) +
  geom_histogram() +
  facet_wrap(~ drv)

And facet_grid() is commonly used to facet by a plot by two categorical variables.

#| fig.alt = "A histogram showing the city miles per gallon distribution. The 
#|  plot has twelve panels in a 4-row, 3-column layout, showing three types of 
#|  drive train in the horizontal direction, and four numbers of cylinders 
#|  in the vertical direction. Several panels have no data."
ggplot(mpg, aes(x = cty)) +
  geom_histogram() +
  facet_grid(cyl ~ drv)

Notice that this results in some empty panes (e.g. 4-wheel drive and 5 cylinders) as there are no cars in the mpg dataset that fall into such categories.

You can also use facet_wrap() with to facet by two categorical variables. This will only create facets for combinations of the levels of variables for which data exists.

#| fig.alt = "A histogram showing the city miles per gallon distribution. The 
#|  plot has nine panels in a 3-row, 3-column layout, showing all existing 
#|  combinations of three types of drive train, and four numbers of cylinders."
ggplot(mpg, aes(x = cty)) +
  geom_histogram() +
  facet_wrap(cyl ~ drv)

In facet_wrap() you can control the number of rows and/or columns of the resulting plot layout using the nrow and ncol arguments, respectively. In facet_grid() these values are determined by the number of levels of the variables you're faceting by.

Similarly, you can also use facet_grid() to facet by a single categorical variable as well. In the formula notation, you use a . to indicate that no faceting should be done along that axis, i.e. cyl ~ . facets across the y-axis (within a column) while . ~ cyl facets across the x-axis (within a row).

#| fig.alt = c(
#| "A histogram showing the city miles per gallon distribution. The plot has 
#|  four panels in a 4-row, 1-column layout, showing four numbers of cylinders.",
#| "A histogram showing the city miles per gallon distribution. The plot has 
#|  four panels in a 1-row, 4-column layout, showing four numbers of cylinders."
#| )
ggplot(mpg, aes(x = cty)) +
  geom_histogram() +
  facet_grid(cyl ~ .)

ggplot(mpg, aes(x = cty)) +
  geom_histogram() +
  facet_grid(. ~ cyl)

How can I place a vertical lines (`geom_vline()`) in each pane of a faceted plot?

First, calculate where the lines should be placed and save this information in a separate data frame. Then, add a geom_vline() layer to your plot that uses the summarized data.

See example

Suppose you have the following plot, and you want to add a vertical line at the mean value of hwy (highway mileage) for each pane.

#| fig.alt = "A histogram showing the highway miles per gallon distribution for 
#|  three types of drive train, each in their own panel in a 1-row, 3-column 
#|  layout."
ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 5) +
  facet_wrap(~ drv)

First, calculate these means and save them in a new data frame.

library(dplyr)

mpg_summary <- mpg %>%
  group_by(drv) %>%
  summarise(hwy_mean = mean(hwy))

mpg_summary

Then, add a geom_vline() layer to your plot that uses the summary data.

#| fig.alt = "A histogram showing the highway miles per gallon distribution for 
#|  three types of drive train, each in their own panel in a 1-row, 3-column 
#|  layout. Each panel has a vertical black line indicating the mean of the 
#|  distribution."
ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 5) +
  facet_wrap(~ drv) +
  geom_vline(data = mpg_summary, aes(xintercept = hwy_mean))

Axes

How can I set individual axis limits for facets?

Either let ggplot2 determine custom axis limits for the facets based on the range of the data you're plotting using the scales argument in facet_wrap() or facet_grid() or, if that is not sufficient, use expand_limits() to ensure limits include a single value or a range of values.

See example

Suppose you have the following faceted plot. By default, both x and y scales are shared across the facets.

#| fig.alt = "A scatter plot showing city miles per gallon on the x-axis and 
#|  highway miles per gallon on the y-axis. The plot has twelve panels in a 
#|  4-row, 3-column layout, showing three types of drive train in the 
#|  horizontal direction and four numbers of cylinders in the vertical 
#|  direction. Several panels are empty. Every row has the same y-axis range, 
#|  and every column has the same x-axis range."
ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  facet_grid(cyl ~ drv)

You can control this behaviour with the scales argument of faceting functions: varying scales across rows ("free_x"), columns ("free_y"), or both rows and columns ("free"), e.g.

#| fig.alt = "A scatter plot showing city miles per gallon on the x-axis and 
#|  highway miles per gallon on the y-axis. The plot has twelve panels in a 
#|  4-row, 3-column layout, showing three types of drive train in the 
#|  horizontal direction and four numbers of cylinders in the vertical 
#|  direction. Several panels are empty. Every row in the layout has an 
#|  independent y-axis range. Every column in the layout has an independent 
#|  x-axis range."
ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  facet_grid(cyl ~ drv, scales = "free")

If you also want to make sure that a particular value or range is included in each of the facets, you can set this with expand_limits(), e.g. ensure that 10 is included in the x-axis and values between 20 to 25 are included in the y-axis:

#| fig.alt = "A scatter plot showing city miles per gallon on the x-axis and 
#|  highway miles per gallon on the y-axis. The plot has twelve panels in a 
#|  4-row, 3-column layout, showing three types of drive train in the 
#|  horizontal direction and four numbers of cylinders in the vertical 
#|  direction. Several panels are empty. Every row in the layout has an 
#|  independent y-axis range, but all include the 20-25 interval. Every column 
#|  in the layout has an independent x-axis range, but all include 10."
ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  facet_grid(cyl ~ drv, scales = "free") +
  expand_limits(x = 10, y = c(20, 25))

Facet labels

How can I remove the facet labels entirely?

Set the strip.text element in theme() to element_blank().

See example

Setting strip.text to element_blank() will remove all facet labels.

#| fig.alt = "A scatter plot showing city miles per gallon on the x-axis and 
#|  highway miles per gallon on the y-axis. The plot has twelve panels in a 
#|  4-row, 3-column layout. The strips, or panel layout titles and 
#|  their backgrounds, are missing."
ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  facet_grid(cyl ~ drv) +
  theme(strip.text = element_blank())

You can also remove the labels across rows only with strip.x.text or across columns only with strip.y.text.

#| fig.alt = "A scatter plot showing city miles per gallon on the x-axis and 
#|  highway miles per gallon on the y-axis. The plot has twelve panels in a 
#|  4-row, 3-column layout. In the vertical direction, the panels indicate four 
#|  numbers of cylinders. The strips of the horizontal direction are missing."
ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  facet_grid(cyl ~ drv) +
  theme(strip.text.x = element_blank())

The facet labels in my plot are too long so they get cut off. How can I wrap facet label text so that long labels are spread across two rows?

Use label_wrap_gen() in the labeller argument of your faceting function and set a width (number of characters) for the maximum number of characters before wrapping the strip.

See example

In the data frame below we have 100 observations, 50 of them come from one group and 50 from another. These groups have very long names, and so when you facet the ploy by group, the facet labels (strips) get cut off.

#| fig.alt = "A histogram with two panels in a 1-row, 2-column layout of random 
#|  data. The first panel has as title 'A long group name for the first group'.
#|  The second panel has a title 'A muuuuuuuuuuuuuch longer group name for the 
#|  second group'. However, the second title is clipped to the panel width and
#|  doesn't show all the text."
df <- data.frame(
  x = rnorm(100),
  group = c(rep("A long group name for the first group", 50),
            rep("A muuuuuuuuuuuuuch longer group name for the second group", 50))
)

ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = 0.5) +
  facet_wrap(~ group)

You can control the maximum width of the facet label by setting the width in the label_wrap_gen() function, which is then passed to the labeller argument of your faceting function.

#| fig.alt = "A histogram with two panels in a 1-row, 2-column layout of random 
#|  data. The first panel has as title 'A long group name for the first group'
#|  in two lines of text. The second panel has a title 'A muuuuuuuuuuuuuch 
#|  longer group name for the second group' in three lines of text. The width
#|  of the second title now fits within the panel width."
ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = 0.5) +
  facet_wrap(~ group, labeller = labeller(group = label_wrap_gen(width = 25)))

How can I set different axis labels for facets?

Use as_labeller() in the labeller argument of your faceting function and then set strip.background and strip.placement elements in the theme() to place the facet labels where axis labels would go. This is a particularly useful solution for plotting data on different scales without the use of double y-axes.

See example

Suppose you have data price data on a given item over a few years from two countries with very different currency scales.

df <- data.frame(
  year = rep(2016:2021, 2),
  price = c(10, 10, 13, 12, 14, 15, 1000, 1010, 1200, 1050, 1105, 1300),
  country = c(rep("US", 6), rep("Japan", 6))
)

df

You can plot price versus time and facet by country, but the resulting plot can be a bit difficult to read due to the shared y-axis label.

#| fig.alt = "A timeseries plot showing price over time for two countries, Japan
#|  and the US, in two panels in a 2-row, 1-column layout. The countries are
#|  indicated at the top of each panel. The two y-axes have different ranges."
ggplot(df, aes(x = year, y = price)) +
  geom_smooth() +
  facet_wrap(~ country, ncol = 1, scales = "free_y") +
  scale_x_continuous(breaks = 2011:2020)

With the following you can customize the facet labels first with as_labeller(), turn off the default y-axis label, and then place the facet labels where the y-axis label goes ("outside" and on the "left").

#| fig.alt = "A timeseries plot showing price over time for two countries and 
#|  their currencies, the Japanese Yen and the US Dollar, in two panels in a 
#|  2-row, 1-column layout. The countries and currency units are indicated at 
#|  the left of each panel. The two y-axes have different ranges."
ggplot(df, aes(x = year, y = price)) +
  geom_smooth() +
  facet_wrap(~ country, ncol = 1, scales = "free_y", 
             labeller = as_labeller(
               c(US = "US Dollars (USD)", Japan = "Japanese Yens (JPY)")), 
             strip.position = "left"
             ) +
  scale_x_continuous(breaks = 2011:2020) +
  labs(y = NULL) +
  theme(strip.background = element_blank(), strip.placement = "outside")

hadley/ggplot2 documentation built on May 4, 2024, 2:17 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

hadley/ggplot2
Create Elegant Data Visualisations Using the Grammar of Graphics

In hadley/ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics

Panes

What is the difference between `facet_wrap()` and `facet_grid()`?

How can I place a vertical lines (`geom_vline()`) in each pane of a faceted plot?

Axes

How can I set individual axis limits for facets?

Facet labels

How can I remove the facet labels entirely?

The facet labels in my plot are too long so they get cut off. How can I wrap facet label text so that long labels are spread across two rows?

How can I set different axis labels for facets?

R Package Documentation

Browse R Packages

We want your feedback!

hadley/ggplot2 Create Elegant Data Visualisations Using the Grammar of Graphics

In hadley/ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics

Panes

What is the difference between facet_wrap() and facet_grid()?

How can I place a vertical lines (geom_vline()) in each pane of a faceted plot?

Axes

How can I set individual axis limits for facets?

Facet labels

How can I remove the facet labels entirely?

The facet labels in my plot are too long so they get cut off. How can I wrap facet label text so that long labels are spread across two rows?

How can I set different axis labels for facets?

R Package Documentation

Browse R Packages

We want your feedback!

hadley/ggplot2
Create Elegant Data Visualisations Using the Grammar of Graphics

What is the difference between `facet_wrap()` and `facet_grid()`?

How can I place a vertical lines (`geom_vline()`) in each pane of a faceted plot?