Welcome to this vignette!
require(knitr)
You can load the package with these commands:
require(ggplot2)
require(ggBubbles)
The package introduces position_surround() for ggplot2.
Its parameter offset controls the offsets used for the position corrections (default is 0.1).
position_surround() can be used in many ggplot2 functions like geom_point or geom_text:
suppressPackageStartupMessages({
  require(dplyr)
  require(tibble)
})
data(MusicianInterestsSmall)
ggplot(data = MusicianInterestsSmall,
       aes(x = Instrument, y = Genre, col = Level)) +
  geom_point(position = position_surround(), size = 4) +
  scale_colour_manual(values = c("#00e5ff", "#4694ff", "#465aff", "#2c00c9")) +
  theme_bw(base_size = 18)
Here we demonstrate the advantage of MiniBubble plots over traditional bubble plots in certain use cases with discrete data.
Please note that in this vignette we will use dplyr and tibble from the tidyverse.
require(dplyr)
require(tibble)
First, we load a small example data set:
data(MusicianInterestsSmall)
It contains data from musicians about their experience in different music genres with their instruments.
head(MusicianInterestsSmall)
kable(head(MusicianInterestsSmall), caption = "First rows of MusicianInterestsSmall")
The traditional bubble plot can portray the number of guitarists or pianists able to play jazz or classical music by point size, and display the average experience level by colour coding.
ggplot(data = MusicianInterestsSmall %>%
         group_by(Instrument, Genre) %>%
         summarize(Count = n(), AvgLevel = mean(as.integer(Level))),
       aes(x = Instrument, y = Genre, size = Count, col = AvgLevel)) +
  geom_point() +
  theme_bw(base_size = 18) +
  scale_colour_gradientn(
    colours = rev(topo.colors(2)),
    na.value = "transparent",
    breaks = as.integer(MusicianInterestsSmall$Level) %>% unique %>% sort,
    labels = levels(MusicianInterestsSmall$Level),
    limits = c(as.integer(MusicianInterestsSmall$Level) %>% min,
               as.integer(MusicianInterestsSmall$Level) %>% max)) +
  scale_size_continuous(range = c(3, 11))
From a data visualisation point of view, it is debatable how well point sizes display counts. However, in general we can agree that averages often hide a lot of useful information.
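To make the point about averages concrete, here is a small base R sketch. The experience levels are made up for illustration and are not taken from MusicianInterestsSmall: two groups share the same mean although their members differ considerably.

```r
# Made-up experience levels on a 1-4 scale (illustrative only).
group_a <- c(1, 1, 4, 4)  # two beginners and two experts
group_b <- c(2, 2, 3, 3)  # four mid-level players

# Both averages are identical ...
mean(group_a)
mean(group_b)

# ... but the underlying distributions are not.
table(group_a)
table(group_b)
```

A bubble coloured by average level would look the same for both groups.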
The MiniBubble plot makes it possible to show each musician and their corresponding skill level individually:
ggplot(data = MusicianInterestsSmall,
       aes(x = Instrument, y = Genre, col = Level)) +
  geom_point(position = position_surround(), size = 4) +
  scale_colour_manual(values = c("#00e5ff", "#4694ff", "#465aff", "#2c00c9")) +
  theme_bw(base_size = 18)
This is done by passing the position_surround() function to the position argument of geom_point. Note that only exact overlaps will be dodged. The points surround the center in layers, which are filled clockwise.
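What "exact overlap" means can be sketched in base R. This is not the package's implementation; the toy data frame and the columns n_overlap and index are made up for illustration. Only rows with identical x and y values share a position, and only those would need to be dodged.

```r
# Toy data (hypothetical): rows 1 and 2 overlap exactly.
toy <- data.frame(
  x = c("Guitar", "Guitar", "Piano", "Guitar"),
  y = c("Jazz", "Jazz", "Jazz", "Classical")
)

# One key per distinct (x, y) position.
key <- paste(toy$x, toy$y, sep = "|")

# Number of points sharing each position, and a running index within it.
toy$n_overlap <- ave(seq_along(key), key, FUN = length)
toy$index     <- ave(seq_along(key), key, FUN = seq_along)

toy
```

Rows with n_overlap equal to 1 can stay where they are; the others would be spread around their shared center.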
Since each individual data point is shown separately, you can also use shape and fill to show further features, as long as the plot does not become overloaded with information.
Also, you can use geom_text(position = position_surround()) to overlay the points with text, or make the text appear on hover in a shiny app.
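A minimal sketch of the text overlay, assuming ggBubbles is installed and that both layers receive the same offsets because they share the same data and position adjustment. The numeric label, the black text colour, and the text size are illustrative choices, not recommendations from the package.

```r
library(ggplot2)
library(ggBubbles)

data(MusicianInterestsSmall)

# Points and labels use the same position adjustment, so each label
# should be shifted together with its point (an assumption, see above).
p <- ggplot(data = MusicianInterestsSmall,
            aes(x = Instrument, y = Genre, col = Level,
                label = as.integer(Level))) +
  geom_point(position = position_surround(offset = .2), size = 4) +
  geom_text(position = position_surround(offset = .2),
            colour = "black", size = 2)
p
```

Using the integer code of Level keeps the labels short enough to fit inside the points.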
The MiniBubble plot makes it possible to show more features in a bubble plot.
The offset of the dodged points can be passed as a parameter to position_surround().
ggplot(data = MusicianInterestsSmall,
       aes(x = Instrument, y = Genre, col = Level)) +
  geom_point(position = position_surround(offset = .2), size = 4) +
  scale_colour_manual(values = c("#00e5ff", "#4694ff", "#465aff", "#2c00c9")) +
  theme_bw(base_size = 18)
We load a bigger test data set:
data(MusicianInterests)
This data set also contains information about the musicians themselves from the multiple-choice survey.
head(MusicianInterests)
kable(head(MusicianInterests), caption = "First rows of MusicianInterests")
The basis of the plot is simply:
p <- ggplot(data = MusicianInterests,
            aes(x = Genre, y = Instrument, col = Level)) +
  geom_point(size = 1.8, position = position_surround(offset = .2))
Here we add some graphical parameters to make it pretty:
p <- p +
  theme_bw(base_size = 17) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_colour_gradientn(
    colours = rev(topo.colors(2)),
    na.value = "transparent",
    breaks = 1:6,
    labels = c("Interested", "Beginner", "Intermediate",
               "Experienced", "Very experienced", "Pro")) +
  xlab("") +
  ylab("")
p
The position_surround() algorithm determines how many points overlap and then arranges them clockwise around the center in square layers. Please see the graphical illustration here:
n <- 25
X <- data.frame(
  x = rep(1, sum(1:n)),
  y = rep(1, sum(1:n)),
  group = unlist(lapply(1:n, function(x) { rep(x, x) })),
  label = unlist(lapply(1:n, function(x) { 1:x }))
)
ggplot(data = X, aes(x = x, y = y, label = label)) +
  facet_wrap(~group) +
  geom_text(position = position_surround(offset = .4)) +
  theme_minimal() +
  theme(
    strip.background = element_blank(),
    strip.text.x = element_blank()
  ) +
  xlim(c(0, 2)) +
  ylim(c(0, 2))
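The idea of filling square layers clockwise can also be sketched in base R. This is not the code used by position_surround(); surround_offsets() is a hypothetical helper, and the choices made here (the first point stays at the center, each layer starts at 12 o'clock) are assumptions that may differ from what the package does.

```r
# Hypothetical helper, not part of ggBubbles: offsets for n points that
# overlap exactly, arranged clockwise in square layers around the center.
surround_offsets <- function(n, offset = 0.1) {
  out <- NULL
  r <- 0  # current layer: 0 is the center, layer r >= 1 holds 8 * r positions
  while (is.null(out) || nrow(out) < n) {
    # All grid positions whose Chebyshev distance from the center is r.
    ring <- expand.grid(dx = -r:r, dy = -r:r)
    ring <- ring[pmax(abs(ring$dx), abs(ring$dy)) == r, ]
    # Clockwise angle measured from 12 o'clock, in [0, 2 * pi).
    angle <- atan2(ring$dx, ring$dy) %% (2 * pi)
    out <- rbind(out, ring[order(angle), ])
    r <- r + 1
  }
  out <- out[seq_len(n), ]
  rownames(out) <- NULL
  out * offset  # scale grid steps to plot units
}

# Nine overlapping points: the center plus one complete first layer.
surround_offsets(9, offset = 1)
```

With this layout, 25 points fill the center and two complete layers, forming a 5 x 5 square.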
For feedback or suggestions please contact the maintainer: Thomas Schwarzl thomas@schwarzl.net or schwarzl@embl.de.