get_centroids | R Documentation |
Runs a summarizing function for each specified column, for each specified group. This is intended to be used to plot centroids in ellipses in ggplot2 without having to create a new object or have a lot of in-line code. See examples below.
get_centroids(df, .cols, ..., .fns = median)
df |
a dataframe. |
.cols |
columns that should be summarized. For sociophonetic data, this
is usually the names of your vowel columns, e.g. |
... |
grouping variables. For sociophonetic data, this might be speaker and allophone or something. This is just passed into 'group_by'. |
.fns |
one or more names of functions. By default, |
an ungrouped dataframe
Okay technically this function name is a misnomer because we're not truly getting centroids in a mathematical sense. But that's what I think of when I run this so that's what we're going with.
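As a rough picture of what "a summarizing function for each specified column, for each specified group" amounts to, here is a minimal sketch in plain dplyr. It is not the package's source code. It assumes the behavior described above (grouping, summarizing the selected columns with the median, and returning an ungrouped dataframe), and `toy` is hypothetical data standing in for real formant measurements.

```r
library(dplyr)

# Hypothetical toy data: two vowels, three tokens each.
toy <- tibble(
  vowel = c("IY", "IY", "IY", "AA", "AA", "AA"),
  F1    = c(300, 320, 340, 700, 720, 760),
  F2    = c(2200, 2300, 2250, 1100, 1200, 1150)
)

# Assumed equivalent of get_centroids(toy, c(F1, F2), vowel):
# group, take the median of each selected column, then drop the grouping.
centroids_sketch <- toy %>%
  group_by(vowel) %>%
  summarize(across(c(F1, F2), median), .groups = "drop")

centroids_sketch
```

The result has one row per group and one column per summarized variable, which is the shape geom_text() needs for placing one label per ellipse.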
library(tidyverse)
df <- joeysvowels::idahoans
# Basic usage as a summarizing function
df %>%
get_centroids(c(F1, F2), vowel)
# Within a ggplot2 block. Note that you do have to start the data argument with the dot and pipe it into get_centroids, rather than incorporating it in (i.e. get_centroids(., vowel)). Not sure why but this appears to be a constraint imposed by ggplot2.
ggplot(df, aes(F2, F1, color = vowel)) +
stat_ellipse(level = 0.67) +
geom_text(data = . %>% get_centroids(c(F1, F2), vowel), aes(label = vowel)) +
scale_x_reverse() +
scale_y_reverse() +
theme(legend.position = "none")
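The `data = . %>% ...` idiom used above can be understood through two documented behaviors: a magrittr pipeline that starts with `.` is stored as a function rather than evaluated, and a ggplot2 layer whose `data` argument is a function calls that function on the plot's data. The sketch below shows only the first part, with hypothetical `toy` data and plain dplyr standing in for get_centroids.

```r
library(dplyr)  # re-exports the magrittr pipe, %>%

# Hypothetical toy data: two vowels, three tokens each.
toy <- tibble(
  vowel = c("IY", "IY", "IY", "AA", "AA", "AA"),
  F1    = c(300, 320, 340, 700, 720, 760),
  F2    = c(2200, 2300, 2250, 1100, 1200, 1150)
)

# Starting the pipeline with `.` builds a function instead of running it.
label_data <- . %>%
  group_by(vowel) %>%
  summarize(across(c(F1, F2), median), .groups = "drop")

is.function(label_data)

# Calling the function runs the stored pipeline on whatever data is supplied,
# which is what a ggplot2 layer does with the plot's data.
labels <- label_data(toy)
labels
```

This is likely why writing get_centroids(., vowel) directly inside geom_text() does not work: without the leading `. %>%`, there is no function for the layer to call and no object named `.` to evaluate.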
# You can add multiple groups to the code too.
ggplot(df, aes(F2, F1, color = vowel)) +
stat_ellipse(level = 0.67) +
geom_text(data = . %>% get_centroids(c(F1, F2), speaker, vowel), aes(label = vowel)) +
scale_x_reverse() +
scale_y_reverse() +
facet_wrap(~speaker, scales = "free") +
theme(legend.position = "none")
# Like any use of group_by(), additional, perhaps redundant columns may be specified for the purpose of "passing them through." In this example, adding tense_lax doesn't change the calculations, but it's useful for this plot. Additionally, this block of code highlights one strength of get_centroids, and that is that I can pass in a modified dataframe directly to ggplot and then modify it even further to get the labels, without needing to create any new objects.
df %>%
mutate(tense_lax = fct_collapse(vowel,
"tense" = c("IY", "EY", "AO", "OW", "UW"),
"lax" = c("IH", "EH", "AE", "AA", "AH", "UH"))) %>%
ggplot(aes(F2, F1, color = tense_lax, group = vowel)) +
stat_ellipse(level = 0.67) +
geom_text(data = . %>% get_centroids(c(F1, F2), speaker, tense_lax, vowel),
aes(label = vowel)) +
scale_x_reverse() +
scale_y_reverse() +
facet_wrap(~speaker, scales = "free") +
theme(legend.position = "none")
# For column selection, any tidyselect output works, such as matches().
df %>%
get_centroids(matches("F\\d"), speaker, vowel)
# For functions, you can add more than one. Just wrap them up into c().
df %>%
get_centroids(c(F1, F2), .fns = c(median, mean), speaker, vowel)
# However, unless they are named, they won't be useful.
df %>%
get_centroids(c(F1, F2), .fns = c(`med` = median, `average` = mean), speaker, vowel)
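To see why unnamed functions are hard to work with, here is a sketch of how dplyr::across() names its output columns. This assumes get_centroids hands `.fns` to across(), which this page does not confirm, and `toy` is hypothetical data.

```r
library(dplyr)

# Hypothetical toy data: two vowels, three tokens each.
toy <- tibble(
  vowel = c("IY", "IY", "IY", "AA", "AA", "AA"),
  F1    = c(300, 320, 340, 700, 720, 760),
  F2    = c(2200, 2300, 2250, 1100, 1200, 1150)
)

# Unnamed functions: across() falls back to position-based suffixes,
# so the column names do not say which statistic is which.
unnamed <- toy %>%
  group_by(vowel) %>%
  summarize(across(c(F1, F2), list(median, mean)), .groups = "drop")
names(unnamed)

# Named functions: the names become the column suffixes.
named <- toy %>%
  group_by(vowel) %>%
  summarize(across(c(F1, F2), list(med = median, average = mean)), .groups = "drop")
names(named)
```

With names supplied, a later layer can refer to a column such as F1_med directly instead of guessing which numbered column holds the median.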