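library(learnr)
library(gradethis)
# assumed: the exercises below need the tidyverse verbs and the penguins data
library(tidyverse)
library(palmerpenguins)

knitr::opts_chunk$set(
  echo = FALSE,
  exercise.warn_invisible = FALSE
)

# enable code checking
tutorial_options(
  exercise.checker = grade_learnr,
  exercise.lines = 20,
  exercise.reveal_solution = TRUE
)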
In our new summary function, alter the code so that the columns are no longer prefixed with `value_`.
penguins %>% pivot_longer(ends_with("mm")) %>% group_by(name) %>% summarise(across(value, .fns = list(mean = mean, sd = sd, min = min, max = max), na.rm = TRUE), n = length(species) )
penguins %>% pivot_longer(ends_with("mm")) %>% group_by(name) %>% summarise(across(value, .fns = list(mean = mean, sd = sd, min = min, max = max), na.rm = TRUE, .names = "{.fn}"), n = length(species) )
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Try using the `.names` argument in across.
The internal placeholder for the function names in across is `.fn`.
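To see what the `.names` argument does in isolation, here is a minimal sketch on a small made-up data frame (the `toy` data and its values are hypothetical, not part of the tutorial). It compares the default column names with names built from the `{.fn}` placeholder only.

```r
library(dplyr)

# Hypothetical toy data standing in for the pivoted penguins data
toy <- data.frame(
  name  = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 20)
)

# Default naming pattern is "{.col}_{.fn}", so columns get the "value_" prefix
default_names <- toy %>%
  group_by(name) %>%
  summarise(across(value, .fns = list(mean = mean, sd = sd)))

# "{.fn}" keeps only the name given to each function in the list
short_names <- toy %>%
  group_by(name) %>%
  summarise(across(value, .fns = list(mean = mean, sd = sd), .names = "{.fn}"))

names(default_names)  # "name" "value_mean" "value_sd"
names(short_names)    # "name" "mean" "sd"
```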
Adapt the code so that `NA`s in the values are removed during the `pivot_longer()`, and try to remove `na.rm = TRUE` from the `across()`. Does that work? Why?
penguins %>% pivot_longer(ends_with("mm")) %>% group_by(name) %>% summarise(across(value, .fns = list(mean = mean, sd = sd, min = min, max = max), .names = "{.fn}"), n = length(species) )
penguins %>% pivot_longer(ends_with("mm"), values_drop_na = TRUE) %>% group_by(name) %>% summarise(across(value, .fns = list(mean = mean, sd = sd, min = min, max = max), .names = "{.fn}"), n = length(species) )
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Check out the `values_drop_na` argument in `pivot_longer()`.
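As a rough illustration of why dropping `NA`s during the pivot can make `na.rm = TRUE` unnecessary later, the sketch below pivots a tiny made-up data frame (the `toy` data is hypothetical) with and without `values_drop_na = TRUE`.

```r
library(tidyr)

# Hypothetical toy data with one missing measurement
toy <- data.frame(
  id        = c(1, 2),
  length_mm = c(10, NA),
  depth_mm  = c(5, 6)
)

# Without values_drop_na the NA survives as a row in the long data
kept <- pivot_longer(toy, ends_with("mm"))

# With values_drop_na = TRUE the row holding the NA is removed
dropped <- pivot_longer(toy, ends_with("mm"), values_drop_na = TRUE)

nrow(kept)           # 4
nrow(dropped)        # 3
mean(kept$value)     # NA
mean(dropped$value)  # 7
```

Because the missing rows are already gone, summary functions such as `mean()` no longer return `NA`, and row counts such as `length()` only count observed values.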
Adapt the code so that the "n" is captured within the across function list, like the other four functions. How is the output different?
penguins %>% pivot_longer(ends_with("mm"), values_drop_na = TRUE) %>% group_by(name) %>% summarise(across(value, .fns = list(mean = mean, sd = sd, min = min, max = max), .names = "{.fn}"), n = length(species) )
penguins %>% pivot_longer(ends_with("mm"), values_drop_na = TRUE) %>% group_by(name) %>% summarise(across(value, .fns = list(mean = mean, sd = sd, min = min, max = max, n = length), .names = "{.fn}") )
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Try grouping by more variables. Is the outcome as you expect?
penguins %>% pivot_longer(ends_with("mm"), values_drop_na = TRUE) %>% group_by(name) %>% summarise(across(value, .fns = list(mean = mean, sd = sd, min = min, max = max, n = length), .names = "{.fn}") )
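The plotting exercises that follow use an object called `penguins_sum`, which is not defined in this section. One plausible way to build it is sketched below, assuming it is the same summary grouped additionally by `species` and `island`. Those grouping variables are inferred from the aesthetics used in the plots, and `.groups = "drop"` is added here only to return an ungrouped result.

```r
library(dplyr)
library(tidyr)

# Assumption: penguins_sum is the summary above, grouped by measurement name,
# species and island (inferred from the plots, not stated in the tutorial)
penguins_sum <- palmerpenguins::penguins %>%
  pivot_longer(ends_with("mm"), values_drop_na = TRUE) %>%
  group_by(name, species, island) %>%
  summarise(
    across(value,
           .fns = list(mean = mean, sd = sd, min = min, max = max, n = length),
           .names = "{.fn}"),
    .groups = "drop"  # return an ungrouped data frame
  )

penguins_sum
```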
Create a bar chart based on the penguins summary data, where the mean values are on the x axis and species are on the y axis. Make sure to dodge the bars for easier comparisons. Create subplots for the different metrics.
__ %>% ggplot(aes(x = __, y = __, fill = __)) + geom_bar(stat = __) + facet___(~ __, scales = "free")
penguins_sum %>% ggplot(aes(x = mean, y = species, fill = island)) + geom_bar(stat = "identity", position = "dodge") + facet_wrap(~ name, scales = "free")
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Adapt the code to plot the standard deviation rather than the mean.
penguins_sum %>% ggplot(aes(x = mean, y = species, fill = island)) + geom_bar(stat = "identity", position = "dodge") + facet_wrap(~ name, scales = "free")
penguins_sum %>% ggplot(aes(x = sd, y = species, fill = island)) + geom_bar(stat = "identity", position = "dodge") + facet_wrap(~ name, scales = "free")
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Swap around the plot to explore different ways of looking at the same data. Try swapping the axes, so that x is on the y, and vice versa. Or try having species on one axis and using island as colour fill. Which variation do you think best shows the differences between groups?
penguins_sum %>% ggplot(aes(x = species, y = sd, fill = island)) + geom_bar(stat = "identity", position = "dodge") + facet_wrap(~ name, scales = "free")
Try adding an error bar that indicates the standard deviation of the measurement, while the bars indicate the mean values. What does the geom_errorbar need to plot the error bars?
penguins_sum %>% ggplot(aes(x = mean, y = species, fill = island, group = __)) + geom_bar(stat = "identity", position = "dodge") + geom___(aes(__ = _ , __ = _), position = "dodge") + facet_wrap(~ name, scales = "free")
penguins_sum %>% ggplot(aes(x = mean, y = species, fill = island, group = island)) + geom_bar(stat = "identity", position = "dodge") + geom_errorbar(aes(xmin = mean - sd , xmax = mean + sd), position = "dodge") + facet_wrap(~ name, scales = "free")
grade_code( correct = random_praise(), incorrect = random_encouragement() )
The geom for error bars is called `geom_errorbar`.
Error bars are lines between two points. Try looking for a way to pass the minimum and maximum values for the x-axis.
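To illustrate what `geom_errorbar` needs, here is a minimal sketch on made-up summary data (the `toy` data frame is hypothetical). With the values on the x axis, the error bar takes its two end points through the `xmin` and `xmax` aesthetics.

```r
library(ggplot2)

# Hypothetical summary data: one mean and one sd per group
toy <- data.frame(
  group = c("a", "b"),
  mean  = c(10, 20),
  sd    = c(1, 2)
)

# Bars show the mean; error bars span mean - sd to mean + sd along the x axis
p <- ggplot(toy, aes(x = mean, y = group)) +
  geom_bar(stat = "identity") +
  geom_errorbar(aes(xmin = mean - sd, xmax = mean + sd))

p
```

If the values were on the y axis instead, the corresponding aesthetics would be `ymin` and `ymax`.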
Try pivoting the data even longer! Pivot longer all the stats columns so that the column names are in a column named "stat". Now, create another plot using the "value" column on the y-axis, and creating subplots based on both the observation name AND the stat!
penguins_sum %>% pivot_longer(all_of(c(__)), __ = __) %>% ggplot(aes(x = __, y = __, fill = island)) + geom_bar(stat = "identity", position = "dodge") + facet___( __ ~ name, scales = "free")
penguins_sum %>% pivot_longer(all_of(c("mean", "sd", "min", "max")), names_to = "stat") %>% ggplot(aes(x = species, y = value, fill = island)) + geom_bar(stat = "identity", position = "dodge") + facet_grid(stat ~ name, scales = "free")
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Try adding the names of the columns to pivot in quotation marks within the c().
Try the `facet_grid` function, which takes the syntax `facet_grid(rows_column ~ columns_column)`.
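The two steps in this exercise, pivoting the statistics into a `stat` column and then faceting on two variables, can be tried out on a small made-up summary first (the `toy_sum` data is hypothetical and only has two statistics).

```r
library(tidyr)
library(ggplot2)

# Hypothetical summary data with two statistics per measurement and species
toy_sum <- data.frame(
  name    = c("length", "length", "depth", "depth"),
  species = c("x", "y", "x", "y"),
  mean    = c(10, 12, 5, 6),
  sd      = c(1, 1.5, 0.5, 0.7)
)

# Column names go into "stat"; their contents go into the default "value" column
toy_long <- pivot_longer(toy_sum, all_of(c("mean", "sd")), names_to = "stat")

# Rows of the facet grid are the statistics, columns are the measurements
p <- ggplot(toy_long, aes(x = species, y = value)) +
  geom_bar(stat = "identity") +
  facet_grid(stat ~ name, scales = "free")

names(toy_long)  # "name" "species" "stat" "value"
p
```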