library(learnr)
library(gradethis)
knitr::opts_chunk$set(
  echo = FALSE,
  exercise.warn_invisible = FALSE
)
# enable code checking
tutorial_options(
  exercise.checker = grade_learnr,
  exercise.lines = 8,
  exercise.reveal_solution = TRUE
)
Create a column named `bill_ld_ratio` that is the value of `bill_length_mm` divided by `bill_depth_mm`.
penguins %>% mutate(_ = _ / _) %>% select(species, island, contains("bill"))
penguins %>% mutate(bill_ld_ratio = bill_length_mm / bill_depth_mm) %>% select(species, island, contains("bill"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
This exercise expects the data to be piped into the `mutate()` function.
Make sure you have given the new column the correct name
Transform the body mass column to the logarithmic scale using the `log()` function.
penguins %>% mutate(body_mass_g_log = __(body_mass_g))
penguins %>% mutate(body_mass_g_log = log(body_mass_g))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Wrap the body mass column in the `log()` function.
Divide all values in the flipper length column by 10, and store the result in a column called `flipper_length_cm`.
penguins %>% mutate(_ = _/10)
penguins %>% mutate(flipper_length_cm = flipper_length_mm/10)
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Divide `flipper_length_mm` by 10, and make sure the new column is named `flipper_length_cm`.
Adapt the code below to evaluate whether body mass is below 4.5 kg, and label rows where this is TRUE as "normal" and rows where it is FALSE as "large".
penguins %>% mutate(body_type = ifelse(body_mass_g _ 4500, "_", "_")) %>% select(species, island, contains("body"))
penguins %>% mutate(body_type = ifelse(body_mass_g < 4500, "normal", "large")) %>% select(species, island, contains("body"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
After the logical expression in `ifelse()`, the first value is used where the expression is `TRUE`, and the second where it is `FALSE`.
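As a minimal sketch of that argument order, the snippet below applies `ifelse()` to a small made-up vector (the values are illustrative assumptions, not taken from the penguins data). Note that a mass of exactly 4500 is not below 4500, and that a missing value stays missing.

```r
# Toy body masses in grams (made-up values for illustration only)
body_mass_g <- c(3200, 4500, 5100, NA)

# ifelse(test, value_if_TRUE, value_if_FALSE)
body_type <- ifelse(body_mass_g < 4500, "normal", "large")

print(body_type)
```

The `NA` in the input produces an `NA` in the output, because the comparison itself is neither `TRUE` nor `FALSE`.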
penguins %>% mutate(biscoe = ifelse(island __ "Biscoe", __, __)) %>% select(species, island, biscoe)
penguins %>% mutate(biscoe = ifelse(island == "Biscoe", TRUE, FALSE)) %>% select(species, island, biscoe)
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Have you used the correct sign for equals, `==`?
In `ifelse()`, after the logical expression, the first value is used for `TRUE` and the second for `FALSE`.
Adapt the code below so that penguins with a body mass below 3 kg are "petite".
penguins %>%
  mutate(
    body_type = case_when(
      body_mass_g _ 4500 ~ "large",
      body_mass_g _ 3000 ~ "petite",
      _ ~ "normal") # the rest
  ) %>%
  select(species, island, contains("body"))
penguins %>%
  mutate(
    body_type = case_when(
      body_mass_g > 4500 ~ "large",
      body_mass_g < 3000 ~ "petite",
      !is.na(body_mass_g) ~ "normal") # the rest
  ) %>%
  select(species, island, contains("body"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Have you used the correct sign for 'larger than'?
Have you used the correct sign for 'smaller than'?
Evaluate whether something is NA with the `is.na()` function.
To flip a logical, so that TRUE becomes FALSE and vice versa, add `!` in front of the expression. It means "not".
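To see why the last condition matters, here is a small sketch on a made-up tibble (the `toy` data and the column `body_type_all` are assumptions for illustration). It compares `!is.na(body_mass_g)` as the final condition with a catch-all `TRUE`.

```r
library(dplyr)

# Made-up body masses, including one missing value
toy <- tibble(body_mass_g = c(2900, 3800, 4700, NA))

result <- toy %>%
  mutate(
    # Final condition only matches rows where body mass is known
    body_type = case_when(
      body_mass_g > 4500 ~ "large",
      body_mass_g < 3000 ~ "petite",
      !is.na(body_mass_g) ~ "normal"
    ),
    # Catch-all TRUE also labels the missing row as "normal"
    body_type_all = case_when(
      body_mass_g > 4500 ~ "large",
      body_mass_g < 3000 ~ "petite",
      TRUE ~ "normal"
    )
  )

print(result)
```

With `!is.na()`, the row with a missing body mass keeps a missing `body_type`; with `TRUE`, it is labelled "normal" even though its mass is unknown.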
4a: Transform all the columns with millimetre measurements so that they are scaled, and add the prefix "sc_" to the column names.
penguins %>% mutate(across(ends_with("mm"), scale, .names = "{.col}_sc")) %>% select(contains("mm"))
penguins %>% mutate(across(ends_with("mm"), scale, .names = "sc_{.col}")) %>% select(contains("mm"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Change "_sc" to "sc_" and move it before "{.col}".
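The sketch below shows how the `.names` pattern controls the new column names, using a small made-up tibble (the `toy` data is an assumption for illustration). `{.col}` stands for the original column name, so text placed before it becomes a prefix.

```r
library(dplyr)

# Made-up measurements in millimetres
toy <- tibble(
  bill_length_mm    = c(38, 40, 42),
  flipper_length_mm = c(180, 190, 200)
)

result <- toy %>%
  mutate(across(ends_with("mm"), scale, .names = "sc_{.col}"))

# The original columns are kept; scaled copies get the "sc_" prefix
print(names(result))
```

Because `.names` is supplied, the original columns are not overwritten; the scaled versions are added alongside them.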
Do the same, but now only for the bill measurements.
penguins %>% mutate(across(__("__"), scale, .names = "{.col}")) %>% select(__("__"))
penguins %>% mutate(across(starts_with("bill"), scale, .names = "sc_{.col}")) %>% select(starts_with("bill"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Change `ends_with` to `starts_with`, and match on "bill" instead of "mm".
Do the same, but now for all the numeric columns.
penguins %>% mutate(across(__(is.__), scale, .names = "sc_{.col}")) %>% select(__(is.__))
penguins %>% mutate(across(where(is.numeric), scale, .names = "sc_{.col}")) %>% select(where(is.numeric))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Try to remember the `where()` function and how it works.
Try `where(is.numeric)`.
Adapt the code below, so that you de-mean the bill length column. Do it in two steps, first by making a column for the species means, then using that data to de-mean the bill length.
penguins %>% group_by(_) %>% mutate( bill_length_sp_mean = __(bill_length_mm, na.rm = TRUE), bill_length_cent = __ - __ ) %>% select(species, island, starts_with("bill"))
penguins %>% group_by(species) %>% mutate( bill_length_sp_mean = mean(bill_length_mm, na.rm = TRUE), bill_length_cent = bill_length_mm - bill_length_sp_mean ) %>% select(species, island, starts_with("bill"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Did you make sure the column names are correct?
Do the same, but now in a single step (i.e. do not store the species mean in its own column)
penguins %>% group_by(_) %>% mutate( bill_length_cent = ) %>% select(species, island, starts_with("bill"))
penguins %>% group_by(species) %>% mutate( bill_length_cent = bill_length_mm - mean(bill_length_mm, na.rm = TRUE) ) %>% select(species, island, starts_with("bill"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Did you make sure the column names are correct?
Based on the code in the previous example, adapt it to be grouped by island instead of species.
penguins %>% group_by(_) %>% mutate( bill_length_sp_max = __(bill_length_mm, na.rm = TRUE), bill_length_pc = (bill_length_mm/__)*100 ) %>% select(species, island, contains("bill"))
penguins %>% group_by(island) %>% mutate( bill_length_sp_max = max(bill_length_mm, na.rm = TRUE), bill_length_pc = (bill_length_mm/bill_length_sp_max)*100 ) %>% select(species, island, contains("bill"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Did you make sure the column names are correct?
De-mean all the numerical columns within each species, and give them a suffix of "_dm".
penguins %>% group_by(__) %>% mutate( across(__(__), ~ _ - mean(_, na.rm = TRUE), .names = "_") ) %>% select(species, where(is.numeric))
penguins %>% group_by(species) %>% mutate( across(where(is.numeric), ~ .x - mean(.x, na.rm = TRUE), .names = "{.col}_dm") ) %>% select(species, where(is.numeric))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Do you remember the `where` and `is.numeric` functions?
Inside `across()`, the placeholder for the column values is `.x`.
Inside `.names`, the placeholder for the column name is `{.col}`.
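A small sketch of the two placeholders working together, on a made-up tibble (the `toy` data and its species labels "A" and "B" are assumptions for illustration): `.x` is the column being processed, and `{.col}` is its name.

```r
library(dplyr)

# Made-up data with two groups
toy <- tibble(
  species        = c("A", "A", "B", "B"),
  bill_length_mm = c(40, 42, 50, 54),
  body_mass_g    = c(3000, 3400, 5000, 5200)
)

result <- toy %>%
  group_by(species) %>%
  mutate(
    # .x is each numeric column in turn; the mean is taken per species
    across(where(is.numeric), ~ .x - mean(.x, na.rm = TRUE), .names = "{.col}_dm")
  ) %>%
  ungroup()

print(result)
```

Each `_dm` column sums to zero within a species, which is a quick way to check that the de-meaning used the group means rather than the overall mean.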
What is the difference in the minimum body mass for the penguins data set when the data is grouped by species, compared with when it is ungrouped?
## 6a
penguins %>%
  group_by(species) %>%
  mutate(
    body_mass_min = min(body_mass_g, na.rm = TRUE)
  )
## 6a
penguins %>%
  group_by(species) %>%
  ungroup() %>%
  mutate(
    body_mass_min = min(body_mass_g, na.rm = TRUE)
  )
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Use `ungroup()` after the `group_by()` to ungroup the data.
We will see more examples of how grouping and then ungrouping can give some real power when working with data.
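As a rough illustration of the difference, the sketch below computes the minimum on a made-up tibble (the `toy` data is an assumption, not the penguins data) once with grouping and once after `ungroup()`.

```r
library(dplyr)

# Made-up body masses for two species
toy <- tibble(
  species     = c("A", "A", "B", "B"),
  body_mass_g = c(3000, 3400, 5000, 5200)
)

# Grouped: the minimum is computed separately for each species
grouped <- toy %>%
  group_by(species) %>%
  mutate(body_mass_min = min(body_mass_g, na.rm = TRUE)) %>%
  ungroup()

# Ungrouped: one overall minimum is repeated on every row
ungrouped <- toy %>%
  group_by(species) %>%
  ungroup() %>%
  mutate(body_mass_min = min(body_mass_g, na.rm = TRUE))

print(grouped$body_mass_min)
print(ungrouped$body_mass_min)
```

In the grouped version each species gets its own minimum, while the ungrouped version gives every row the same overall minimum.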