library(tidyquintro)
library(learnr)
library(gradethis)
knitr::opts_chunk$set(echo = FALSE, exercise.warn_invisible = FALSE)
# enable code checking
tutorial_options(exercise.checker = grade_learnr)
Create a column named bill_ld_ratio that is the value of bill_length_mm divided by bill_depth_mm.
penguins |> mutate(_ = _ / _) |> select(species, island, contains("bill"))
penguins |> mutate(bill_ld_ratio = bill_length_mm / bill_depth_mm) |> select(species, island, contains("bill"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
This exercise expects the data to be piped into the mutate function.
Make sure you have given the new column the correct name.
Sometimes, we want to assign certain data values based on other variables in the data set. For instance, maybe we want to classify all penguins with body mass above 4.5 kg as "large", while the rest are "normal".
The ifelse function takes logical expressions, much like filter. The first value after the expression is the value assigned if the expression is TRUE, while the second is the value assigned if the expression is FALSE.
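As a minimal sketch of how ifelse behaves on its own, outside of mutate, the example below uses a made-up vector of flipper lengths (the values and the names flipper_length and flipper_type are hypothetical, not taken from the penguins data). It also shows that a missing value in the condition gives a missing value in the result.

```r
# Hypothetical flipper lengths in mm; the NA stands for a missing measurement
flipper_length <- c(181, 210, NA, 195)

# Condition first, then the value for TRUE, then the value for FALSE
flipper_type <- ifelse(flipper_length > 200, "long", "short")

print(flipper_type)
# "short" "long" NA "short"
```

Inside mutate the call looks the same, except that the vector is a column in the piped data.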
Adapt the code below to evaluate whether body mass is above 4.5 kg, and assign rows to either "large" or "normal".
penguins |> mutate(body_type = ifelse(body_mass_g _ 4500, "large", "normal")) |> select(species, island, contains("body"))
penguins |> mutate(body_type = ifelse(body_mass_g > 4500, "large", "normal")) |> select(species, island, contains("body"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Have you used the correct sign for 'larger than'?
Often, we want to do the same as above, but with more than two options.
We can then use case_when from dplyr. This function is similar to ifelse, except that you specify the value to assign for each condition separately. On the left of the tilde (~) is the logical expression, and on the right is the value to be assigned if that expression is TRUE.
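The sketch below illustrates case_when on a made-up vector of bill depths (the values, cut-offs and names are hypothetical). It assumes dplyr is installed. Conditions are checked from top to bottom and the first one that is TRUE decides the value, so the final TRUE line acts as a catch-all, and that catch-all also picks up missing values.

```r
library(dplyr)

# Hypothetical bill depths in mm; the NA stands for a missing measurement
bill_depth <- c(14.2, 17.5, 20.1, NA)

depth_type <- case_when(
  bill_depth > 19 ~ "deep",
  bill_depth > 16 ~ "medium", # only used when the condition above is not TRUE
  TRUE ~ "shallow"            # everything else, including the NA
)

print(depth_type)
# "shallow" "medium" "deep" "shallow"
```

If missing values should stay missing, one option is to add a condition such as is.na(bill_depth) ~ NA_character_ before the other conditions.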
Adapt the code below so that penguins with body mass below 3 kg are "petite".
penguins |>
  mutate(
    body_type = case_when(
      body_mass_g _ 4500 ~ "large",
      body_mass_g _ 3000 ~ "petite",
      TRUE ~ "normal") # the rest
  ) |>
  select(species, island, contains("body"))

penguins |>
  mutate(
    body_type = case_when(
      body_mass_g > 4500 ~ "large",
      body_mass_g < 3000 ~ "petite",
      TRUE ~ "normal") # the rest
  ) |>
  select(species, island, contains("body"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Have you used the correct signs for 'larger than' and 'smaller than'?
Sometimes, it makes sense to calculate values based on some grouping variable. In this data set that could be, for instance, species, island or sex. In other cases it might be other variables, like subject (for longitudinal designs) or treatment group.
When data is grouped by one or more columns, summary measures are calculated per group and can be used in calculations on each individual score. This is powerful when you want to express a score relative to its group, such as a percentage of the group maximum, or other relational measures (like time since baseline).
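To make the idea concrete, here is a minimal sketch using a small made-up data frame (the data and the column names group, score, group_mean and score_centered are hypothetical). It assumes dplyr is installed. After group_by, the mean inside mutate is calculated separately for each group, and every row is then compared to the mean of its own group.

```r
library(dplyr)

# Hypothetical toy data: two groups with three scores each, one score missing
scores <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  score = c(2, 4, 6, 10, 6, NA)
)

scores_rel <- scores |>
  group_by(group) |>
  mutate(
    group_mean = mean(score, na.rm = TRUE), # one mean per group, repeated on each row
    score_centered = score - group_mean     # each score relative to its own group mean
  ) |>
  ungroup() # drop the grouping so later steps work on the whole data again

print(scores_rel)
```

Group "a" gets a mean of 4 and group "b" a mean of 8, so the same score of 6 ends up above its group mean in "a" and below it in "b".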
Adapt the code below so that you get each penguin's bill length as a percentage of the maximum bill length for its species.
penguins |> group_by(_) |> mutate( bill_length_sp_max = max(__, na.rm = TRUE), bill_length_pc = (bill_length_mm/__)*100 ) |> select(species, island, contains("bill"))
penguins |> group_by(species) |> mutate( bill_length_sp_max = max(bill_length_mm, na.rm = TRUE), bill_length_pc = (bill_length_mm/bill_length_sp_max)*100 ) |> select(species, island, contains("bill"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Did you make sure the column names are correct?
It is possible that the islands have some impact on the penguins' size. Perhaps one island has more food available or fewer predators, so the penguins become larger.
Based on the code in the previous example, adapt it to be grouped by island instead of species.
# Copy the code from the previous example, or type it out.
penguins |> group_by(island) |> mutate( bill_length_sp_max = max(bill_length_mm, na.rm = TRUE), bill_length_pc = (bill_length_mm/bill_length_sp_max)*100 ) |> select(species, island, contains("bill"))
grade_code( correct = random_praise(), incorrect = random_encouragement() )
Did you make sure the column names are correct?