In Athanasiamo/swc.tidyverse: Introduction to Tidyverse

library(learnr)
library(gradethis)

knitr::opts_chunk$set(
  echo = FALSE,
  exercise.warn_invisible = FALSE
)

# enable code checking
tutorial_options(
  exercise.checker = grade_learnr,
  exercise.lines = 8,
  exercise.reveal_solution = TRUE
)

Challenge 1

1a

Create a column named bill_ld_ratio that is the value of bill_length_mm divided by bill_depth_mm

penguins %>% 
  mutate(_ = _ / _) %>% 
  select(species, island, contains("bill"))

penguins %>% 
  mutate(bill_ld_ratio = bill_length_mm / bill_depth_mm) %>% 
  select(species, island, contains("bill"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

This exercise expects the data to be piped into the `mutate` function.

Make sure you have given the new column the correct name.

1b

Transform the body mass column to the logarithmic scale using the `log` function.

penguins %>% 
  mutate(body_mass_g_log = __(body_mass_g))

penguins %>% 
  mutate(body_mass_g_log = log(body_mass_g))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Use the `log()` function around body mass.

1c

Divide all values in the flipper length column by 10, and store the result in a column called flipper_length_cm.

penguins %>% 
  mutate(_  = _/10)

penguins %>% 
  mutate(flipper_length_cm  = flipper_length_mm/10)

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Divide `flipper_length_mm` by 10 using the `/` operator.

Challenge 2

2a

Adapt the code below to evaluate if body mass is below 4.5 kg, and assign rows that are TRUE to be "normal" and rows that are FALSE to be "large".

penguins %>% 
  mutate(body_type = ifelse(body_mass_g _ 4500, "_", "_")) %>% 
  select(species, island, contains("body"))

penguins %>% 
  mutate(body_type = ifelse(body_mass_g < 4500, "normal", "large")) %>% 
  select(species, island, contains("body"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

After the logical expression in `ifelse`, the first value should be the `TRUE` value, and the second the `FALSE` value.
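The argument order in `ifelse` can be checked on a small example. This is a minimal sketch that assumes dplyr is installed and uses a hypothetical four-row table (`toy`) rather than the penguins data. Note that a value of exactly 4500 is not below 4500, and that a missing body mass stays missing.

```r
library(dplyr)

# Hypothetical toy data, not the penguins data set
toy <- tibble(body_mass_g = c(3200, 4500, 5100, NA))

toy_typed <- toy %>%
  # ifelse(test, value if TRUE, value if FALSE)
  mutate(body_type = ifelse(body_mass_g < 4500, "normal", "large"))

toy_typed$body_type
# "normal" "large" "large" NA
```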

2b

penguins %>% 
  mutate(biscoe = ifelse(island __ "Biscoe", __, __)) %>% 
  select(species, island, biscoe)

penguins %>% 
  mutate(biscoe = ifelse(island == "Biscoe", TRUE, FALSE)) %>% 
  select(species, island, biscoe)

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Have you used the correct sign for equals, `==`?

In `ifelse`, after the logical expression, the first value should be the `TRUE` value, and the second the `FALSE` value.

Challenge 3

3a

Adapt the code below so that penguins with a body mass below 3 kg are "petite".

penguins %>% 
  mutate(
    body_type = case_when(
      body_mass_g _ 4500 ~ "large",
      body_mass_g _ 3000 ~ "petite",
      _ ~ "normal") # the rest
  ) %>% 
  select(species, island, contains("body"))

penguins %>% 
  mutate(
    body_type = case_when(
      body_mass_g > 4500 ~ "large",
      body_mass_g < 3000 ~ "petite",
      !is.na(body_mass_g) ~ "normal") # the rest
  ) %>% 
  select(species, island, contains("body"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Have you used the correct sign for 'larger than'?

Have you used the correct sign for 'smaller than'?

Evaluate if something is NA with the `is.na` function.

To flip a logical, so that TRUE becomes FALSE and vice versa, add `!` in the expression. It means "not".
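The hints above can be tried on a small example. This sketch assumes dplyr is installed and uses a hypothetical toy table instead of the penguins data. Conditions in `case_when` are checked from top to bottom, and a row that matches none of them gets `NA`, which is why `!is.na(body_mass_g)` works as "the rest" without labelling missing values as "normal".

```r
library(dplyr)

# Hypothetical toy data, not the penguins data set
toy <- tibble(body_mass_g = c(2900, 3500, 4800, NA))

toy_typed <- toy %>%
  mutate(
    body_type = case_when(
      body_mass_g > 4500 ~ "large",
      body_mass_g < 3000 ~ "petite",
      !is.na(body_mass_g) ~ "normal" # the rest, but only where a value exists
    )
  )

toy_typed$body_type
# "petite" "normal" "large" NA
```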

Challenge 4

4a: Transform all the columns with millimetre measurements so they are scaled, and add the prefix "sc_" to the column names.

penguins %>% 
  mutate(across(ends_with("mm"),
                scale, 
                .names = "{.col}_sc")) %>% 
  select(contains("mm"))

penguins %>% 
  mutate(across(ends_with("mm"),
                scale, 
                .names = "sc_{.col}")) %>% 
  select(contains("mm"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Change "_sc" to "sc_" and move it before "{.col}".
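To see what the `.names` argument does, it can help to look only at the resulting column names. The sketch below assumes dplyr is installed and uses a hypothetical two-column table; `{.col}` is replaced by each original column name, and the new columns are added after the existing ones.

```r
library(dplyr)

# Hypothetical toy data standing in for the penguins measurements
toy <- tibble(
  bill_length_mm = c(39.1, 46.5, 50.0),
  flipper_length_mm = c(181, 210, 230)
)

scaled <- toy %>%
  mutate(across(ends_with("mm"), scale, .names = "sc_{.col}"))

names(scaled)
# "bill_length_mm" "flipper_length_mm" "sc_bill_length_mm" "sc_flipper_length_mm"
```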

4b

Do the same, but now only for the bill measurements.

penguins %>% 
  mutate(across(__("__"),
                scale, 
                .names = "{.col}")) %>% 
  select(__("__"))

penguins %>% 
  mutate(across(starts_with("bill"),
                scale, 
                .names = "sc_{.col}")) %>% 
  select(starts_with("bill"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Change `ends_with` to `starts_with`, and match on "bill" instead of "mm".

4c

Do the same, but now for all the numeric columns.

penguins %>% 
  mutate(across(__(is.__),
                scale, 
                .names = "sc_{.col}")) %>% 
  select(__(is.__))

penguins %>% 
  mutate(across(where(is.numeric),
                scale, 
                .names = "sc_{.col}")) %>% 
  select(where(is.numeric))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Try remembering the `where` function and how it works.

Try `where(is.numeric)`.
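As a reminder of how `where` behaves, here is a minimal sketch on a hypothetical toy table, assuming dplyr is installed. `where` applies the given function to every column and keeps the columns for which it returns `TRUE`; `is.numeric` is `TRUE` for both integer and double columns.

```r
library(dplyr)

# Hypothetical toy data with one character and two numeric columns
toy <- tibble(
  species = c("a", "b"),
  body_mass_g = c(3750, 3800),
  year = c(2007L, 2008L)
)

numeric_only <- toy %>%
  select(where(is.numeric))

names(numeric_only)
# "body_mass_g" "year"
```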

Challenge 5

5a

Adapt the code below, so that you de-mean the bill length column. Do it in two steps, first by making a column for the species means, then using that data to de-mean the bill length.

penguins %>% 
  group_by(_) %>% 
  mutate(
    bill_length_sp_mean = __(bill_length_mm, na.rm = TRUE),
    bill_length_cent = __ - __
  ) %>% 
  select(species, island, starts_with("bill"))

penguins %>% 
  group_by(species) %>% 
  mutate(
    bill_length_sp_mean = mean(bill_length_mm, na.rm = TRUE),
    bill_length_cent = bill_length_mm - bill_length_sp_mean
  ) %>% 
  select(species, island, starts_with("bill"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Did you make sure the column names are correct?

5b

Do the same, but now in a single step (i.e. do not store the species mean in its own column)

penguins %>% 
  group_by(_) %>% 
  mutate(
    bill_length_cent = 
  ) %>% 
  select(species, island, starts_with("bill"))

penguins %>% 
  group_by(species) %>% 
  mutate(
    bill_length_cent = bill_length_mm - mean(bill_length_mm, na.rm = TRUE)
  ) %>% 
  select(species, island, starts_with("bill"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Did you make sure the column names are correct?

5c

Based on the code in the previous example, adapt it to be grouped by island instead of species.

penguins %>% 
  group_by(island) %>% 
  mutate(
    bill_length_sp_max = 
    bill_length_pc = bill_length_mm/max(bill_length_mm, na.rm = TRUE
  ) %>% 
  select(species, island, contains("bill"))

penguins %>% 
  group_by(island) %>% 
  mutate(
    bill_length_sp_max = max(bill_length_mm, na.rm = TRUE),
    bill_length_pc = (bill_length_mm/bill_length_sp_max)*100
  ) %>% 
  select(species, island, contains("bill"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Did you make sure the column names are correct?

5d

De-mean all the numerical columns, and give them a suffix of "_dm".

penguins %>% 
  group_by(__) %>% 
  mutate(
    across(__(__),
           ~ _ - mean(_, na.rm = TRUE),
           .names = "_")
  ) %>% 
  select(species, where(is.numeric))

penguins %>% 
  group_by(species) %>% 
  mutate(
    across(where(is.numeric),
           ~ .x - mean(.x, na.rm = TRUE),
           .names = "{.col}_dm")
  ) %>% 
  select(species, where(is.numeric))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Do you remember the `where` and `is.numeric` functions?

The across internal placeholder for column values is `.x`

The across internal placeholder for column names is `.col`
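The two placeholders can be seen together in a small grouped example. This is a sketch on a hypothetical toy table, assuming dplyr is installed: `.x` stands for the values of the current column within the current group, and `{.col}` for its name.

```r
library(dplyr)

# Hypothetical toy data: two groups with two rows each
toy <- tibble(
  species = c("a", "a", "b", "b"),
  mass = c(1, 3, 10, 20)
)

demeaned <- toy %>%
  group_by(species) %>%
  mutate(
    across(where(is.numeric),
           ~ .x - mean(.x, na.rm = TRUE), # subtract the group mean
           .names = "{.col}_dm")
  ) %>%
  ungroup()

demeaned$mass_dm
# -1 1 -5 5
```

The group means are 2 and 15, so each value is expressed as its distance from the mean of its own group.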

Challenge 6

6a

What is the difference in the minimum body mass for the penguins data set when the data is grouped by species and when it is ungrouped?

## 6a
penguins %>% 
  group_by(species) %>% 

  mutate(
    body_mass_min = min(body_mass_g, na.rm = TRUE)
  )

## 6a
penguins %>% 
  group_by(species) %>% 
  ungroup() %>% 
  mutate(
    body_mass_min = min(body_mass_g, na.rm = TRUE)
  )

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Use `ungroup()` after the `group_by` to ungroup the data.

We will see more examples of how grouping and then ungrouping can give some real power when working with data.
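The effect of grouping on a summary function inside `mutate` can be compared side by side. This is a minimal sketch on a hypothetical toy table, assuming dplyr is installed; the numbers are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data: two species with two rows each
toy <- tibble(
  species = c("a", "a", "b", "b"),
  body_mass_g = c(3000, 5000, 2000, 8000)
)

# Grouped: the minimum is computed within each species
grouped <- toy %>%
  group_by(species) %>%
  mutate(body_mass_min = min(body_mass_g, na.rm = TRUE)) %>%
  ungroup()

# Ungrouped: the minimum is computed over all rows
ungrouped <- toy %>%
  group_by(species) %>%
  ungroup() %>%
  mutate(body_mass_min = min(body_mass_g, na.rm = TRUE))

grouped$body_mass_min
# 3000 3000 2000 2000
ungrouped$body_mass_min
# 2000 2000 2000 2000
```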

Athanasiamo/swc.tidyverse documentation built on Dec. 17, 2021, 9:48 a.m.
