In Athanasiamo/tidyquintro: Quick Intro to Tidyverse

library(tidyquintro)
library(learnr)
library(gradethis)

knitr::opts_chunk$set(echo = FALSE,
                 exercise.warn_invisible = FALSE)

# enable code checking
tutorial_options(exercise.checker = grade_learnr)

penguins_long <- penguins |> 
  pivot_longer(starts_with("bill"),
               names_to = c("part", "measure" , "unit"),
               names_sep = "_")

Pivot longer

Pivoting data into a longer format is a handy skill. Many packages for analysis or visualisation require data in a particular shape, so being able to pivot data into that shape is essential.
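As a minimal sketch of what "longer" means, consider a small made-up table. The `scores` data and its column names are hypothetical and not part of the tutorial data; the only assumption is that the tidyr package is installed.

```r
library(tidyr)

# Hypothetical toy data: one row per student, one column per test
scores <- tibble::tibble(
  student = c("a", "b"),
  test_1  = c(10, 12),
  test_2  = c(11, 15)
)

# Gather both test columns into a pair of columns:
# "name" holds the old column names, "value" holds the cell contents
scores_long <- scores |>
  pivot_longer(starts_with("test"))

scores_long
```

The two wide columns become four rows, one per student and test, which is the shape many plotting and modelling functions expect.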

Start by pivoting the penguins data so that all the bill measurements (starts with "bill") are in the same column.

penguins |> 
  pivot_longer(_)

penguins |> 
  pivot_longer(starts_with("bill"))

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Have you selected the columns using `starts_with()`?

Do not worry about column naming; just get the measurements into one column and the measurement names into another.

Altering names in one go

Alter the names in one go, so that there are three columns named "part", "measure" and "unit" after the pivot.

penguins |> 
  pivot_longer(_,
               names_to = c(_, _ , _),
               names_sep = _)

penguins |> 
  pivot_longer(starts_with("bill"),
               names_to = c("part", "measure" , "unit"),
               names_sep = "_")

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Have you selected the columns using `starts_with()`?

`names_to` takes one new column name for each piece of the original column names, and `names_sep` is the character that separates those pieces.
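To see how `names_to` and `names_sep` work together outside the penguins data, here is a small sketch on a made-up table. The `sizes` data and its columns are hypothetical; the column names are assumed to follow a two-part "part_measure" pattern.

```r
library(tidyr)

# Hypothetical toy data with two-part column names ("<part>_<measure>")
sizes <- tibble::tibble(
  id          = 1:2,
  wing_length = c(20, 22),
  wing_width  = c(5, 6)
)

# Split each old column name at the underscore:
# the first piece goes to "part", the second to "measure"
sizes_long <- sizes |>
  pivot_longer(starts_with("wing"),
               names_to  = c("part", "measure"),
               names_sep = "_")

sizes_long
```

The number of names given to `names_to` has to match the number of pieces that `names_sep` produces, which is why the penguins exercise needs three names.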

Removing rows where the value is `NA`

When pivoting, it is common for a number of NA values to appear in the values column. We can remove these rows during the pivot by setting the argument `values_drop_na` to TRUE.

penguins |> 
  pivot_longer(starts_with("bill"),
               names_to = c("part", "measure" , "unit"),
               names_sep = "_",
               values_drop_na = _)

penguins |> 
  pivot_longer(starts_with("bill"),
               names_to = c("part", "measure" , "unit"),
               names_sep = "_",
               values_drop_na = TRUE)

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

TRUE and FALSE are case sensitive and must be written in all capital letters.

TRUE and FALSE are built into R, so they are not quoted.
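The effect of `values_drop_na` is easiest to see by comparing row counts with and without it. This is a sketch on made-up data; the `readings` table is hypothetical.

```r
library(tidyr)

# Hypothetical toy data with one missing measurement
readings <- tibble::tibble(
  id = 1:2,
  am = c(1.5, NA),
  pm = c(2.0, 2.5)
)

# Default behaviour: the NA is kept as its own row
with_na <- readings |>
  pivot_longer(c(am, pm))

# values_drop_na = TRUE: rows whose value is NA are removed
without_na <- readings |>
  pivot_longer(c(am, pm), values_drop_na = TRUE)

nrow(with_na)     # 4
nrow(without_na)  # 3
```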

Pivoting experimentation

In this exercise, there is no specific task. Try different variations on pivoting: select different columns, alter the naming of the columns, and so on. See what results or errors you get.

penguins |> 
  pivot_longer()

Pivot wider

Sometimes we receive data that are in too long a shape, or we have done some operations on long data and want to make them wider and tidier again. Some analyses also require wide data, so being able to get data into a wide shape is essential.
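As a minimal sketch of the reverse operation, the made-up table below is spread from long to wide. The `long` data and its column names are hypothetical and separate from the penguins data used in the exercises.

```r
library(tidyr)

# Hypothetical long data: one row per id and measure
long <- tibble::tibble(
  id      = c(1, 1, 2, 2),
  measure = c("length", "depth", "length", "depth"),
  value   = c(39.1, 18.7, 39.5, 17.4)
)

# Each distinct entry in "measure" becomes its own column,
# filled with the matching entries from "value"
wide <- long |>
  pivot_wider(names_from = measure, values_from = value)

wide
```

The result has one row per `id` and one column per measure, with the new columns ordered by first appearance in the long data.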

For these exercises, you have access to a long-format penguins dataset, created as follows:

penguins_long <- penguins |> 
  pivot_longer(starts_with("bill"),
               names_to = c("part", "measure" , "unit"),
               names_sep = "_")

Turn the penguins_long dataset back to its original wide shape.

penguins_long |> 
  pivot_wider(names_from = c(_, _, _), # pivot these columns
              values_from = _, # take the values from here
              names_sep = _) # separate names_from with this character

penguins_long |> 
  pivot_wider(names_from = c("part", "measure", "unit"), # pivot these columns
              values_from = "value", # take the values from here
              names_sep = "_") # separate names_from with this character

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Make sure you spell all column names correctly

Make sure you set the correct separator

Filling missing values

When pivoting from long to wide, it is likely that some cells will have no corresponding value in the data. By default, these are filled with NA ("not available") values. You can choose a different fill value with the `values_fill` argument, though it is recommended to keep the default.

But let us try setting the values to something outrageous, like 10000!

penguins_long |> 
  pivot_wider(names_from = c(_, _, _), # pivot these columns
              values_from = _, # take the values from here
              names_sep = _, # separate names_from with this character
              values_fill = _) # fill empty cells with this value

penguins_long |> 
  pivot_wider(names_from = c("part", "measure", "unit"), # pivot these columns
              values_from = "value", # take the values from here
              names_sep = "_", # separate names_from with this character
              values_fill = 10000)

grade_code(
  correct = random_praise(),
  incorrect = random_encouragement()
)

Make sure you add the correct number to the argument!
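The difference between the default fill and a custom one shows up when a combination is absent from the long data. This is a sketch on a made-up table; the `long` data is hypothetical and the fill value 0 is chosen only for illustration.

```r
library(tidyr)

# Hypothetical long data: id 2 has no "depth" row at all
long <- tibble::tibble(
  id      = c(1, 1, 2),
  measure = c("length", "depth", "length"),
  value   = c(39.1, 18.7, 39.5)
)

# Default: the missing combination becomes NA
default_fill <- long |>
  pivot_wider(names_from = measure, values_from = value)

# values_fill: the missing combination gets the supplied value instead
custom_fill <- long |>
  pivot_wider(names_from = measure, values_from = value, values_fill = 0)

default_fill$depth  # second element is NA
custom_fill$depth   # second element is 0
```

Note that `values_fill` applies to combinations that are missing from the long data, so a made-up number can easily be mistaken for a real measurement later on.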

Pivoting experimentation

penguins_long |> 
  pivot_wider()

Athanasiamo/tidyquintro documentation built on Oct. 11, 2022, 7:15 p.m.
