In jr-packages/jrTidyverse2: Jumping Rivers: Next Steps in the Tidyverse

First we must load the tidyverse:

library("tidyverse")

Question 1 - Cocktails!

We've put together a small list containing the ingredients of some classic cocktails:

data(cocktails, package = "jrTidyverse2")

a) How many cocktails are in the list?

str(cocktails)

b) Create a tibble called drinks, where one column contains the name of the cocktail and the other column contains the vector of ingredients.

drinks = tibble(cocktail = names(cocktails), ingredients = cocktails)

c) Create a new column that contains the number of ingredients in each cocktail using mutate() and purrr.

# map_int() returns an integer vector; map() would give a list column
drinks = drinks %>%
  mutate(total_ingredients = map_int(ingredients, length))
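The choice of purrr function decides the type of the new column. As a minimal sketch on a made-up list (toy is hypothetical and is not the course data), map() always returns a list, whereas map_int() returns an integer vector:

```r
library(purrr)

# Hypothetical toy list, standing in for the cocktails data
toy = list(mojito = c("rum", "lime", "mint"),
           martini = c("gin", "vermouth"))

as_list = map(toy, length)      # a list of length-one integer vectors
as_int = map_int(toy, length)   # a named integer vector

as_int
```

Inside mutate(), the map() version would produce a list column, which is awkward to sort or filter on.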

d) We're off out! Tonight we're particularly thirsty for a cocktail with rum in it. Filter drinks so that it only contains cocktails with rum in them.

drinks %>%
  mutate(contains_rum = map_lgl(ingredients, ~ any(.x == "rum"))) %>%
  filter(contains_rum)

Question 2 - Beer!

So, we're at the pub with 8 mates and it's your round. In total you've been tasked with ordering 4 ales, 3 IPAs and 1 stout, plus an ale for yourself! We can load a data set of all of the ales, IPAs and stouts that the pub sells from the course package:

data(beer_tidy, package = "jrTidyverse2")

We're going to randomly select each person's drink using purrr. If people had asked for an equal number of ales, IPAs and stouts, we could have done this without purrr like so:

beer_tidy %>% 
  group_by(Type) %>% 
  sample_n(3)

a) Nest the data according to the drink Type and save it as pub.

pub = beer_tidy %>%
  nest(data = -Type)

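b) Create a column called n that contains the total number of each drink Type you need to order.

# 5 ales (4 plus your own), 3 IPAs and 1 stout.
# This assumes the rows of pub are in that order, so check pub$Type first
pub = pub %>%
  mutate(n = c(5, 3, 1))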

c) Create a new column called order that contains the randomly sampled drinks you are going to order. You should use map2() to map in parallel over the columns data and n, and sample_n() to perform the sampling.

pub = pub %>%
  mutate(order = map2(data, n, sample_n))
# or
# mutate(order = map2(.x = data, .y = n, ~ sample_n(.x, .y)))
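To see what the parallel map is doing, here is a small sketch on made-up data (the toy tibble and its values are hypothetical, not the course data). Each row of the nested tibble holds one group's data and its own sample size, and map2() passes each pair to sample_n():

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical toy data: 4 drinks of type "a" and 3 of type "b"
toy = tibble(Type = rep(c("a", "b"), times = c(4, 3)),
             Name = letters[1:7])

nested = toy %>%
  nest(data = -Type) %>%                     # one row per Type
  mutate(n = c(2, 1),                        # sample 2 of "a" and 1 of "b"
         order = map2(data, n, sample_n))    # sample_n(data[[i]], n[i])

# Number of rows sampled for each Type
map_int(nested$order, nrow)
```

The sampled rows differ from run to run, but the number of rows drawn for each Type always matches n.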

d) To see the drinks, select only the Type and order columns, then unnest().

pub %>%
  select(Type, order) %>%
  unnest(cols = order)

Question 3 - Happiness

You may remember that the happiness data we used for practical 1 was recorded over 3 years: 2015, 2016 and 2017. For this question I've turned the happiness list into 3 tibbles, each one representing a year. Running the following code will copy each tibble into your current working directory as a .csv file:

library("jrTidyverse2")
get_happiness()

a) Using a combination of purrr and the unnest() function from tidyr, read in and combine the 3 data sets. Don't delete the column containing the file name!

fnames = list.files("happiness", recursive = TRUE, full.names = TRUE)
happiness = tibble(fname = fnames) %>%
  mutate(data = map(fname, read_csv)) %>%
  unnest(cols = data)

# Tidy up the copied files once the data has been read in
file.remove(fnames)
file.remove("happiness/")

b) The data within the csv files doesn't contain the year. Fortunately the file name does! Use the column containing the file name to create a column called Year. Have a look at the str_remove() and parse_number() functions from stringr and readr respectively.

happiness = happiness %>%
  mutate(Year = parse_number(fname)) %>%
  select(-fname)
# or, assuming the file names are just the year followed by .csv
# happiness = happiness %>%
#   mutate(Year = str_remove(basename(fname), "\\.csv$"),
#          Year = as.numeric(Year)) %>%
#   select(-fname)
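As a quick check of the two approaches on a single value, the sketch below uses a hypothetical path, "happiness/2015.csv", since the real file names are not shown here. parse_number() pulls the first number out of the string, while the str_remove() route needs the directory and extension stripped before converting:

```r
library(readr)
library(stringr)

# Hypothetical file path; the real names come from list.files()
fname = "happiness/2015.csv"

# Extract the first number found in the string
year_a = parse_number(fname)

# Drop the directory and the extension, then convert to a number
year_b = as.numeric(str_remove(basename(fname), "\\.csv$"))

c(year_a, year_b)
```

Note that without basename() the second approach would try to convert "happiness/2015" to a number and return NA.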

c) Pick 3 countries and plot their happiness rank over time.

happiness %>%
  filter(Country %in% c("Denmark", "United Kingdom", "United States")) %>%
  ggplot(aes(x = Year, y = `Happiness Rank`, colour = Country)) +
  geom_line() +
  geom_point()

d) Every country in the data set has requested that their data be sent to them individually. Make use of purrr's parallel mapping functions and nest() to write the data to .csv files.

# Name the list column data so that happiness_nest$data exists
happiness_nest = happiness %>%
  nest(data = -Country)
# walk2() works like map2() but is called for its side effect (writing files)
walk2(happiness_nest$Country, happiness_nest$data,
      ~ write_csv(.y, paste0(.x, ".csv")))

jr-packages/jrTidyverse2 documentation built on Sept. 24, 2020, 4:36 p.m.
