Hierarchical data
In r4ds.tutorials: Tutorials for "R for Data Science"

library(learnr)
library(tutorial.helpers)
library(tidyverse)
library(tidyverse)
library(repurrrsive)
library(jsonlite)
knitr::opts_chunk$set(echo = FALSE)
knitr::opts_chunk$set(out.width = '90%')
options(tutorial.exercise.timelimit = 60, 
        tutorial.storage = "local") 
x1 <- list(1:4, "a", TRUE)
x2 <- list(a = 1:2, b = 1:3, c = 1:4)
x5 <- list(1, list(2, list(3, list(4, list(5)))))
df <- tibble(
  x = 1:2, 
  y = c("a", "b"),
  z = list(list(1, 2), list(3, 4, 5))
)
df1 <- tribble(
  ~x, ~y,
  1, list(a = 11, b = 12),
  2, list(a = 21, b = 22),
  3, list(a = 31, b = 32)
)
df2 <- tribble(
  ~x, ~y,
  1, list(11, 12, 13),
  2, list(21),
  3, list(31, 32)
)
df6 <- tribble(
  ~x, ~y,
  "a", list(1, 2),
  "b", list(3),
  "c", list()
)

gh_users2 <- read_json(gh_users_json())

json <- '[
  {"name": "John", "age": 34},
  {"name": "Susan", "age": 27}
]'
json1 <- '{
  "status": "OK", 
  "results": [
    {"name": "John", "age": 34},
    {"name": "Susan", "age": 27}
 ]
}
'
df3 <- tibble(json = parse_json(json))
df4 <- tibble(json1 = list(parse_json(json1)))
repos <- tibble(json = gh_repos)
chars <- tibble(json = got_chars)
locations <- gmaps_cities |> 
  unnest_wider(json) |> 
  select(-status) |> 
  unnest_longer(results) |> 
  unnest_wider(results)

Introduction

This tutorial covers Chapter 23: Hierarchical data from R for Data Science (2e) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. You will learn how to work with non-rectanglar data using packages like jsonlite.

Lists

So far you’ve worked with data frames that contain simple vectors like integers, numbers, characters, date-times, and factors. These vectors are simple because they’re homogeneous: every element is of the same data type.If you want to store elements of different types in the same vector, you’ll need a list, which you create with list().

Exercise 1

To create our first list, let's create a variable x1 and assign the list() function to it. We will pass in three different data types within list(): the range of integers 1:4 to generate four integers, the string "a", and the Boolean value TRUE. On a new line, we'll call the variable x1.

x1 <- list(...,"a",...)
x1

When you run x1, you will get all the data types that were passed in, along with their corresponding indices. If you want to access a specific data type by its index, such as finding the element in the first index of x1, you can use the square brackets notation: x1[1]. This will only return the integers.

Exercise 2

It's indeed convenient to name the components, or children, of a list. To accomplish this, let's create a new variable x2 and assign the list() function to it. Within list(), we'll define three columns named a, b, and c. We'll assign 1:2 to a, 1:3 to b, and 1:4 to c. On a new line call x2.

... <- list(a = ..., b = ..., ... = 1:4)
x2

When you run x2, you will see that each column contains different values. To access individual columns of the list, you can use the $ operator since we have named the columns. For example, to access the column a from x2, you can use x2$a.

Exercise 3

Even for these very simple lists, printing takes up quite a lot of space. A useful alternative is str(), which generates a compact display of the structure, de-emphasizing the contents.

Run str() and pass in x1 and on a new line run str() again and pass in x2.

str(...)
str(...)

As you can see, str() displays each child of the list on its own line. It displays the name, if present, then an abbreviation of the type, then the first few values.

Exercise 4

Lists can contain any type of object, including other lists. This makes them suitable for representing hierarchical (tree-like) structures.

To create a representation of hierarchical structures, let's create a variable x3 and assign list() to it. Within list(), we will pass in list(1,2) and list(3,4). This means we will have two lists nested inside a single list. Then on a new line run str() and pass in x3.

... <- list(list(...), ...(3, 4))
str(...)

This is notably different to c(), which generates a flat vector like when we run c(c(1, 2), c(3, 4)), we get #> [1] 1 2 3 4 which is flat.

Exercise 5

Now what if we pass in two list in a vector, will we get a hierarchical vector or a flat vector?

To find out, create a new variable x4 and assign it to vector function c(). Within c() pass in list(1,2) and pass in list(3,4). On a new line, run str() and pass in x4.

... <- c(list(...), ...(3, ...))
str(...)

Even when we pass in lists, we get no hierarchical data and just get the numbers in a row-like order.

Exercise 6

As lists get more complex, str() gets more useful, as it lets you see the hierarchy at a glance.

Run x5 which is very complex list.

x5

When we ran x5 we get some many duplicate numbers in brackets and it's all confusing and complex. This is where str() comes to shine.

Exercise 7

Now run str(x5).

str(x5)

Now this dosen't mean in you should always use str() to visualize because as lists get even larger and more complex, str() eventually starts to fail, and you’ll need to switch to View().

Exercise 8

The image below show the result of calling view(x5).

knitr::include_graphics("images/View-1-min.png")

The RStudio view() lets you interactively explore a complex list. The viewer opens showing only the top level of the list.

Exercise 9

Clicking on the rightward facing triangle expands that component of the list so that you can also see its children like the image below.

knitr::include_graphics("images/View-2-min.png")

You can repeat this operation as many times as needed to get to the data you’re interested in. Note the bottom-left corner: if you click an element of the list, RStudio will give you the sub-setting code needed to access it, in this case x5[[2]][[2]][[2]]

Exercise 10

Like said previously, the image below is what it looks like if we repeat the operation.

knitr::include_graphics("images/View-3-min.png")

Now that's we have knowledge of hierarchical data, let's move on to list-columns.

Exercise 11

Lists can also live inside a tibble, where we call them list-columns. List-columns are useful because they allow you to place objects in a tibble that wouldn’t usually belong in there.

To demonstrate a simple example of list-column, create a variable df and assign tibble() to it. Within df we will have three columns: x, y, z. Pass in 1:2 for x, c("a","b") for y, and list(list(1,2),list(3,4,5)) for z. On a new line run df.

... <- tibble(
  ... = 1:2, 
  y = c("a", "b"),
 ... = list(list(1,...), ...(3, ..., 5))
)
df

In particular, list-columns are used a lot in the tidymodels ecosystem, because they allow you to store things like model outputs or resamples in a data frame.

Exercise 12

There’s nothing special about lists in a tibble; they behave like any other column. To show this, start a pipe with df to filter() and we want all values where x == 1 only.

df |> 
  filter(x == 1)

Computing with list-columns is harder, but that’s because computing with lists is harder in general.

Good work on finishing this section, next up, we'll be focusing on unnesting list-columns into regular variables.

Unnesting

Now that you’ve learned the basics of lists and list-columns, let’s explore how you can turn them back into regular rows and columns. Here we’ll use very simple sample data so you can get the basic idea; in the next section we’ll switch to real data.

Exercise 1

To generate simple sample data for this section, let's create a variable named df1 and assign tribble() to it. Within tribble(), we will define the columns ~x and ~y. Now, to arrange the data in a specific order, follow these steps:

Set the value as 1 for ~x and the list as list(a = 11, b = 12) for ~y. Repeat the above steps as needed for the rest. Set the value as 2 and the list as list(a = 21, b = 22). Set the value as 3 and the list as list(a = 31, b = 32).

... <- tribble(
  ~x, ~y,
  1, ...,
  ..., list(a = 21,...),
  3, list(..., b = 32),
)

List-columns tend to come in two basic forms: named and unnamed. When the children are named, they tend to have the same names in every row. For example, in df1, every element of list-column y has two elements named a and b. Named list-columns naturally unnest into columns: each named element becomes a new named column.

Exercise 2

Let's now create our second simple dataset. Copy the previous code and change the variable name to df2 and the other thing we will change is ~y. Remove all the old ones and insert the following values: list(11, 12, 13), list(21), and list(31, 32).

df2 <- tribble(
  ~x, ...,
  ..., list(..., 12, 13),
  2, list(...),
  ..., list(31, ...),
)

When the children are unnamed, the number of elements tends to vary from row-to-row. For example, in df2, the elements of list-column y are unnamed and vary in length from one to three. Now tidyr provides two functions for these two cases: unnest_wider() and unnest_longer(), which will help us explore the two sample data sets which we created.

Exercise 3

Let's explore what unnest_wider() does, so start a pipe with df1 to unnest_wider() and pass in the column y as the argument.

df1 |> 
  unnest_wider(...)

When each row has the same number of elements with the same names, like df1, it’s natural to put each component into its own column with unnest_wider().

Exercise 4

By default, the names of the new columns come exclusively from the names of the list elements, but you can use the names_sep argument to request that they combine the column name and the element name. Copy the previous code and within unnest_wider() add names_sep and set it equal to "_"

... |> 
  unnest_wider(y, ... = "_")

Compared to just passing in y in unnest_wider(), we can now see the column name (y) along with the element name(a or b). Now we will move into unnest_longer() which is more useful for data sample with different amount of data for record.

Exercise 5

Start a pipe with df2 to unnest_longer() and pass in y as the argument.

... |> 
  unnest_longer(y)

When each row contains an unnamed list, it’s most natural to put each element into its own row with unnest_longer() which is what happens in the code above. Also note how x is duplicated for each element inside of y: we get one row of output for each element inside the list-column.

Exercise 6

But what happens if one of the elements is empty? To see what happens, let's first create another sample of data, so create a new variable df6 and assign it to tribble(). Within tribble(), we will define the columns ~x and ~y. Now, to arrange the data in a specific order, follow these steps:

Set the value as "a" for ~x and the list as list(1, 2) for ~y. Repeat the above steps as needed for the rest. Set the value as "b" and the list as list(3). Set the value as "c" and the list as list().

... <- tribble(
  ~x, ~y,
  "...", list(..., 2),
  "b", list(3),
  ..., ...
)

As you can see we have a empty list for record "c", next up we use unnest_longer() to test it out.

Exercise 7

Start a pipe with df6 to unnest_longer() and pass in the y column as the argument for unnest_longer().

... |> 
  unnest_longer(y)

We get zero rows in the output for "c", so the row effectively disappears. If you want to preserve that row, adding NA in y, set keep_empty = TRUE.

Good work finishing unnesting section.

Case Studies

The main difference between the simple examples we used above and real data is that real data typically contains multiple levels of nesting that require multiple calls to unnest_longer() and/or unnest_wider(). To show that in action, this section works through three real rectangling challenges using datasets from the repurrrsive package.

Exercise 1

We’ll start with gh_repos. This is a list that contains data about a collection of GitHub repositories retrieved using the GitHub API. It’s a very deeply nested list so it’s difficult to show the structure in this book; we recommend exploring a little on your own with View(gh_repos) before we continue.

gh_repos is a list, but our tools work with list-columns, so we’ll begin by putting it into a tibble. We'll call this column json for reasons we’ll get to later. To do so, create a variable named repos and assign it to tibble(). Set the column name as json and assign gh_repos to json. On a new line call repos.

repos <- tibble(...= ...)
reposS

This tibble contains 6 rows, one row for each child of gh_repos. Each row contains a unnamed list with either 26 or 30 rows.

Exercise 2

Since the lists are unnamed, we’ll start with unnest_longer() to put each child in its own row: Start a pipe with repos to unnest_longer() and pass in json as the argument.

repos |>
  ...(json)

At first glance, it might seem like we haven’t improved the situation: while we have more rows (176 instead of 6) each element of json is still a list.

Exercise 3

There’s an important difference in the result of the code: now each element is a named list so we can use unnest_wider() to put each element into its own column: Continue the pipe from unnest_longer() in previous exercise to unnest_wider() and pass in json as the arguement.

... |>
  ...(json)|>
  unnest_wider(...)

This has worked but the result is a little overwhelming: there are so many columns that tibble doesn’t even print all of them!

Exercise 4

To print all the column names, continue the pipe from unnest_wider() to names(). To obtain only the first 10 names, further extend the pipe from names() to head() and provide the argument 10.

... |>
  ...()|>
  head(...)

We have so many columns but let’s pull out a few that look interesting.

Exercise 5

Delete the names() and head() functions and instead extend the pipe to select() where we will pass in : id, full_name, owner, description which are the columns we will explore.

... |> 
  unnest_longer(...) |> 
  ...(json) |> 
  select(..., full_name, ..., ...)

You can use this to work back to understand how gh_repos was structured: each child was a GitHub user containing a list of up to 30 GitHub repositories that they created.

Exercise 6

owner is another list-column, and since it contains a named list, we can use unnest_wider() to get at the values: Extend pipe from select() to unnest_wider() and pass in owner. Just as a reminder, it is possible to get errors so just move on.

... |> 
  unnest_longer(...) |> 
  ...(json) |> 
  select(..., full_name, ..., ...) |>
  unnest_wider(...)

Uh oh, this list column also contains an id column and we can’t have two id columns in the same data frame.

Exercise 7

As suggested by Rstudio, let's use names_sep to resolve the problem, so within unnest_wider() add names_sep = "_".

... |> 
  unnest_longer(...) |> 
  ...(json) |> 
  select(..., full_name, ..., ...) |>
  unnest_wider(..., ... = "_")

This gives another wide dataset, but you can get the sense that owner appears to contain a lot of additional data about the person who “owns” the repository. Next up, we will move onto relational data and unnesting it.

Exercise 8

Nested data is sometimes used to represent data that we’d usually spread across multiple data frames. For example, take got_chars which contains data about characters that appear in the Game of Thrones books and TV series. Like gh_repos it’s a list, so we start by turning it into a list-column of a tibble:

Create variable chars and assign it to tibble(). Within tibble(), set the column name to json and set json equal to got_chars. On a new line call chars.

... <- ...(json = ...)
chars

The json column contains named elements, so we’ll start by widening it.

Exercise 9

Start a pipe with chars to unnest_wider() and pass in json.

... |> 
  unnest_wider(...)

We have 18 columns and not all of them are releveant so let's only select few coloumns next up.

Exercise 10

Extend the pipe from unnest_wider() to select() and pass in the following columns to select: id, name, gender, culture, born, died, alive.

... |> 
  ..._wider(json) |> 
  select(..., name, ..., culture, born, ..., alive)

This dataset also contains many list-columns, so let's now select columns where there are records which include lists.

Exercise 11

Modify select() to select id as well as any list-columns with where(is.list).

... |> 
  ..._wider(json) |> 
  select(id, where(...))

Let’s explore the titles column. It’s an unnamed list-column, so we’ll unnest it into rows.

Exercise 12

Modify select() to only select id and titles. Then extend the pipe to unnest_longer() and pass in titles to unnest the titles.

... |> 
  ..._wider(json) |> 
  select(id, ...)|>
  unnest_...(titles)

You might expect to see this data in its own table because it would be easy to join to the characters data as needed. Let’s do that, which requires little cleaning.

Exercise 13

Let's make the data to be in its own table so assign the whole code we wrote to titles using <-. In addition, let's clean the data so extend the pipe to filter() and pass in titles != "" which filters out all the missing values and then extend the pipe the to rename() and pass in title = titles. On a new line call the whole table using titles.

... <- chars |> 
  unnest_wider(...) |> 
  select(..., titles) |> 
  ...(titles) |> 
  filter(titles != ...) |> 
  ...(title = ...)
titles

You could imagine creating a table like this for each of the list-columns, then using joins to combine them with the character data as you need it.

Good work unnesting relational data. Next up we will unnest deeply nested data.

Exercise 14

We’ll finish off these case studies with a list-column that’s very deeply nested and requires repeated rounds of unnest_wider() and unnest_longer() to unravel: gmaps_cities. This is a two column tibble containing five city names and the results of using Google’s geocoding API to determine their location:

Run gmaps_cities.

gmaps_cities

json is a list-column with internal names, so we start exploring json with unnest_wider().

Exercise 15

Start a pipe with gmaps_cities to unnest_wider() and pass in json.

gmaps_cities |>
  unnest_wider(...)

This gives us the status and the results. We’ll drop the status column since they’re all OK; in a real analysis, you’d also want to capture all the rows where status != "OK" and figure out what went wrong. results is an unnamed list, with either one or two elements (we’ll see why shortly).

Exercise 16

Extend the pipe to select() and pass in -status to remove the status column and then extend the pipe from select() to unnest_longer(). Within unnest_longer() pass in results.

... |> 
  ...(json) |> 
  select(-...) |> 
  unnest_longer(...)

Now results is a named list, so we’ll use unnest_wider().

Exercise 17

Extend the pipe to unnest_wider() and pass in results, assign the whole code to locations using <- and call locations on a new line.

locations <- ... |> 
  ...(json) |> 
  select(-...) |> 
  unnest_longer(...) |>
  unnest_wider(json)
locations

Now we can see why two cities got two results: Washington matched both Washington state and Washington, DC, and Arlington matched Arlington, Virginia and Arlington, Texas.

Exercise 18

There are a few different places we could go from here. We might want to determine the exact location of the match, which is stored in the geometry list-column. Start a new pipe with locations to select() and we will select the following: city, formatted_address, and geometry. Then we will extend the pipe to unnest_wider() and pass in geometry to extend the list.

... |> 
  ...(city, formatted_address, geometry) |> 
  unnest_wider(...)

After unnesting geometry it gives us new bounds (a rectangular region) and location (a point). We can unnest location to see the latitude (lat) and longitude (lng).

Exercise 19

Continue the pipe and extend it from unnest_wider() to unnest_wider() again and pass in location to unnest location.

... |> 
  ...(city, formatted_address, geometry) |> 
  unnest_wider(...)|>
  unnest_...(...)

Now we got the lat and lng of the location. Next we will extract the bounds.

Exercise 20

To unnest the bounds, we first need to select the columns which we need, so after unnest_wider(geometry), delete the unnest_wider(location) and instead continue the pipe with select() and pass in !location:viewport which excludes the columns from location to viewport. Then add the unnest_wider() with bounds as the arguement.

... |> 
  ...(city, formatted_address, geometry) |> 
  unnest_wider(...)|>
  select(!...:...)|>
  unnest_...(...)

Now we get two columns called northeast and southwest.

Exercise 21

We then rename southwest and northeast (the corners of the rectangle) so we can use names_sep to create short but evocative names when unnesting them.

Extend the pipe from unnest_wider(bounds) to rename()and pass in ne = northeast and sw = southwest. Then extend the pipe again to unnest_wider() and pass in a vector c(ne, sw) and also pass in names_sep = "_".

Note how we unnest two columns simultaneously by supplying a vector of variable names to unnest_wider().

Exercise 22

Now that you’ve discovered the path to get to the components you’re interested in, you can extract them directly using another tidyr function, hoist():

Start a new pipe with locations to select() and select the following: city, formatted_address and geometry, then continue the pipe from select() to hoist() and pass in the following: geometry, ne_lat = c("bounds", "northeast", "lat"), sw_lat = c("bounds", "southwest", "lat"), ne_lng = c("bounds", "northeast", "lng"), sw_lng = c("bounds", "southwest", "lng").

... |> 
  select(city, ..., geometry) |> 
  ...(
    geometry,
    ne_lat = c("bounds", "northeast", "lat"),
    sw_lat = c("bounds", "southwest", "..."),
    ne_lng = c("bounds", "northeast", "lng"),
    sw_lng = c("bounds", "southwest", "lng"),
  )

If these case studies have whetted your appetite for more real-life rectangling, you can see a few more examples in vignette("rectangling", package = "tidyr").

JSON

All of the case studies in the previous section were sourced from wild-caught JSON. JSON is short for javascript object notation and is the way that most web APIs return data. It’s important to understand it because while JSON and R’s data types are pretty similar, there isn’t a perfect 1-to-1 mapping, so it’s good to understand a bit about JSON if things go wrong.

Exercise 1

To convert JSON into R data structures, we recommend the jsonlite package, by Jeroen Ooms. We’ll use only two jsonlite functions: read_json() and parse_json(). The repurrsive package also provides the source for gh_user as a JSON file and you can read it with read_json().

Run gh_users_json() which gives you the path to the json file.

gh_users_json()

Next up we will read the json file.

Exercise 2

Create a new variable gh_user2 and assign it to the read_json() function and pass in the gh_users_json() function as argument.

gh_users2 <- read_json(...)

JSON is a simple format designed to be easily read and written by machines, not humans. It has six key data types. Four of them are scalars:

Null Type: plays the same role as NA in R.
String: much like a string in R, but must always use double quotes.
Number: similar to R’s numbers: they can use integer (e.g., 123), decimal (e.g., 123.45), or scientific (e.g., 1.23e3) notation. JSON doesn’t support Inf, -Inf, or NaN
Boolean: similar to R’s TRUE and FALSE, but uses lowercase true and false.

Exercise 3

Run identical() and pass in gh_users and gh_users2.

identical(gh_users, gh_users2)

In this course, we’ll also use parse_json(), since it takes a string containing JSON, which makes it good for generating simple examples.

Exercise 4

To get started, let's parse a string which contains the number 1. So run str() since we want a string output, then pass in parse_json() and within that pass in "1".

str(..._...("1"))

What will the output look like if we have multiple numbers in an array?

Exercise 5

Copy the previous code, and change the argument within parse_json() to a string which contains the array [1,2,3].

str(..._join("[...,...,...]"))

The way we get the output is like a list. Now what happens if we mapped a variable to the array?

Exercise 6

Copy the previous code, and within parse_json() change the arguement to a string which looks like {"x": [1,2,3]}.

...(parse_json('{"...": [1, 2, 3]}'))

JSON’s strings, numbers, and booleans are pretty similar to R’s character, numeric, and logical vectors. The main difference is that JSON’s scalars can only represent a single value. To represent multiple values you need to use one of the two remaining types: arrays and objects.

Exercise 7

In most cases, JSON files contain a single top-level array, because they’re designed to provide data about multiple “things”, e.g., multiple pages, or multiple records, or multiple results. In this case, you’ll start your rectangling with tibble(json) so that each element becomes a row.

Run json to have a quick look at the data before we dive in to parsing it and unnesting it.

json

jsonlite has another important function called fromJSON(). We don’t use it here because it performs automatic simplification (simplifyVector = TRUE). This often works well, particularly in simple cases, but we think you’re better off doing the rectangling yourself so you know exactly what’s happening and can more easily handle the most complicated nested structures.

Exercise 8

Now, let's transform the JSON to a list and store it within a tibble. Create a variable called df3 by using tibble(). Within the call to tibble(), we will have a json column where json is equal to parse_json(json).

df3 <- tibble(... = parse_json(...))

Yes, we have the data in tibble but we want to see the actual values instead of just seeing the amount of data in list being shown.

Exercise 9

To make the names and age visible, we will start a pipe with df3 to unnest_wider() and pass in json as the argument.

df3 |> 
  unnest_...(json)

In rarer cases, the JSON file consists of a single top-level JSON object, but what do we do to explore the data if we have many top level objects?

Exercise 10

Run json1 to explore the dataset.

json1

Compared to the json file we explored previously, we have two levels: status and results.

Exercise 11

Let's now parse the json1 and put it in a tibble. Create a variable df4 and assign it to tibble(). Within tibble(), set the column json1 equal to list(parse_json(json1)). On a new line call df4.

df4 <- ...(json ... list(parse_json(...)))

Both arrays and objects are similar to lists in R; the difference is whether or not they’re named. An array is like an unnamed list, and is written with []. For example [1, 2, 3] is an array containing 3 numbers, and [null, 1, "string", false] is an array that contains a null, a number, a string, and a boolean.

Exercise 12

Start a pipe with df4 to unnest_wider() and pass in json1 as the argument which will unnest the top level objects status and results.

... |>
  unnest_wider(...)

An object is like a named list, and is written with {}. The names (keys in JSON terminology) are strings, so must be surrounded by quotes. For example, {"x": 1, "y": 2} is an object that maps x to 1 and y to 2.

Exercise 13

Copy the previous code and continue the pipe from unnest_wider() to unnest_longer(). Within unnest_longer() pass in results which will unnest the columns name and age in results.

... |>
  unnest_wider(...) |>
  unnest_...(...)

If you want to explore more about jsonlite check out their website.

Exercise 14

Copy the previous code and continue the pipe from unnest_longer() to unnest_wider() and within it pass in results as the argument which will reveal the value within results.

... |>
  unnest_wider(...) |>
  unnest_...(...)|>
  ..._wider(...)

Alternatively, you can reach inside the parsed JSON and start with the bit that you actually care about.

Exercise 15

To do it faster, all we had to is the following:

df4 <- tibble(results = parse_json(json1)$results)
df4 |> 
  unnest_wider(results)

Good work on finishing the tutorial!

Summary

This tutorial covered Chapter 23: Hierarchical data from R for Data Science (2e) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. You learned how to work with non-rectanglar data using packages like jsonlite.

Any scripts or data that you put into this service are public.

r4ds.tutorials documentation built on April 3, 2025, 5:50 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

r4ds.tutorials Tutorials for "R for Data Science"

Hierarchical data In r4ds.tutorials: Tutorials for "R for Data Science"

Introduction

Lists

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Exercise 6

Exercise 7

Exercise 8

Exercise 9

Exercise 10

Exercise 11

Exercise 12

Unnesting

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Exercise 6

Exercise 7

Case Studies

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Exercise 6

Exercise 7

Exercise 8

Exercise 9

Exercise 10

Exercise 11

Exercise 12

Exercise 13

Exercise 14

Exercise 15

Exercise 16

Exercise 17

Exercise 18

Exercise 19

Exercise 20

Exercise 21

Exercise 22

JSON

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Exercise 6

Exercise 7

Exercise 8

Exercise 9

Exercise 10

Exercise 11

Exercise 12

Exercise 13

Exercise 14

Exercise 15

Summary

Try the r4ds.tutorials package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

r4ds.tutorials
Tutorials for "R for Data Science"

Hierarchical data
In r4ds.tutorials: Tutorials for "R for Data Science"