"Nesting" a data frame describes the process of combining values to create a single list-column variable of data frames within a data frame. The tidyr
package makes this process easy with the nest() function. See an example here: http://r4ds.had.co.nz/many-models.html.
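As a minimal illustration of ordinary nesting with tidyr, the sketch below groups the built-in mtcars data by one column and nests the rest. It uses only dplyr and tidyr, nothing from nestyr, and the choice of mtcars and cyl is just for demonstration.

```r
library(dplyr)
library(tidyr)

# Group the built-in mtcars data by cylinder count, then nest the
# remaining columns into one data frame per group.
nested <- mtcars %>%
  group_by(cyl) %>%
  nest()

# `data` is a list-column: each element is a tibble with that group's rows.
is.list(nested$data)
nrow(nested)
```

Each row of `nested` now stands for one value of `cyl`, with that group's rows stored in the `data` column.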
Nesting in tidyr relies on grouping functions from the dplyr package to index the rows before nesting. However, as of September 2018, dplyr is still unable to group by list columns (see https://github.com/tidyverse/tidyr/issues/249). In other words, you cannot nest a data frame when a list column is among the columns that define the groups. This also means you cannot create multiple nests, because the first nest creates a list column.
The nestyr package gets around this problem with the nestyr::nest2() function. Instead of grouping by list columns directly, it computes an MD5 hash for each element of any list column and uses the hashes to find the group indices.
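The hashing idea can be sketched in base R. This is not nestyr's actual implementation: the helper `md5_of()` and the toy data frame are hypothetical, and the hash is computed by serializing each element to a temporary file so that no extra packages are needed.

```r
# Hypothetical helper (not part of nestyr): MD5 of an arbitrary R object,
# obtained by serializing it to a temporary file and hashing that file.
md5_of <- function(x) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(x, connection = NULL), tmp)
  unname(tools::md5sum(tmp))
}

# Toy data frame with a list column that cannot be grouped on directly.
df <- data.frame(id = 1:4)
df$payload <- list(1:3, letters[1:2], 1:3, letters[1:2])

# One hash per list element; identical elements share a hash.
keys <- vapply(df$payload, md5_of, character(1))

# Turn the hashes into integer group indices.
group_index <- match(keys, unique(keys))
group_index
```

Rows 1 and 3 hold identical elements, as do rows 2 and 4, so each pair ends up with the same group index. The indices can then drive the nesting in place of a direct `group_by()` on the list column.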
Additionally, the nestyr::nest_cols() function removes the need to include a row id when nesting columns together while preserving rows. In other words, you can nest horizontally without thinking about row grouping. Like nest2, nest_cols also works with pre-existing list columns.
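To make "nesting horizontally" concrete, here is a rough base R sketch of the idea: the selected columns of each row are packed into a one-row data frame, and the row count stays the same. It only illustrates the concept and is not how nest_cols is written; the toy data and variable names are made up for the example.

```r
# Toy data: three rows, two columns to pack together, one to leave alone.
df <- data.frame(
  country   = c("A", "B", "C"),
  continent = c("X", "Y", "X"),
  pop       = c(10, 20, 30),
  stringsAsFactors = FALSE
)
cols <- c("country", "continent")

# One single-row data frame per original row, holding only the chosen columns.
packed <- lapply(seq_len(nrow(df)), function(i) df[i, cols, drop = FALSE])

# Keep the other columns and attach the packed rows as a list column.
out <- df[setdiff(names(df), cols)]
out$country_data <- packed

str(out, max.level = 1)
```

No row id or grouping step is involved: the result has exactly as many rows as the input.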
Suppose we take the example provided by the "R for Data Science" book by Garrett Grolemund and Hadley Wickham (see http://r4ds.had.co.nz/many-models.html). In this instance, however, let's say we want to nest the columns country and continent together, and then nest the remaining columns by unique country-continent pair.
library(gapminder)
library(tidyverse)
library(nestyr)
gapminder %>%
  nest_cols(country, continent, .key = "country_data") %>%
  nest2(-country_data)
#> # A tibble: 142 x 2
#>    country_data     data
#>    <list>           <list>
#>  1 <tibble [1 x 2]> <tibble [12 x 4]>
#>  2 <tibble [1 x 2]> <tibble [12 x 4]>
#>  3 <tibble [1 x 2]> <tibble [12 x 4]>
#>  4 <tibble [1 x 2]> <tibble [12 x 4]>
#>  5 <tibble [1 x 2]> <tibble [12 x 4]>
#>  6 <tibble [1 x 2]> <tibble [12 x 4]>
#>  7 <tibble [1 x 2]> <tibble [12 x 4]>
#>  8 <tibble [1 x 2]> <tibble [12 x 4]>
#>  9 <tibble [1 x 2]> <tibble [12 x 4]>
#> 10 <tibble [1 x 2]> <tibble [12 x 4]>
#> # ... with 132 more rows
Use devtools to install from GitHub. This package is not yet available on CRAN.
library(devtools)
install_github("JakeNel28/nestyr")
The package relies on much of the source code and some documentation used in tidyr::nest(), which was developed by Hadley Wickham (hadley@rstudio.com), Lionel Henry (lionel@rstudio.com), and RStudio (https://www.rstudio.com/). I am very grateful to them for developing tidyr!