unnest: Unnest lists

unnest R Documentation

Unnest lists

Description

Unnest nested lists into flat data.frames.

Usage

unnest(
  x,
  spec = NULL,
  dedupe = FALSE,
  stack_atomic = NULL,
  process_atomic = NULL,
  process_unnamed_lists = NULL,
  cross_join = TRUE
)

Arguments

x

a nested list to unnest

spec

spec to use for unnesting. See spec().

dedupe

Whether to dedupe repeated elements. If TRUE, a node that is visited a second time and is not explicitly declared in the spec is skipped. This is particularly useful with grouped specs.

stack_atomic

Whether atomic leaf vectors should be stacked or not. If NULL, the default, data.frame vectors are stacked, all others are spread.

process_atomic

Process spec for atomic leaf vectors. One of NULL for no processing (the default), "as_is" to return the entire element in a list column, or "paste" to paste the elements together into a character column.

process_unnamed_lists

How to process unnamed lists. One of "as_is" (return a list column), "exclude" (drop these elements unless they are explicitly included in the spec), "paste" (return a character column) or "stack" (automatically stack). If NULL (the default), unnamed lists get no special treatment and are processed according to the spec.

cross_join

Specifies how the results from sibling nodes are joined (cbinded) together. With cross_join = FALSE, the shorter data.frames (those with fewer rows) are recycled to the maximum number of rows across all joined components. With cross_join = TRUE, the results are cross joined, producing all combinations of rows across all components. cross_join = TRUE is the default because it loses no data and makes errors from incorrect specs easier to detect early.
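
The difference between the two cross_join modes can be illustrated with a base R analogy. This is only a sketch of the two joining strategies, not unnest's internal implementation; the data.frames left and right are hypothetical stand-ins for the results of two sibling nodes, and the use of rep_len() for recycling is an assumption made for illustration.

```r
# Hypothetical results from two sibling nodes with different row counts.
left  <- data.frame(x = 1:2)
right <- data.frame(y = c("a", "b", "c", "d"))

# Analogue of cross_join = TRUE: every combination of rows (2 * 4 = 8 rows).
# merge() with by = NULL returns the Cartesian product.
crossed <- merge(left, right, by = NULL)

# Analogue of cross_join = FALSE: recycle the shorter component up to the
# maximum row count (4 rows), then bind the columns side by side.
n <- max(nrow(left), nrow(right))
recycled <- data.frame(x = rep_len(left$x, n), y = right$y)

print(nrow(crossed))
print(nrow(recycled))
```

In the recycled result each value of x is paired with only some values of y, which is why recycling can hide a mismatch that the cross join makes visible as extra rows.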

Value

A data.frame, data.table or a tibble as specified by the option unnest.return.type. Defaults to data.frame.
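
The return type is controlled through a regular R option, so it can be set and restored with base R's options(). The sketch below assumes the option takes the class name as a string ("data.frame", "data.table" or "tibble"); that value format is an assumption based on the description above, and the corresponding package must be installed for the non-default types to be usable.

```r
# Assumption: the option value is the class name given as a string.
old <- options(unnest.return.type = "data.table")

# Subsequent unnest() calls would read this option.
current <- getOption("unnest.return.type")
print(current)

# Restore the previous setting so the rest of the session is unaffected.
options(old)
```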

Examples


library(unnest)

x <- list(a = list(b = list(x = 1, y = 1:2, z = 10),
                   c = list(x = 2, y = 100:102)))
xxx <- list(x, x, x)

## spreading
unnest(x, s("a"))
unnest(x, s("a"), stack_atomic = TRUE)
unnest(x, s("a/b"), stack_atomic = TRUE)
unnest(x, s("a/c"), stack_atomic = TRUE)
unnest(x, s("a"), stack_atomic = TRUE, cross_join = TRUE)
unnest(x, s("a//x"))
unnest(x, s("a//x,z"))
unnest(x, s("a/2/x,y"))

## stacking
unnest(x, s("a/", stack = TRUE))
unnest(x, s("a/", stack = TRUE, as = "A"))
unnest(x, s("a/", stack = TRUE, as = "A"), stack_atomic = TRUE)
unnest(x, s("a/", stack = "id"), stack_atomic = TRUE)
unnest(x, s("a/", stack = "id", as = ""), stack_atomic = TRUE)

unnest(xxx, s(stack = "id"))
unnest(xxx, s(stack = "id"), stack_atomic = TRUE)
unnest(xxx, s(stack = "id", s("a/b/y/", stack = TRUE)))

## exclusion
unnest(x, s("a/b/", exclude = "x"))

## dedupe
unnest(x, s("a", s("b/y"), s("b")), stack_atomic = TRUE)
unnest(x, s("a", s("b/y"), s("b")), dedupe = TRUE, stack_atomic = TRUE)

## grouping
unnest(xxx, stack_atomic = TRUE,
       s(stack = TRUE,
         groups = list(first = s("a/b/x,y"),
                       second = s("a/b"))))

unnest(xxx, stack_atomic = TRUE, dedupe = TRUE,
       s(stack = TRUE,
         groups = list(first = s("a/b/x,y"),
                       second = s("a/b"))))

## processing as_is
str(unnest(xxx, s(stack = "id",
                  s("a/b/y", process = "as_is"),
                  s("a/c", process = "as_is"))))
str(unnest(xxx, s(stack = "id", s("a/b/", process = "as_is"))))
str(unnest(xxx, s(stack = "id", s("a/b", process = "as_is"))))

## processing paste
str(unnest(x, s("a/b/y", process = "paste")))
str(unnest(xxx, s(stack = TRUE, s("a/b/", process = "paste"))))
str(unnest(xxx, s(stack = TRUE, s("a/b", process = "paste"))))

## default
unnest(x, s("a/b/c/", s("b", default = 100)))
unnest(x, s("a/b/c/", stack = "ix", s("b", default = 100)))


unnest documentation built on Jan. 9, 2023, 1:25 a.m.