A small package with list-tibble join operations.

Motivation

There no native strictly key-pair data structure to R. There is, however, a sufficently good first-approach which is a named list.

{
  ids <- paste0(sample(letters, size = 1000, replace = TRUE), 
                sample(1:30, 4, replace = TRUE))
  named_list <- purrr::map(1:1000, function(.x) rnorm(10))
  names(named_list) <- ids
  head(named_list)
}

And this approach can get you far-ish.

```{R messages = FALSE, warning = FALSE} library(tidyverse)

pluck(named_list, sample(ids, 1))

update specific values

ids_to_update <- sample(ids, 5)

modify_at(named_list, ids_to_update, function(x) 2*x - 3) -> updated_list

sample(named_list, 200) %>% # recover ids given condition keep(~ mean(.x) < -.5) %>% names()

And by that I mean this far.

```{R, error = TRUE}
(tibble(key = sample(ids, 100),
       val = runif(100)) ->
  keyval_tibble)

left_join(keyval_tibble)
left_join(keyval_tibble, named_list, by = character(), copy = TRUE)

lisjoin, as of now, provides a lazy prototypical approach:

```{R, error = TRUE, message = FALSE, warning = FALSE} library(lisjoin)

lisjoin(keyval_tibble, named_list, .key = key)

Right now `lisjoin` supports key-guessing, type stable, left/right/inner/full joins. For example, since we know the output is going to be a double-precision number:

```{R, error = TRUE, message = FALSE, warning = FALSE}
library(lisjoin)

map(named_list,
    ~ reduce(.x, sum)) %>% # reducing list
  lisjoin(keyval_tibble, 
          .,
          .key = key,
          type = 'dbl')

This feature is still under work, in the future expect it to work with length > 1 on the list's elements, with automatic unnesting.

Long term goals

This package is partly inspired primarely by clojure's map structure, purrr's filosophy and design. In the long term I see it as a tool allowing tidy data workflows on lists.



pedrocava/lisjoin documentation built on June 29, 2021, 6:22 p.m.