lolplyr: dplyr extension for working with lists-of-lists

lolplyr implements dplyr functions for lists-of-lists and lists-of-vectors. It is designed for data in which each observation is an element of the outer list and each variable is an element of the nested list. In R, such structures arise, for example, when working with data parsed from JSON.
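
To make the structure concrete, here is a minimal base R sketch of a list-of-lists. The `people` records and their fields are made up for illustration and are not part of lolplyr; they only mimic the shape that parsed JSON typically has.

```r
# Hypothetical records, shaped like parsed JSON: one inner list per observation.
people <- list(
  list(name = "Ann", age = 31, city = "Oslo"),
  list(name = "Bob", age = 45, city = "Lima"),
  list(name = "Cy",  age = 27, city = "Oslo")
)

# Each top-level element is one observation ...
length(people)        # number of observations
names(people[[1]])    # variables of the first observation

# ... and a "column" is recovered by pulling the same field from every element.
ages <- vapply(people, function(obs) obs$age, numeric(1))
ages
```

Extracting a variable this way takes an explicit loop over observations, which is the kind of bookkeeping a dplyr-like interface is meant to hide.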

The lolplyr package implements functions such as select, filter, mutate, rename, and summarise, along with the joins implemented in dplyr: left_join, right_join, full_join, semi_join, and anti_join. Additionally, it offers mutate_elem and transmute_elem as element-wise analogs of mutate and transmute. Calling group_by before summarise returns summaries aggregated within ad-hoc groups in the data. The package uses dplyr-like syntax, so it should feel intuitive to dplyr users.

The lolplyr package uses the rlang, purrr, and dplyr packages as its main back-end.

Installation

# install.packages("devtools")
devtools::install_github("twolodzko/lolplyr")

Usage

library(lolplyr)

# transform a data.frame into a list-of-lists
mtc_lol <- as_lol(mtcars)

str(mtc_lol, list.len = 4)

## List of 32
##  $ Mazda RX4          :List of 11
##   ..$ mpg : num 21
##   ..$ cyl : num 6
##   ..$ disp: num 160
##   ..$ hp  : num 110
##   .. [list output truncated]
##  $ Mazda RX4 Wag      :List of 11
##   ..$ mpg : num 21
##   ..$ cyl : num 6
##   ..$ disp: num 160
##   ..$ hp  : num 110
##   .. [list output truncated]
##  $ Datsun 710         :List of 11
##   ..$ mpg : num 22.8
##   ..$ cyl : num 4
##   ..$ disp: num 108
##   ..$ hp  : num 93
##   .. [list output truncated]
##  $ Hornet 4 Drive     :List of 11
##   ..$ mpg : num 21.4
##   ..$ cyl : num 6
##   ..$ disp: num 258
##   ..$ hp  : num 110
##   .. [list output truncated]
##   [list output truncated]
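
For readers curious what such a conversion involves, the following is a rough base R sketch that produces a similarly shaped object. `df_to_lol` is a hypothetical helper written for this illustration; it is not part of lolplyr and is not claimed to be how `as_lol` works internally.

```r
# Base R sketch: one inner list per row, named by the row names.
# `df_to_lol` is a hypothetical helper, not part of lolplyr.
df_to_lol <- function(df) {
  rows <- lapply(seq_len(nrow(df)), function(i) {
    # as.list() on a one-row data.frame yields one named element per column
    as.list(df[i, , drop = FALSE])
  })
  setNames(rows, rownames(df))
}

cars_lol <- df_to_lol(mtcars)
length(cars_lol)               # 32 observations
names(cars_lol[["Datsun 710"]])  # the 11 variable names
```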

mtc_lol %>%
  filter(cyl == 4, mpg > 20) %>%
  mutate(
    wt100 = wt * 100
  ) %>%
  summarise(
    max_gear = max(gear),
    max_gear_div4 = max_gear / 4,
    mean_disp = mean(disp),
    mean(wt),
    mean(wt100),
    range_mpg = range(mpg)
  )

## $max_gear
## [1] 5
## 
## $max_gear_div4
## [1] 1.25
## 
## $mean_disp
## [1] 105.1364
## 
## $`mean(wt)`
## [1] 2.285727
## 
## $`mean(wt100)`
## [1] 228.5727
## 
## $range_mpg
## [1] 21.4 33.9

mtc_lol %>%
  group_by(gear) %>%
  summarise(
    mean(mpg),
    summary(cyl)
  )

## [[1]]
## [[1]]$gear
## [1] 4
## 
## [[1]]$`mean(mpg)`
## [1] 24.53333
## 
## [[1]]$`summary(cyl)`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.000   4.000   4.000   4.667   6.000   6.000 
## 
## 
## [[2]]
## [[2]]$gear
## [1] 3
## 
## [[2]]$`mean(mpg)`
## [1] 16.10667
## 
## [[2]]$`summary(cyl)`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.000   8.000   8.000   7.467   8.000   8.000 
## 
## 
## [[3]]
## [[3]]$gear
## [1] 5
## 
## [[3]]$`mean(mpg)`
## [1] 21.38
## 
## [[3]]$`summary(cyl)`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       4       4       6       6       8       8 
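
For comparison, a grouped summary over a list-of-lists can be sketched in base R alone. This is only an illustration of the split-then-summarise idea under the assumption that each observation is a plain named list; it is not lolplyr's implementation, and the names `cars`, `groups`, and `by_gear` are introduced here.

```r
# Build a list-of-lists from mtcars using base R only.
cars <- lapply(seq_len(nrow(mtcars)), function(i) as.list(mtcars[i, ]))

# Group: split the observations by their value of `gear`.
gear_of <- vapply(cars, function(obs) obs$gear, numeric(1))
groups <- split(cars, gear_of)

# Summarise: one result list per group.
by_gear <- lapply(groups, function(grp) {
  mpg <- vapply(grp, function(obs) obs$mpg, numeric(1))
  list(gear = grp[[1]]$gear, mean_mpg = mean(mpg))
})

str(by_gear)
```

Note that `split` orders the groups by sorted key (3, 4, 5), whereas the lolplyr output above lists them as 4, 3, 5.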



twolodzko/lolplyr documentation built on May 14, 2019, 8:22 a.m.