purrr <-> base R

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4.5,
  fig.align = "center"
)
options(tibble.print_min = 6, tibble.print_max = 6)

modern_r <- getRversion() >= "4.1.0"

Introduction

This vignette compares purrr's functionals to their base R equivalents, focusing primarily on the map family and related functions. It helps those familiar with base R better understand what purrr does, and shows purrr users how to express the same ideas in base R code. We'll start with an overview of the major differences, give a rough translation guide, and then show a few examples.

library(purrr)
library(tibble)

Key differences

There are two primary differences between the base apply family and the purrr map family: purrr functions are named more consistently, and more fully explore the space of input and output variants.
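As one concrete illustration of the naming and argument consistency, the sketch below contrasts argument order: `lapply()` takes the data first, whereas `Map()` and `mapply()` take the function first, while the purrr functions always take the data first. The vectors `x` and `y` are made up for this example.

```r
library(purrr)

# Hypothetical inputs, used only for illustration
x <- 1:3
y <- 4:6

# Base R: lapply() is data-first, Map() is function-first
base_one <- lapply(x, sqrt)
base_two <- Map(function(a, b) a + b, x, y)

# purrr: the data always comes first, the function follows
purrr_one <- map(x, sqrt)
purrr_two <- map2(x, y, function(a, b) a + b)
```

Both pairs of calls produce the same lists; only the calling convention differs.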

Direct translations

The following sections give a high-level translation between base R commands and their purrr equivalents. See function documentation for the details.

Map functions

Here x denotes a vector and f denotes a function.

| Output | Input | Base R | purrr |
|------------------|------------------|------------------|-------------------|
| List | 1 vector | lapply() | map() |
| List | 2 vectors | mapply(), Map() | map2() |
| List | >2 vectors | mapply(), Map() | pmap() |
| Atomic vector of desired type | 1 vector | vapply() | map_lgl() (logical), map_int() (integer), map_dbl() (double), map_chr() (character), map_raw() (raw) |
| Atomic vector of desired type | 2 vectors | mapply(), Map(), then is.*() to check type | map2_lgl() (logical), map2_int() (integer), map2_dbl() (double), map2_chr() (character), map2_raw() (raw) |
| Atomic vector of desired type | >2 vectors | mapply(), Map(), then is.*() to check type | pmap_lgl() (logical), pmap_int() (integer), pmap_dbl() (double), pmap_chr() (character), pmap_raw() (raw) |
| Side effect only | 1 vector | loops | walk() |
| Side effect only | 2 vectors | loops | walk2() |
| Side effect only | >2 vectors | loops | pwalk() |
| Data frame (rbind outputs) | 1 vector | lapply() then rbind() | map_dfr() |
| Data frame (rbind outputs) | 2 vectors | mapply()/Map() then rbind() | map2_dfr() |
| Data frame (rbind outputs) | >2 vectors | mapply()/Map() then rbind() | pmap_dfr() |
| Data frame (cbind outputs) | 1 vector | lapply() then cbind() | map_dfc() |
| Data frame (cbind outputs) | 2 vectors | mapply()/Map() then cbind() | map2_dfc() |
| Data frame (cbind outputs) | >2 vectors | mapply()/Map() then cbind() | pmap_dfc() |
| Any | Vector and its names | l/s/vapply(X, function(x) f(x, names(x))) or mapply/Map(f, x, names(x)) | imap(), imap_*() (lgl, dbl, dfr, etc., just like for map(), map2(), and pmap()) |
| Any | Selected elements of the vector | l/s/vapply(X[index], FUN, ...) | map_if(), map_at() |
| List | Recursively apply to list within list | rapply() | map_depth() |
| List | List only | lapply() | lmap(), lmap_at(), lmap_if() |

Extractor shorthands

Since a common use case for map functions is extracting components from lists, purrr provides a handful of shortcuts for various uses of `[[`.

| Input | base R | purrr |
|-------------------|--------------------------|---------------------------|
| Extract by name | lapply(x, `[[`, "a") | map(x, "a") |
| Extract by position | lapply(x, `[[`, 3) | map(x, 3) |
| Extract deeply | lapply(x, \(y) y[[1]][["x"]][[3]]) | map(x, list(1, "x", 3)) |
| Extract with default value | lapply(x, function(y) tryCatch(y[[3]], error = function(e) NA)) | map(x, 3, .default = NA) |
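To make the shorthands concrete, here is a small sketch on a made-up nested list. The list `x` and its element names (`a`, `b`, `x`, `z`) are hypothetical and chosen only so that each row of the table has something to extract.

```r
library(purrr)

# Hypothetical nested list, used only for illustration
x <- list(
  list(a = 1, b = list(x = c(10, 20, 30))),
  list(a = 2, b = list(x = c(40, 50, 60)))
)

# Extract by name
base_a  <- lapply(x, `[[`, "a")
purrr_a <- map(x, "a")

# Extract deeply: element "b", then "x", then position 3
base_deep  <- lapply(x, function(y) y[["b"]][["x"]][[3]])
purrr_deep <- map(x, list("b", "x", 3))

# Extract with a default value when the element is missing
purrr_default <- map(x, "z", .default = NA)
```

In each pair the base R and purrr calls return the same list; the purrr form just avoids writing the `[[` call or the anonymous function by hand.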

Predicates

Here p, a predicate, denotes a function that returns TRUE or FALSE indicating whether an object fulfills a criterion, e.g. is.character().

| Description | base R | purrr |
|-----------------------------|--------------------|-----------------------|
| Find a matching element | Find(p, x) | detect(x, p) |
| Find position of matching element | Position(p, x) | detect_index(x, p) |
| Do all elements of a vector satisfy a predicate? | all(sapply(x, p)) | every(x, p) |
| Does any element of a vector satisfy a predicate? | any(sapply(x, p)) | some(x, p) |
| Does a list contain an object? | any(sapply(x, identical, obj)) | has_element(x, obj) |
| Keep elements that satisfy a predicate | x[sapply(x, p)] | keep(x, p) |
| Discard elements that satisfy a predicate | x[!sapply(x, p)] | discard(x, p) |
| Negate a predicate function | function(x) !p(x) | negate(p) |
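The rows above can be tried out with a short sketch. The list `x` is a made-up example, and `is.character()` plays the role of the predicate p.

```r
library(purrr)

# Hypothetical mixed list, used only for illustration
x <- list(1L, "a", 2.5, "b")

# Find the first matching element and its position
base_first  <- Find(is.character, x)        # "a"
purrr_first <- detect(x, is.character)
base_pos    <- Position(is.character, x)    # 2
purrr_pos   <- detect_index(x, is.character)

# Do all / any elements satisfy the predicate?
base_all  <- all(sapply(x, is.character))   # FALSE
purrr_all <- every(x, is.character)
base_any  <- any(sapply(x, is.character))   # TRUE
purrr_any <- some(x, is.character)

# Keep or discard elements by predicate
base_keep     <- x[sapply(x, is.character)]
purrr_keep    <- keep(x, is.character)
base_discard  <- x[!sapply(x, is.character)]
purrr_discard <- discard(x, is.character)
```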

Other vector transforms

| Description | base R | purrr |
|-----------------------------|--------------------|-----------------------|
| Accumulate intermediate results of a vector reduction | Reduce(f, x, accumulate = TRUE) | accumulate(x, f) |
| Recursively combine two lists | c(X, Y), but more complicated to merge recursively | list_merge(), list_modify() |
| Reduce a list to a single value by iteratively applying a binary function | Reduce(f, x) | reduce(x, f) |
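As a quick illustration of the reduction rows, the sketch below sums a small integer vector with both APIs. The input `x` is made up for the example, and `+` stands in for the binary function f.

```r
library(purrr)

x <- 1:5

# Reduce to a single value
base_total  <- Reduce(`+`, x)   # 15
purrr_total <- reduce(x, `+`)

# Keep the intermediate results (a running sum)
base_running  <- Reduce(`+`, x, accumulate = TRUE)   # 1 3 6 10 15
purrr_running <- accumulate(x, `+`)
```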

Examples

Varying inputs

One input

Suppose we would like to generate a list of samples of 5 from normal distributions with different means:

means <- 1:4

There's little difference when generating the samples:
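A minimal sketch of the comparison, assuming a fixed seed so that both approaches draw the same numbers: base R uses `lapply()` and purrr uses `map()`. The names `samples_base` and `samples_purrr` are chosen here for illustration.

```r
library(purrr)

means <- 1:4

# Base R: each element of `means` is passed to rnorm() as its `mean` argument
set.seed(2020)
samples_base <- lapply(means, rnorm, n = 5, sd = 1)

# purrr: same call shape, with map() in place of lapply()
set.seed(2020)
samples_purrr <- map(means, rnorm, n = 5, sd = 1)

str(samples_purrr)
```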

Two inputs

Let's make the example a little more complicated by also varying the standard deviations:

means <- 1:4
sds <- 1:4
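With two inputs, one way to write this in base R is `mapply()` with `SIMPLIFY = FALSE` and the fixed argument supplied through `MoreArgs`, while purrr uses `map2()`. This is a sketch under the same fixed-seed assumption; the result names are illustrative.

```r
library(purrr)

means <- 1:4
sds <- 1:4

# Base R: vary mean and sd in parallel, keep n fixed via MoreArgs
set.seed(2020)
samples_base <- mapply(
  rnorm,
  mean = means,
  sd = sds,
  MoreArgs = list(n = 5),
  SIMPLIFY = FALSE
)

# purrr: the two vectors come first, then the function, then fixed arguments
set.seed(2020)
samples_purrr <- map2(means, sds, rnorm, n = 5)
```

In the `map2()` call the elements of `means` and `sds` are passed positionally, so they fill `mean` and `sd` once `n` has been matched by name.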

Any number of inputs

We can make the challenge still more complex by also varying the number of samples:

ns <- 4:1
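For three or more inputs, `mapply()` still works in base R, and purrr switches to `pmap()`, which takes a single list of inputs. The sketch below names the list elements so that they are matched to the arguments of `rnorm()` by name; the result names are illustrative.

```r
library(purrr)

means <- 1:4
sds <- 1:4
ns <- 4:1

# Base R: all varying arguments are passed to mapply() by name
set.seed(2020)
samples_base <- mapply(rnorm, n = ns, mean = means, sd = sds, SIMPLIFY = FALSE)

# purrr: bundle the inputs into one named list
set.seed(2020)
samples_purrr <- pmap(list(n = ns, mean = means, sd = sds), rnorm)

lengths(samples_purrr)
```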

Outputs

Given the samples, imagine we want to compute their means. A mean is a single number, so we want the output to be a numeric vector rather than a list.
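A sketch of both approaches: in base R, `vapply()` takes a template describing the expected type and length of each result, and in purrr the type is encoded in the function name, here `map_dbl()`. The `samples` list is regenerated so that the block stands alone.

```r
library(purrr)

set.seed(2020)
samples <- lapply(1:4, rnorm, n = 5, sd = 1)

# Base R: numeric(1) declares that each result is a single double
means_base <- vapply(samples, mean, numeric(1))

# purrr: map_dbl() errors if any result is not a single double
means_purrr <- map_dbl(samples, mean)
```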

What if we want just the side effect, such as a plot or a file output, but not the returned values?
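In base R this is typically a `for` loop, and in purrr it is `walk()`. The sketch below uses `cat()` as a stand-in side effect so that it runs without a graphics device; a plotting call such as `hist()` could be used in the same position.

```r
library(purrr)

set.seed(2020)
samples <- lapply(1:4, rnorm, n = 5, sd = 1)

# Base R: loop over the elements for the side effect only
for (s in samples) {
  cat("sample mean:", mean(s), "\n")
}

# purrr: walk() calls the function for its side effect
# and returns its input invisibly
result <- walk(samples, \(s) cat("sample mean:", mean(s), "\n"))
```

Because `walk()` returns its input unchanged, it can sit in the middle of a pipeline without interrupting it.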

Pipes

You can join multiple steps together either using the magrittr pipe:

set.seed(2020)
means %>%
  map(rnorm, n = 5, sd = 1) %>%
  map_dbl(median)

Or the base R pipe:

set.seed(2020)
means |> 
  lapply(rnorm, n = 5, sd = 1) |> 
  sapply(median)

(And of course you can mix and match the piping style with either base R or purrr.)

The pipe is particularly compelling when working with longer transformations. For example, the following code splits mtcars up by cyl, fits a linear model, extracts the coefficients, and extracts the first one (the intercept).

mtcars %>% 
  split(mtcars$cyl) %>% 
  map(\(df) lm(mpg ~ wt, data = df)) %>% 
  map(coef) %>% 
  map_dbl(1)
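For comparison, the same pipeline can be sketched in base R with the native pipe (R 4.1.0 or later). `vapply()` with a `double(1)` template stands in for `map_dbl(1)`; the names `intercepts_base` and `intercepts_purrr` are illustrative.

```r
library(purrr)

# Base R version: lapply() for the list steps, vapply() for the final extraction
intercepts_base <- mtcars |>
  split(mtcars$cyl) |>
  lapply(\(df) lm(mpg ~ wt, data = df)) |>
  lapply(coef) |>
  vapply(\(x) x[[1]], double(1))

# purrr version, repeated here so the two results can be compared
intercepts_purrr <- mtcars %>%
  split(mtcars$cyl) %>%
  map(\(df) lm(mpg ~ wt, data = df)) %>%
  map(coef) %>%
  map_dbl(1)

all.equal(intercepts_base, intercepts_purrr)
```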



purrr documentation built on Aug. 10, 2023, 9:08 a.m.