over2: Apply functions to two vectors simultaneously in 'dplyr'

View source: R/over2.R

Description

over2() and over2x() are variants of over() that iterate over two objects simultaneously. over2() applies one or more functions to each pair of elements in .x and .y, while over2x() applies one or more functions to all pairwise combinations of elements in .x and .y.
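The two iteration patterns can be sketched in base R. This is only an analogue for illustration, not how dplyover implements over2() or over2x(); the function f and the input vectors are made up for this sketch.

```r
# Base R analogue only; dplyover's internals may differ.
x <- c(1, 2)
y <- c(10, 20)
f <- function(x, y) x + y  # hypothetical function standing in for .fns

# over2()-style: element i of x is paired with element i of y
pairwise <- mapply(f, x, y)
pairwise
# 11 22

# over2x()-style: every element of x is combined with every element of y.
# expand.grid() varies its first column fastest, so y changes within each x.
grid <- expand.grid(y = y, x = x)
crossed <- mapply(f, grid$x, grid$y)
crossed
# 11 21 12 22
```

The pairwise pattern yields one result per position, so both inputs need the same length; the crossed pattern yields length(x) * length(y) results.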

Usage

over2(.x, .y, .fns, ..., .names = NULL, .names_fn = NULL)

over2x(.x, .y, .fns, ..., .names = NULL, .names_fn = NULL)

Arguments

.x, .y

An atomic vector or list to apply functions to. Alternatively, a <selection helper> can be used to create a vector. over2() requires .x and .y to be of the same length.

.fns

Functions to apply to each of the elements in .x and .y.

Possible values are:

  • A function

  • A purrr-style lambda

  • A list of functions/lambdas

For examples, see the Examples section below.

Note that NULL is not accepted as an argument to .fns.

...

Additional arguments for the function calls in .fns.

.names

A glue specification that describes how to name the output columns. This can use {x} and {y} to stand for the selected vector element, and {fn} to stand for the name of the function being applied. The default (NULL) is equivalent to "{x}_{y}" for the single function case and "{x}_{y}_{fn}" for the case where a list is used for .fns.

Note that, depending on the nature of the underlying object in .x and .y, specifying {x}/{y} will yield different results:

  • If .x/.y is an unnamed atomic vector, {x}/{y} will represent each value.

  • If .x/.y is a named list or atomic vector, {x}/{y} will represent each name.

  • If .x/.y is an unnamed list, {x}/{y} will be the index number running from 1 to length(.x) or length(.y) respectively.

This standard behavior (interpretation of {x}/{y}) can be overridden by directly specifying:

  • {x_val} or {y_val} for .x's or .y's values

  • {x_nm} or {y_nm} for their names

  • {x_idx} or {y_idx} for their index numbers

Alternatively, a character vector of length equal to the number of columns to be created can be supplied to .names. Note that in this case, the glue specification described above is not supported.
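The default interpretation of {x}/{y} described above can be sketched as a small base R helper. The function resolve_label() is hypothetical and not part of dplyover; it only mirrors the three documented rules and ignores edge cases such as partially named inputs.

```r
# Hypothetical helper for illustration; not part of dplyover.
resolve_label <- function(v) {
  if (!is.null(names(v))) {
    names(v)                    # named list or atomic vector: use the names
  } else if (is.atomic(v)) {
    as.character(v)             # unnamed atomic vector: use the values
  } else {
    as.character(seq_along(v))  # unnamed list: use the index numbers
  }
}

from_values <- resolve_label(c("new", "existing"))
from_names  <- resolve_label(list(b1 = 3:8, b2 = seq(3, 9, by = 2)))
from_index  <- resolve_label(list(3:8, seq(3, 9, by = 2)))

from_values  # "new" "existing"
from_names   # "b1" "b2"
from_index   # "1" "2"
```

Under these rules, a named list such as brks in the Examples section would contribute "b1" and "b2" to the default column names, which is why that example switches to {x_idx} to get plain index numbers.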

.names_fn

Optionally, a function that is applied after the glue specification in .names has been evaluated. This is, for example, helpful in case the resulting names need to be further cleaned or trimmed.
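As a rough illustration of the kind of function that could be supplied here, the base R sketch below cleans a vector of already-built names. The input values and the helper clean_names() are made up for this example, and paste() only stands in for the "{x}_{y}" glue specification.

```r
# Made-up values; paste() stands in for the "{x}_{y}" glue specification.
x_vals <- c("new customer", "existing customer")
y_vals <- c("Basic Plan", "Premium Plan")
raw_names <- paste(x_vals, y_vals, sep = "_")

# Hypothetical cleaning function: replace whitespace runs with "_" and lower-case
clean_names <- function(nm) tolower(gsub("\\s+", "_", nm))

cleaned <- clean_names(raw_names)
cleaned
# "new_customer_basic_plan" "existing_customer_premium_plan"
```

A function of this shape, taking a character vector of names and returning a character vector of the same length, is what .names_fn is described as applying after the glue specification has been evaluated.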

Value

over2() returns a tibble with one column for each pair of elements in .x and .y combined with each function in .fns.

over2x() returns a tibble with one column for each combination between elements in .x and .y combined with each function in .fns.
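The resulting number of columns follows directly from these two descriptions. The helpers below are hypothetical and exist only to make the arithmetic explicit; the inputs correspond to the two examples in the Examples section.

```r
# Hypothetical helpers for illustration; not part of dplyover.
n_cols_over2  <- function(n_x, n_fns = 1) n_x * n_fns
n_cols_over2x <- function(n_x, n_y, n_fns = 1) n_x * n_y * n_fns

# Two break/label pairs, one function
cols_cut <- n_cols_over2(2)
cols_cut
# 2

# Three customer types, three products, one function
cols_dummy <- n_cols_over2x(3, 3)
cols_dummy
# 9
```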

Examples

For the basic functionality please refer to the examples in over().

library(dplyr)
library(dplyover)

# For better printing
iris <- as_tibble(iris)

When doing exploratory analysis, it is often helpful to transform continuous variables into several categorical variables. Below we use over2() to loop over two lists containing "breaks" and "labels" arguments, which we then use in a call to cut():

brks <- list(b1 = 3:8,
             b2 = seq(3, 9, by = 2))

labs <- list(l1 = c("3 to 4", "4 to 5", "5 to 6",
                   "6 to 7", "7 to 8"),
            l2 = c("3 to 5", "5 to 7", "7 to 9"))

iris %>%
  transmute(over2(brks, labs,
                  ~ cut(Sepal.Length,
                        breaks = .x,
                        labels = .y),
                  .names = "Sepal.Length.cut{x_idx}"))
#> # A tibble: 150 x 2
#>   Sepal.Length.cut1 Sepal.Length.cut2
#>   <fct>             <fct>            
#> 1 5 to 6            5 to 7           
#> 2 4 to 5            3 to 5           
#> 3 4 to 5            3 to 5           
#> 4 4 to 5            3 to 5           
#> # ... with 146 more rows

over2x() makes it possible to create dummy variables for interaction effects of two variables. In the example below, each customer 'type' is combined with each 'product' type:

csat %>%
  transmute(over2x(unique(type),
                   unique(product),
                   ~ type == .x & product == .y)) %>%
  glimpse()
#> Rows: 150
#> Columns: 9
#> $ existing_advanced   <lgl> TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FA~
#> $ existing_premium    <lgl> FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, ~
#> $ existing_basic      <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,~
#> $ reactivate_advanced <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, ~
#> $ reactivate_premium  <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,~
#> $ reactivate_basic    <lgl> FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, ~
#> $ new_advanced        <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,~
#> $ new_premium         <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,~
#> $ new_basic           <lgl> FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, F~

TimTeaFan/dplyover documentation built on Sept. 27, 2021, 3:14 p.m.