
dance


Dancing 💃 with the stats, aka tibble() dancing 🕺. dance is a reinvention of the classic dplyr verbs on a more modern stack: it leans heavily on vctrs and rlang.

Installation

You can install the development version from GitHub.

# install.packages("pak")
pak::pkg_install("romainfrancois/dance")

Usage

We'll illustrate tibble dancing with iris grouped by Species.

library(dance)
g <- iris %>% group_by(Species)

waltz(), polka(), tango(), charleston()

These are in the neighborhood of dplyr::summarise().

waltz() takes a grouped tibble and a list of formulas and returns a tibble with as many columns as supplied formulas and one row per group. It does not prepend the grouping variables (see tango() for that).

g %>% 
  waltz(
    Sepal.Length = ~mean(Sepal.Length), 
    Sepal.Width  = ~mean(Sepal.Width)
  )

polka() deals with peeling off one layer of grouping:

g %>% 
  polka()

tango() binds the results of polka() and waltz(), so it is the closest to dplyr::summarise():

g %>% 
  tango(
    Sepal.Length = ~mean(Sepal.Length), 
    Sepal.Width  = ~mean(Sepal.Width)
  )
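For readers coming from dplyr, the relationship between the three verbs can be sketched without dance at all. The block below uses only dplyr and assumes that polka() corresponds roughly to the grouping keys and waltz() to the summaries without them; `by_species`, `keys`, `stats` and `tango_like` are names made up for this illustration, not part of dance.

```r
library(dplyr)

by_species <- iris %>% group_by(Species)

# Grouping keys only: one row per group (assumed analogue of polka()).
keys <- group_keys(by_species)

# Summaries without the grouping column (assumed analogue of waltz()).
stats <- by_species %>%
  summarise(
    Sepal.Length = mean(Sepal.Length),
    Sepal.Width  = mean(Sepal.Width)
  ) %>%
  select(-Species)

# Binding the two side by side gives the familiar summarise() shape.
tango_like <- bind_cols(keys, stats)
tango_like
```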

charleston() is like tango(), but it packs the new columns in a tibble:

g %>% 
  charleston(
    Sepal.Length = ~mean(Sepal.Length), 
    Sepal.Width  = ~mean(Sepal.Width)
  )
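"Packing" here means that several results live inside one data frame column. A minimal dplyr-only sketch of that shape, assuming dplyr 1.0.0 or later, is shown below; the column name `data` is an arbitrary choice for the example and may not match what charleston() produces.

```r
library(dplyr)

packed <- iris %>%
  group_by(Species) %>%
  summarise(
    # A named data frame result becomes a single data frame column.
    # The name `data` is an arbitrary choice for this sketch.
    data = tibble(
      Sepal.Length = mean(Sepal.Length),
      Sepal.Width  = mean(Sepal.Width)
    )
  )

names(packed)       # top-level columns: the key and the packed column
names(packed$data)  # the summaries live inside the packed column
```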

swing, twist

There is no waltz_at(), tango_at(), etc. Instead, we can apply either the same function to a set of columns or a set of functions to the same column.

For this, we need to learn new dance moves:

swing() and twist() are for applying the same function to a set of columns:

library(tidyselect)

g %>% 
  tango(swing(mean, starts_with("Petal")))

g %>% 
  tango(data = twist(mean, starts_with("Petal")))

They differ in the type of column they create and in how the results are named:

g %>% 
  tango(
    swing(mean, starts_with("Petal"), .name = "mean_{var}"), 
    swing(median, starts_with("Petal"), .name = "median_{var}")
  )

g %>% 
  tango(
    mean   = twist(mean, starts_with("Petal")), 
    median = twist(median, starts_with("Petal"))
  )
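The difference between separate columns and one packed column can also be seen with plain dplyr. This is a comparison sketch, not dance code: it assumes dplyr 1.0.0 or later, uses across() with its own `.names` template (which differs from the `.name` argument above), and the labels "swing-like" and "twist-like" are only an analogy.

```r
library(dplyr)

by_species <- iris %>% group_by(Species)

# Separate top-level columns, named from a glue template (swing-like).
spread <- by_species %>%
  summarise(across(starts_with("Petal"), mean, .names = "mean_{.col}"))

# One packed data frame column holding the results (twist-like).
packed <- by_species %>%
  summarise(mean = across(starts_with("Petal"), mean))

names(spread)
names(packed)
names(packed$mean)
```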

The first argument of swing() and twist() is either a function or a formula that uses . as a placeholder. Subsequent arguments are tidyselect selections.
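To make the placeholder convention concrete, here is how rlang turns such a formula into a function. Whether dance calls rlang::as_function() internally is an assumption; the sketch only shows what the `.` placeholder means, and `center` is a name invented for the example.

```r
library(rlang)

# A one-sided formula becomes a function whose first argument is `.`
center <- as_function(~ . - mean(.))

# Subtracts the mean of the input from every element.
center(c(1, 2, 3))
```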

You can combine swing() and twist() in the same tango() or waltz():

g %>% 
  tango(
    swing(mean, starts_with("Petal"), .name = "mean_{var}"), 
    median = twist(median, contains("."))
  )

rumba, zumba

Similarly, rumba() and zumba() apply several functions to a single column. rumba() creates separate columns, and zumba() packs them into a data frame column.

g %>% 
  tango(
    rumba(Sepal.Width, mean = mean, median = median, .name = "Sepal_{fun}"), 
    Petal = zumba(Petal.Width, mean = mean, median = median)
  )
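The same "several functions, one column" idea looks like this in plain dplyr (1.0.0 or later assumed). It is a comparison sketch rather than dance code: `stat_fns` and `res` are illustrative names, and the `{.fn}` template belongs to across(), not to rumba().

```r
library(dplyr)

# Named list of summary functions applied to a single column.
stat_fns <- list(mean = mean, median = median)

res <- iris %>%
  group_by(Species) %>%
  summarise(
    # Separate columns Sepal_mean and Sepal_median (rumba-like).
    across(Sepal.Width, stat_fns, .names = "Sepal_{.fn}"),
    # One packed column Petal with inner columns mean and median (zumba-like).
    Petal = across(Petal.Width, stat_fns, .names = "{.fn}")
  )

names(res)
names(res$Petal)
```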

salsa, chacha, samba, madison

Now we enter the realm of dplyr::mutate() with salsa():

g %>% 
  salsa(
    Sepal = ~Sepal.Length * Sepal.Width, 
    Petal = ~Petal.Length * Petal.Width
  )

You can use swing(), twist(), rumba() and zumba() here too. If you also want the original data, use samba() instead of salsa():

g %>% 
  samba(centered = twist(~ . - mean(.), everything(), -Species))
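For comparison, group-wise centering into a packed column can be written with dplyr alone (1.0.0 or later assumed). This is an approximation of the call above, not a claim about how samba() works; in particular, the row order of the result may differ from what dance returns.

```r
library(dplyr)

centered <- iris %>%
  group_by(Species) %>%
  # across() never selects grouping columns, so Species is left out.
  mutate(centered = across(everything(), ~ .x - mean(.x)))

names(centered)           # original columns plus the packed column
names(centered$centered)  # one centered column per numeric input
```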

madison() packs the columns salsa() would have created:

g %>% 
  madison(swing(~ . - mean(.), starts_with("Sepal")))

bolero and mambo

bolero() is similar to dplyr::filter(). The formulas can be generated by mambo() if you want to apply the same predicate to a tidyselect selection of columns:

g %>% 
  bolero(~Sepal.Width > 4)

g %>% 
  bolero(mambo(~. > 4, starts_with("Sepal")))

g %>% 
  bolero(mambo(~. > 4, starts_with("Sepal"), .op = or))
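The two ways of combining a predicate across columns map onto if_all() and if_any() in dplyr (1.0.4 or later assumed). The sketch below is a dplyr-only comparison; it assumes that mambo() combines with "and" unless `.op = or` is given, which the examples above suggest but do not state.

```r
library(dplyr)

by_species <- iris %>% group_by(Species)

# Every selected column must satisfy the predicate.
all_above <- by_species %>%
  filter(if_all(starts_with("Sepal"), ~ .x > 4))

# At least one selected column must satisfy the predicate.
any_above <- by_species %>%
  filter(if_any(starts_with("Sepal"), ~ .x > 4))

nrow(all_above)
nrow(any_above)
```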


romainfrancois/dance documentation built on Nov. 21, 2019, 11:49 a.m.