In EvaMaeRey/tidypivot: Tidy table calculations with visual api

class: inverse, left, bottom
background-image: url(https://images.unsplash.com/photo-1543286386-713bdd548da4?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1470&q=80)
background-size: cover

.Large[tidypivot]

.small[Some ideas]

.tiny[Dr. Evangeline Reynolds | 2022-04-26 | Image credit: William Iven, Unsplash]

??? Title slide

```r
# This is the recommended setup for flipbooks.
# Consider setting cache to TRUE as you gain practice;
# building flipbooks from scratch can be time consuming.
knitr::opts_chunk$set(fig.width = 6, message = FALSE, warning = FALSE, comment = "", cache = FALSE)
library(flipbookr)
library(tidyverse)
```

```{css, eval = TRUE, echo = FALSE}
.remark-code{line-height: 1.5; font-size: 100%}

@media print {
  .has-continuation {
    display: block;
  }
}

code.r.hljs.remark-code{
  position: relative;
  overflow-x: hidden;
}

code.r.hljs.remark-code:hover{
  overflow-x: visible;
  width: 500px;
  border-style: solid;
}
```

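# Step 00. Prep some data: a record-level data frame and a flat data frame

```r
library(tidyverse)
library(magrittr)

# one row per passenger (counts expanded into individual records)
Titanic %>% 
  data.frame() %>% 
  uncount(weights = Freq) ->
tidy_titanic ; tidy_titanic %>% head()

# one row per combination of categories, with a Freq column
Titanic %>% 
  data.frame() ->
flat_titanic ; flat_titanic %>% head()
```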

# Step 0. Some observations

ggplot2: the user needs to describe the layout of the graphic...

???

Lots of data science is counting things...

Less experience in this world

But a graphic may actually look a lot like a table!

The horizontal and vertical position variables are both categorical...

You can make a visual pivot table in ggplot2; the analyst's job is to describe the form: how will it look?

Specify 3 things, starting with the visual layout:

- specify x
- specify y
- specify a count-type geom

tidy_titanic %>% 
  ggplot() + 
  aes(x = Sex, y = Survived) + 
  geom_jitter() + 
  geom_count(color = "blue")

But sometimes you may actually not need a graphic, but rather the raw numbers.

What does getting there look like using the tidyverse tools?

tidy_titanic %>% 
  group_by(Sex, Survived) %>% 
  summarize(count = n()) %>% 
  pivot_wider(names_from = Survived, 
              values_from = count)

With existing pivot tools, the description isn't so visual:

- specify variables
- specify aggregation
- specify visual arrangement (names from?) <- this comes last

Can we make a more visual, declarative API?

# Step 1a. Make functions that allow description of the final table: pivot_count and pivot_calc

The cols argument gives the horizontal elements (columns) and the rows argument gives the vertical elements (rows).

pivot_count_script <- readLines("./R/pivot_count.R")

pivot_calc_script <- readLines("./R/pivot_calc.R")
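
The package source read in above is not reproduced here. As a rough illustration of the idea, the following is a minimal sketch of how a rows/cols counting function could be written with `{{ }}` tidy evaluation and `rlang::quo_is_null()`. The name `pivot_count_sketch` and its body are assumptions made for illustration; this is not the code in `./R/pivot_count.R`.

```r
library(dplyr)
library(tidyr)

# Hypothetical sketch: NOT the package's pivot_count() implementation.
pivot_count_sketch <- function(data, rows = NULL, cols = NULL, pivot = TRUE) {
  cols_quo <- rlang::enquo(cols)

  # count records for every observed combination of the row and column variables
  counted <- data %>%
    group_by(across(c({{ rows }}, {{ cols }}))) %>%
    summarise(value = n(), .groups = "drop")

  # spread the column variables across the top unless long form was requested
  if (pivot && !rlang::quo_is_null(cols_quo)) {
    counted <- counted %>%
      pivot_wider(names_from = !!cols_quo, values_from = value, values_fill = 0L)
  }

  counted
}

# record-level Titanic data, built the same way as in Step 00
tidy_titanic <- Titanic %>%
  data.frame() %>%
  uncount(weights = Freq)

sketch_wide <- pivot_count_sketch(tidy_titanic, rows = Survived, cols = Sex)
sketch_long <- pivot_count_sketch(tidy_titanic, rows = Survived, cols = Sex, pivot = FALSE)
```

In this sketch, combinations that never occur in the data are absent from the long form; only the wide form fills them in with zero.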

# Step 1b. Using those functions

# rows and cols
tidy_titanic %>% 
  pivot_count(rows = Survived, cols = Sex) 

# cols only
tidy_titanic %>% 
   pivot_count(cols = Sex)

# rows only
tidy_titanic %>% 
  pivot_count(rows = Survived) 

# two rows and col
tidy_titanic %>% 
  pivot_count(rows = c(Survived, Class), cols = Sex)

# two rows and two cols; the table contains zero counts
tidy_titanic %>% 
  pivot_count(rows = c(Survived, Class), cols = c(Sex, Age))

# the same table in long form
tidy_titanic %>% 
  pivot_count(rows = c(Survived, Class), cols = c(Sex, Age), pivot = FALSE)

# count all
tidy_titanic %>% 
   pivot_count()

# for fun, arrange the code the way the table will appear visually
tidy_titanic %>% 
  pivot_count(                          cols = Sex, 
              rows = c(Survived, Class)        )

After examining your table, you might actually want to have the calculation in long form (for use in something like ggplot2). This is what pivot = FALSE is for!

tidy_titanic %>% 
  pivot_count(cols = Sex, rows = Survived, pivot = FALSE)

# Step 1b. pivot_calc: using the pivot_calc function for non-count aggregation

Just for fun, arrange the code the way the table will look.

flat_titanic %>%
  pivot_calc(              cols = Sex, 
             rows = Survived, value = Freq, fun = sum)

flat_titanic %>% 
  pivot_count(cols = Sex, 
             rows = Survived, wt = Freq)

Issue: For this case, we should probably use pivot_count and allow for a wt argument.
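
One way to see why a wt argument fits here: summing Freq over the flat data gives the same numbers as counting rows in the record-level data. The check below uses `dplyr::count()` rather than tidypivot's own functions, so it only illustrates the equivalence and says nothing about how the package implements it.

```r
library(dplyr)
library(tidyr)

flat_titanic <- data.frame(Titanic)
tidy_titanic <- uncount(flat_titanic, weights = Freq)

# weighted count on the flat data: sums Freq within each group
weighted <- flat_titanic %>%
  count(Sex, Survived, wt = Freq, name = "value")

# plain count on the record-level data: one row per passenger
unweighted <- tidy_titanic %>%
  count(Sex, Survived, name = "value")

same_counts <- isTRUE(all.equal(weighted$value, unweighted$value))
same_counts
```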

# Step 1b, style. Use another tool to style

The goal of these functions is not to style, just to make calculation faster by using a visually driven API.

tidy_titanic %>%  
  pivot_count(cols = Sex, rows = c(Survived, Class)) %>% 
  group_by(Class) %>% 
  gt::gt()

tidy_titanic %>% 
  pivot_count(cols = Sex, rows = c(Survived, Class, Age)) %>% 
  group_by(Age) %>% 
  gt::gt()

# Back to Step 0, observations: using existing tools to calculate proportions is a many-step process

Feels like lots of gymnastics... a vis-first approach is what we are after.

tidy_titanic %>% 
  group_by(Sex, Survived) %>% 
  summarize(value = n()) %>% 
  group_by(Sex) %>% 
  mutate(prop = value/sum(value)) %>% 
  select(-value) %>% 
  pivot_wider(values_from = prop, names_from = Sex)

# Step 2a. Build a function where visual arrangement leads

pivot_prop_script <- readLines("./R/pivot_prop.R")
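
The contents of `./R/pivot_prop.R` are not shown on this page. As a hedged illustration only, a proportion function with a within argument could look something like the sketch below: count first, then divide by the total inside each within group. The name `pivot_prop_sketch` and its body are assumptions, not the package's implementation.

```r
library(dplyr)
library(tidyr)

# Hypothetical sketch: NOT the package's pivot_prop() implementation.
pivot_prop_sketch <- function(data, rows = NULL, cols = NULL,
                              within = NULL, pivot = TRUE) {
  cols_quo <- rlang::enquo(cols)

  props <- data %>%
    # count each observed combination of row and column variables
    group_by(across(c({{ rows }}, {{ cols }}))) %>%
    summarise(value = n(), .groups = "drop") %>%
    # proportions are taken within the groups named by `within`
    group_by(across({{ within }})) %>%
    mutate(prop = value / sum(value)) %>%
    ungroup() %>%
    select(-value)

  # spread the column variables across the top unless long form was requested
  if (pivot && !rlang::quo_is_null(cols_quo)) {
    props <- props %>%
      pivot_wider(names_from = !!cols_quo, values_from = prop)
  }

  props
}

tidy_titanic <- Titanic %>%
  data.frame() %>%
  uncount(weights = Freq)

# survival proportions within each passenger class: each class column sums to 1
by_class <- pivot_prop_sketch(tidy_titanic, rows = Survived,
                              cols = Class, within = Class)
```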

# Step 2b. Using pivot_prop

tidy_titanic %>% 
  pivot_prop(rows = Survived, cols = Class, within = Class)

tidy_titanic %>% 
  pivot_prop(rows = c(Survived, Sex), 
             cols = Class, 
             within = Survived)

tidy_titanic %>% 
  pivot_prop(rows = c(Survived, Sex), 
             cols = Class, 
             within = c(Survived, Sex))

tidy_titanic %>% 
  pivot_prop(rows = c(Survived, Sex), 
             cols = Class, 
             within = Survived, pivot = FALSE) %>% 
  ggplot() +
  aes(x = Class, y = Sex) +
  facet_grid(rows = vars(Survived)) +
  geom_tile() +
  aes(fill = prop) + 
  geom_text(aes(label = prop %>% round(3)))

tidy_titanic %>% 
  pivot_prop(rows = c(Survived, Sex), 
             cols = Class, 
             within = c(Survived, Sex), pivot = FALSE) %>% 
  ggplot() +
  aes(x = Class, y = 1) +
  facet_grid(rows = vars(Survived, Sex)) +
  geom_tile() +
  aes(fill = prop) + 
  geom_text(aes(label = prop %>% round(3)))

# Reflections, questions, issues

- Does this already exist?
- ~~How can the API improve? Possibly rows = vars(y00, y0, y), cols = vars(x), and within = vars(?, ?). This requires more digging into tidy eval. What about multiple x vars?~~ These changes were implemented thanks to Brian and Shannon.
- ~~How can the internals improve? Also, the tidy eval used is old, I think. Defaults for missing data.~~ Now using new {{}} tidy eval, within and across, and rlang::quo_is_null(), thanks to Brian.
- What about summary columns and rows (column totals, etc.)? Maybe not very tidy... maybe allowed with an error message.
- Ideas about a different API, more like ggplot2, where you would specify new dimensions of your table after piping. This would require a function that creates a non-data-frame object. Not sure about consistency with the dplyr/tidyr tools, which just return data frames/tibbles. Being consistent with that expectation might be a good thing. It is also less challenging to implement.
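
On the question of summary rows and columns: one option that keeps the pivot functions returning plain data frames is to append totals as a separate post-processing step. The sketch below does this with dplyr on a wide count table built with `count()` and `pivot_wider()`; it is an illustration of that option, not a tidypivot feature.

```r
library(dplyr)
library(tidyr)

tidy_titanic <- Titanic %>%
  data.frame() %>%
  uncount(weights = Freq)

# a plain wide count table: one row per Survived level, one column per Sex
wide_counts <- tidy_titanic %>%
  count(Survived, Sex) %>%
  pivot_wider(names_from = Sex, values_from = n)

# add a totals column; the label column becomes character so "Total" can join it
with_col_total <- wide_counts %>%
  mutate(Survived = as.character(Survived),
         Total = Male + Female)

# build a totals row by summing every numeric column
total_row <- with_col_total %>%
  summarise(across(where(is.numeric), sum)) %>%
  mutate(Survived = "Total")

with_totals <- bind_rows(with_col_total, total_row)
```

The result is no longer tidy in the strict sense, since the totals row mixes summaries with observations, which is one reason to keep this step outside the pivot functions.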

library(ggstamp)
set.seed(128790)
ggcanvas() + 
  stamp_tile(x = rnorm(500)/2.5, 
             y= rnorm(500)/2.5, 
             fill = "darkslateblue",
             alpha = runif(500, .25, 1),
             width = rnorm(500)/3,
             height = rnorm(500)/3, 
             color = alpha("black", .75)) + 
  stamp_point(size = 8, color = "grey85") + 
  stamp_spoke(angle = pi, size = 1.7, 
              radius = .65, color = "grey85") +
  stamp_point(size = 5, color = "grey85") + 
  stamp_point(size = 3, color = "grey55") + 

  stamp_polygon_holes(y0 = .25) + 
  stamp_text(label = "tidy", hjust = 1,
             color = "violetred", 
               vjust = -.2, size = 17) + 
  stamp_text(label = "pivot", hjust = -.1,
             angle = 0:25, color = "violetred", 
               vjust = -.2, size = 17, alpha = c(0:25/400)) + 
  stamp_spoke(angle = .45*0:25/25, size = 1.7, 
              radius = .85, color = "grey55", alpha = c(0:25/400)) +
  stamp_spoke(angle = .45, size = 1.7, 
              radius = .85, color = "grey55") +
  stamp_text(label = "pivot", hjust = -.1,
             angle = 25, color = "violetred", 
               vjust = -.2, size = 17) +
  ggstamp::theme_void_fill(fill = "darkslateblue") + 
  stamp_polygon(color = "lightslateblue", 
                alpha = 0, y0 = .25, size = 4)

# Other work in this space

- janitor::tabyl
- pivottabler

janitor::tabyl

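tidy_titanic %>% 
  janitor::tabyl(Sex, Survived) %>% 
  # convert counts to row proportions so the percent formatting is meaningful
  janitor::adorn_percentages("row") %>% 
  janitor::adorn_pct_formatting()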

EvaMaeRey/tidypivot documentation built on Feb. 27, 2025, 4:04 a.m.
