tidytable is a data frame manipulation library for users who need data.table speed but prefer tidyverse-like syntax.
Install the released version from CRAN with:
install.packages("tidytable")
Or install the development version from GitHub with:
# install.packages("pak")
pak::pak("markfairbanks/tidytable")
tidytable replicates tidyverse syntax but uses data.table in the background. In general you can simply use library(tidytable) to replace your existing dplyr and tidyr code with data.table-backed equivalents.
A full list of implemented functions can be found here.
library(tidytable)

df <- data.table(x = 1:3, y = 4:6, z = c("a", "a", "b"))

df %>%
  select(x, y, z) %>%
  filter(x < 4, y > 1) %>%
  arrange(x, y) %>%
  mutate(double_x = x * 2,
         x_plus_y = x + y)
You can use the normal tidyverse group_by()/ungroup() workflow, or you can use .by syntax to reduce typing. Using .by in a function is shorthand for df %>% group_by() %>% some_function() %>% ungroup().
Pass a single column with .by = z, or several columns with .by = c(y, z).
df <- data.table(x = c("a", "a", "b"), y = c("a", "a", "b"), z = 1:3)

df %>%
  summarize(avg_z = mean(z),
            .by = c(x, y))
All functions that can operate by group (mutate(), filter(), summarize(), etc.) have a .by argument built in.
The above syntax is equivalent to:
df %>%
  group_by(x, y) %>%
  summarize(avg_z = mean(z)) %>%
  ungroup()
Both options are available for users, so you can use the syntax that you prefer.
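
As a further illustration of .by in verbs other than summarize(), here is a minimal sketch that uses it with mutate() and filter(). The data and the result names (with_avg, top_rows) are made up for this example.

```r
library(tidytable)

# Illustrative data; the column names are assumptions for this sketch
df <- data.table(x = c("a", "a", "b"), y = c("a", "a", "b"), z = 1:3)

# Add a per-group mean of z while keeping every row
with_avg <- df %>%
  mutate(avg_z = mean(z), .by = x)

# Keep only the row(s) with the largest z within each group of x
top_rows <- df %>%
  filter(z == max(z), .by = x)

with_avg
top_rows
```

In both calls the grouping applies only to that one step, so no ungroup() is needed afterwards.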
tidytable allows you to select/drop columns just like you would in the tidyverse by utilizing the tidyselect package in the background.

Normal selection can be mixed with all tidyselect helpers: everything(), starts_with(), ends_with(), any_of(), where(), etc.
df <- data.table(
  a = 1:3,
  b1 = 4:6,
  b2 = 7:9,
  c = c("a", "a", "b")
)

df %>%
  select(a, starts_with("b"))
A full overview of selection options can be found here.
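
A few of the other helpers named above can be sketched in the same way. This is only an illustration on made-up data. The result names are hypothetical, and "not_a_column" is deliberately a name that does not exist in the data.

```r
library(tidytable)

df <- data.table(
  a = 1:3,
  b1 = 4:6,
  b2 = 7:9,
  c = c("a", "a", "b")
)

# where() selects columns by a predicate on their contents
numeric_cols <- df %>%
  select(where(is.numeric))

# any_of() tolerates names that are not present in the data
existing_only <- df %>%
  select(any_of(c("a", "not_a_column")))

# A leading minus drops columns instead of keeping them
without_b <- df %>%
  select(-starts_with("b"))
```

any_of() is the helper to reach for when the list of column names comes from elsewhere and some of them may be missing.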
tidyselect helpers also work when using .by:
df <- data.table(x = c("a", "a", "b"), y = c("a", "a", "b"), z = 1:3)

df %>%
  summarize(avg_z = mean(z),
            .by = where(is.character))
Tidy evaluation can be used to write custom functions with tidytable functions.

The embracing shortcut {{ }} works, or you can use enquo() with !! if you prefer:
df <- data.table(x = c(1, 1, 1), y = 4:6, z = c("a", "a", "b"))

add_one <- function(data, add_col) {
  data %>%
    mutate(new_col = {{ add_col }} + 1)
}

df %>%
  add_one(x)
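
The enquo() with !! alternative could look roughly like the sketch below. The function name add_one_enquo is hypothetical, and enquo() is called through rlang explicitly so the example does not depend on which rlang functions tidytable re-exports.

```r
library(tidytable)

df <- data.table(x = c(1, 1, 1), y = 4:6, z = c("a", "a", "b"))

add_one_enquo <- function(data, add_col) {
  # Capture the column expression supplied by the caller
  add_col <- rlang::enquo(add_col)

  # Unquote it inside mutate(); the parentheses make the precedence explicit
  data %>%
    mutate(new_col = (!!add_col) + 1)
}

result <- df %>%
  add_one_enquo(x)
```

This should behave the same as the {{ }} version, since embracing is shorthand for capturing with enquo() and unquoting with !!.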
The .data and .env pronouns also work within tidytable functions:
var <- 10

df %>%
  mutate(new_col = .data$x + .env$var)
A full overview of tidy evaluation can be found here.
dt() helper

The dt() function makes regular data.table syntax pipeable, so you can easily mix tidytable syntax with data.table syntax:
df <- data.table(x = 1:3, y = 4:6, z = c("a", "a", "b"))

df %>%
  dt(, .(x, y, z)) %>%
  dt(x < 4 & y > 1) %>%
  dt(order(x, y)) %>%
  dt(, double_x := x * 2) %>%
  dt(, .(avg_x = mean(x)), by = z)
For those interested in performance, speed comparisons can be found here.
tidytable is only possible because of the great contributions to R by the data.table and tidyverse teams. data.table is used as the main data frame engine in the background, while tidyverse packages like rlang, vctrs, and tidyselect are heavily relied upon to give users an experience similar to dplyr and tidyr.