dtplyr

The tidyverse has made it far easier to do data work in R. But there are still times when performance matters, and for tabular data that's where data.table really shines.

For some reason, however, there seems to be an either/or mentality when it comes to picking between data.table and dplyr. I don't buy it. I'd rather have the best of both worlds.

That's why I made these handy adapters. Stick with your tidy data flow, but go ahead and cherry-pick some performance wins when you need them.

Installation

NOTE: This is a lie right now. This package has not been released on CRAN, so the command below would not install it. Dare to dream.

You can install the released version of dtplyr from CRAN with:

install.packages("dtplyr")

Development version

Until then, you can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("mcskinner/dtplyr")

Usage

Aggregation with dt_agg() uses the lean syntax you're used to from data.table:

library(dtplyr)
dt_agg(iris, mean(Sepal.Length), Species)
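For comparison, here is roughly what that call might look like in plain data.table. This is a sketch under an assumption: that dt_agg(data, expr, by) maps its arguments onto data.table's DT[, j, by] form. It is not taken from the package source, the iris_dt and agg names are only for illustration, and the adapter itself may do more (such as converting the result back to a data frame).

```r
library(data.table)

# Assumed plain data.table equivalent of
# dt_agg(iris, mean(Sepal.Length), Species)
iris_dt <- as.data.table(iris)

# j = mean(Sepal.Length), grouped by Species; no row filter in the i slot
agg <- iris_dt[, mean(Sepal.Length), by = Species]
agg
```

The unnamed j expression comes back in a column that data.table calls V1.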

You'll notice that dt_agg() skipped the row filter, the i in data.table's DT[i, j, by]. That's exposed by dt_op() if you need it back:

dt_op(iris, Sepal.Width >= 3, mean(Sepal.Length), Species)
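A plain data.table version of the same operation might look like the sketch below. It assumes dt_op(data, filter, expr, by) lines up with the i, j, and by slots of DT[i, j, by]; that mapping is inferred from the argument order here, not confirmed against the package source, and the variable names are illustrative.

```r
library(data.table)

# Assumed plain data.table equivalent of
# dt_op(iris, Sepal.Width >= 3, mean(Sepal.Length), Species)
iris_dt <- as.data.table(iris)

# i = row filter, j = aggregate expression, by = grouping column
filtered_agg <- iris_dt[Sepal.Width >= 3, mean(Sepal.Length), by = Species]
filtered_agg
```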

And why not leverage the high-performance merge routine as well?

colors <- data.frame(
  Species = c("setosa", "versicolor", "virginica"),
  Color = c("baby blue", "lavender", "purple")
)
head(dt_merge(iris, colors))
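For reference, the underlying data.table merge can be sketched directly. This assumes dt_merge() converts both inputs to data.tables and joins them on their shared column; that is a guess at the behavior, not the package's documented implementation. The species_colors and merged names are illustrative.

```r
library(data.table)

# Lookup table mapping each species to a color (same values as above)
species_colors <- data.table(
  Species = factor(c("setosa", "versicolor", "virginica")),
  Color = c("baby blue", "lavender", "purple")
)

# merge() dispatches to data.table's merge method when x is a data.table;
# the join column is named explicitly rather than left to the default
merged <- merge(as.data.table(iris), species_colors, by = "Species")
head(merged)
```

One difference from base merge() worth knowing: the data.table method sorts the result by the join column by default.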


mcskinner/dtplyr documentation built on Nov. 4, 2019, 6:23 p.m.