
withDT

withDT features two functions, withDT() and `%DT>%`(), which make data.table syntax available for single calls without further class or attribute housekeeping.

Some benefits are :

We believe this package can help integrate some powerful data.table features into any workflow with minimal confusion.

Installation

Install with:

# install.packages("devtools")
devtools::install_github("moodymudskipper/withDT")

Assignments by reference

An important particularity of withDT() is that assignments are never done by reference. Though limiting, this avoids the confusion and unintended behavior that can come with them.
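As a minimal sketch of what "by reference" means here, the snippet below contrasts plain data.table with withDT(). It assumes both packages are installed; the object names dt, df and res are only illustrative, and the withDT() part simply restates the behavior described in this README.

```r
library(data.table)
library(withDT)

# Plain data.table: `:=` modifies the object in place, no `<-` is needed.
dt <- as.data.table(iris)
dt[, a := 3]
dt_modified <- "a" %in% names(dt)   # the original object gained column `a`

# withDT(): the same syntax is evaluated on a copy, so, as described in
# this README, the input should be left untouched and the result returned.
df <- iris
res <- withDT(df[, a := 3])
df_modified <- "a" %in% names(df)   # expected to stay FALSE per the README
res_has_a   <- "a" %in% names(res)
```

With plain data.table the new column appears in dt itself, whereas with withDT() it is only expected in the returned value res.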

The syntax of these assignments is still supported but will return a copy. In order to fail explicitly whenever that syntax is used, the argument lock can be set to TRUE, or the withDT.lock option can be set with options(withDT.lock = TRUE).
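The option-based variant could look like the sketch below. It assumes withDT() reads the withDT.lock option at call time, as the paragraph above suggests; the tryCatch() wrapper and the names old and msg are only there for illustration, so that the error can be inspected without stopping the script.

```r
library(withDT)

# Turn the lock on for the whole session, keeping the previous value.
old <- options(withDT.lock = TRUE)

# With the lock on, the `:=` syntax is expected to raise an error;
# we capture its message instead of letting it abort the script.
msg <- tryCatch(
  withDT(iris[, a := 3]),
  error = function(e) conditionMessage(e)
)
msg

# Restore the previous option value.
options(old)
```

Setting the option once is convenient when every call in a script should be locked, while the lock argument is better suited to individual calls.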

Examples

withDT()

library(withDT)

We can use standard data.table syntax and the output will have the same class as the input.

iris2 <- withDT(iris[, .(meanSW = mean(Sepal.Width)), by = Species][,cbind(.SD, a =3)])
iris2
#>      Species meanSW a
#> 1     setosa  3.428 3
#> 2 versicolor  2.770 3
#> 3  virginica  2.974 3
class(iris2)
#> [1] "data.frame"

The following also works, but wouldn't have the same output with standard data.table code due to the way assignments by reference are handled.

However, we think that in these cases the behavior of withDT() is closer to what most users would expect.

iris3 <- withDT(iris[, .(meanSW = mean(Sepal.Width)), by = Species][,a:=3])
identical(iris2,iris3)
#> [1] TRUE

We see that iris wasn't modified:

class(iris)
#> [1] "data.frame"
names(iris)
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
#> [5] "Species"

To trigger an error when this syntax is used, we can set lock to TRUE:

iris4 <- withDT(lock=TRUE,iris[, .(meanSW = mean(Sepal.Width)), by = Species][,b:=3])
#> Error in value[[3L]](cond): Syntax of assignment by reference is forbidden when `lock` is TRUE

%DT>%

The %DT>% pipe is another way to use data.table's power and syntax. It can in fact be more efficient than withDT().

It can be used on simple calls:

iris %DT>% .[, .(meanSW = mean(Sepal.Width)), by = Species]
#>      Species meanSW
#> 1     setosa  3.428
#> 2 versicolor  2.770
#> 3  virginica  2.974

It can also be used as part of a pipe chain with magrittr's %>% operator. For example, we can mix data.table and tidyverse operations:

library(tibble)
library(dplyr, warn.conflicts = FALSE)
iris %>%
  as_tibble() %DT>%
  .[, .(meanSW = mean(Sepal.Width)), by = Species][
    ,Species := as.character(Species)] %>%
  filter(startsWith(Species,"v"))
#> # A tibble: 2 x 2
#>   Species    meanSW
#>   <chr>       <dbl>
#> 1 versicolor   2.77
#> 2 virginica    2.97

Some caveats

A workaround to the latter is to include the following chunk at the top of your document :

```{r, echo = FALSE}
environment(withDT) <- .GlobalEnv
```


moodymudskipper/withDT documentation built on Nov. 4, 2019, 7:29 p.m.