seplyr

The R package seplyr supplies improved standard evaluation interfaces for some common data plying tasks.

To install this package in R, please either install from CRAN with:

   install.packages('seplyr')

or from GitHub:

   devtools::install_github('WinVector/seplyr')

In dplyr, if you know the names of the columns when you are writing code, you can write code such as the following.

suppressPackageStartupMessages(library("dplyr"))
packageVersion("dplyr")

datasets::mtcars %>% 
  arrange(cyl, desc(gear)) %>% 
  head()

In dplyr 0.7.*, if the names of the columns are coming from a variable set elsewhere, you would need to use a tool to substitute those names in. One such tool is rlang/tidyeval (though we strongly prefer seplyr).

# Assume this is set elsewhere,
# supplied by a user, function argument, or control file.
orderTerms <- c('cyl', 'desc(gear)')
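For comparison, one way to perform the sort directly with rlang/tidyeval might look like the following. This is a minimal sketch rather than the definitive idiom; the names orderExprs and sortedCars are introduced here purely for illustration.

```r
library("dplyr")

# Assumed to be set elsewhere, as above.
orderTerms <- c('cyl', 'desc(gear)')

# Parse each string into an unevaluated R expression.
orderExprs <- lapply(orderTerms, rlang::parse_expr)

# Splice the list of expressions into arrange() with the !!! operator.
sortedCars <- datasets::mtcars %>%
  arrange(!!!orderExprs)

head(sortedCars)
```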

If you don't want to try to digest the entire theory of quasi-quoting and splicing (the !!! operator), then you can use seplyr, which conveniently and legibly wraps the operations as follows:

library("seplyr")

datasets::mtcars %.>% 
  arrange_se(., orderTerms) %.>% 
  head(.)

The idea is: the above code looks very much like simple dplyr code used when running an analysis, and yet it is very easy to parameterize and re-use in a script or package.


seplyr::arrange_se() performs the wrapping for you without you having to work through the details of rlang. If you are interested in the details, seplyr itself is a good tutorial. For example, you can examine seplyr's implementation to see the necessary notations (using a command such as print(arrange_se)). And, of course, we try to supply some usable help entries, such as: help(arrange_se). Some more discussion of the ideas can be found here.

The current set of SE adapters includes (all commands of the form NAME_se() being adapters for a dplyr::NAME() method):

Only two of the above are completely redundant. seplyr::group_by_se() essentially works as dplyr::group_by_at() and seplyr::select_se() essentially works as dplyr::select_at(). The others either have different semantics or currently (as of dplyr 0.7.1) have no matching dplyr::*_at() method. Roughly, all seplyr is trying to do is give a uniform first-class standard interface to all of the primary deprecated underscore-suffixed verbs (such as dplyr::arrange_).
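The overlap between group_by_se() and group_by_at() can be checked with a small comparison. This is only a sketch: groupCols, viaSeplyr, and viaDplyr are illustrative names, and the check compares grouping columns only, not every detail of the two results.

```r
library("dplyr")
library("seplyr")

# Hypothetical grouping columns, supplied as strings.
groupCols <- c("cyl", "gear")

# Grouping through the seplyr adapter.
viaSeplyr <- datasets::mtcars %.>%
  group_by_se(., groupCols)

# Grouping through the dplyr *_at() variant.
viaDplyr <- datasets::mtcars %>%
  group_by_at(groupCols)

# Both results should report the same grouping columns.
identical(group_vars(viaSeplyr), group_vars(viaDplyr))
```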

We also have a few methods that work around a few of the minor inconveniences of working with variable names as strings:

Here is an example using seplyr::summarize_se().

datasets::iris %.>%
  group_by_se(., "Species") %.>%
  summarize_se(., c("Mean.Sepal.Length" := "mean(Sepal.Length)", 
                    "Mean.Sepal.Width" := "mean(Sepal.Width)"))
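Because the summary terms are just named strings, they can be built ahead of time and passed in as a variable. The sketch below assumes that an ordinary named character vector is an acceptable substitute for the := notation; summaryTerms and irisSummary are illustrative names.

```r
library("seplyr")

# Hypothetical terms: names are the new columns, values are the expressions.
summaryTerms <- c(
  "Mean.Sepal.Length" = "mean(Sepal.Length)",
  "Mean.Sepal.Width"  = "mean(Sepal.Width)"
)

irisSummary <- datasets::iris %.>%
  group_by_se(., "Species") %.>%
  summarize_se(., summaryTerms)

print(irisSummary)
```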

In addition to the series of adapters we also supply a number of useful new verbs including:

seplyr is designed to be a thin package that passes work to dplyr. If you want a package that works around dplyr implementation differences on different data sources, we suggest trying our own replyr package. Another alternative is using wrapr::let().
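As a rough illustration of the wrapr::let() alternative, the sketch below substitutes a column name held in a variable into ordinary dplyr code. The placeholder ORDER_COLUMN and the names orderColumn and sortedByLet are made up for this example.

```r
library("dplyr")

# Assumed to be supplied elsewhere (user input, function argument, etc.).
orderColumn <- "cyl"

# let() replaces the placeholder symbol ORDER_COLUMN with the
# column name stored in orderColumn before evaluating the expression.
sortedByLet <- wrapr::let(
  c(ORDER_COLUMN = orderColumn),
  datasets::mtcars %>%
    arrange(ORDER_COLUMN)
)

head(sortedByLet)
```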

seplyr methods are short and have examples in their help, so always try both help and printing the method (for example: help(select_se) and print(select_se)). Printing methods can show you how to use dplyr directly with rlang/tidyeval methodology (allowing you to skip seplyr).

Some inspiration comes from Sebastian Kranz's s_dplyr. Please see help("%.>%", package="wrapr") for details on "dot pipe."




seplyr documentation built on Sept. 5, 2021, 5:12 p.m.