The R package seplyr supplies improved standard evaluation interfaces for some common data data plying tasks.
To install this packing in R please either install from CRAN with:
install.packages('seplyr')
or from GitHub:
devtools::install_github('WinVector/seplyr')
In dplyr if you know the names of columns when you are writing code you can write code such as the following.
suppressPackageStartupMessages(library("dplyr")) packageVersion("dplyr") datasets::mtcars %>% arrange(cyl, desc(gear)) %>% head()
In dplyr 0.7.* if the names of the columns are coming from a variable set elsewhere you would to need to use a tool to substitute those names in. One such tool is rlang/tidyeval (though we strongly prefer seplyr.
# Assume this is set elsewhere, # supplied by a user, function argument, or control file. orderTerms <- c('cyl', 'desc(gear)')
If you don't want to try and digest entire theory of quasi-quoting and splicing (the !!! operator) then you can use seplyr which conveniently and legibly wraps the operations as follows:
library("seplyr") datasets::mtcars %.>% arrange_se(., orderTerms) %>% head(.)
The idea is: the above code looks very much like simple dplyr code used running an analysis, and yet is very easy to parameterize and re-use in a script or package.
seplyr::arrange_se() performs the wrapping for you without you having to work through the details of rlang. If you are interested in the details seplyr itself is a good tutorial. For example you can examine seplyr's implementation to see the necessary notations (using a command such as print(arrange_se)). And, of course, we try to supply some usable help entries, such as: help(arrange_se). Some more discussion of the ideas can be found here.
The current set of SE adapters includes (all commands of the form NAME_se() being adapters for a dplyr::NAME() method):
add_count_se()add_tally_se()arrange_se()count_se()distinct_se()filter_se()group_by_se()group_indices_se()mutate_se()rename_se()select_se()summarize_se()tally_se()transmute_se()Only two of the above are completely redundant. seplyr::group_by_se() essentially works as dplyr::group_by_at() and seplyr::select_se() essentially works as dplyr::select_at(). The others either have different semantics or currently (as of dplyr 0.7.1) no matching dplyr::*_at() method. Roughly all seplyr is trying to do is give a uniform first-class standard interface to all of the primary deprecated underscore suffixed verbs (such as dplyr::arrange_).
We also have a few methods that work around a few of the minor inconvenience of working with variable names as strings:
deselect()rename_mp()Here is a example using seplyr::summarize_se().
datasets::iris %.>% group_by_se(., "Species") %.>% summarize_se(., c("Mean.Sepal.Length" := "mean(Sepal.Length)", "Mean.Sepal.Width" := "mean(Sepal.Width)"))
In addition to the series of adapters we also supply a number of useful new verbs including:
group_summarize() Binds grouping, arrangement, and summarization together for clear documentation of intent.add_group_summaries() Adds per-group summaries to data.add_group_indices() Adds a column of per-group ids to data.add_group_sub_indices() Adds a column of in-group rank ids to data.add_rank_indices() Adds rank indices to data.partition_mutate_se(): vignette, and article.if_else_device(): article.seplyr is designed to be a thin package that passes work to dplyr. If you want a package that works around dplyr implementation differences on different data sources I suggest trying our own replyr package. Another alternative is using wrapr::let().
seplyr methods are short and have examples in their help, so always try both help and printing the method (for example: help(select_se) and print(select_se)). Printing methods can show you how to use dplyr directly with rlang/tidyeval methodology (allowing you to skip seplyr).
Some inspiration comes from Sebastian Kranz's s_dplyr.
Please see help("%.>%", package="wrapr") for details on "dot pipe."
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.