A cornerstone feature of prt is the ability to load a (small) subset of
rows (or columns) from a much larger tabular dataset. To specify such a
subset, an implementation of the base R S3 generic function subset() is
provided, driving the non-standard evaluation (NSE) of an expression within
the context of the data (with semantics similar to those of the base R
implementation for data.frames).
object to be subsetted.
logical expression indicating elements or rows to keep: missing values are taken as false.
expression, indicating columns to select from a data frame.
Logical flag indicating whether the
passed on to
further arguments to be passed to or from other methods.
The environment in which
The functions powering NSE are rlang::enquo(), which quotes the subset and
select arguments, and rlang::eval_tidy(), which evaluates the resulting
expressions. This allows some rlang-specific features to be used, such as
the .data and .env pronouns, or the double-curly brace forwarding operator.
For some example code, please refer to vignette("prt", package = "prt").
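The quote-then-evaluate mechanism described above can be sketched on a plain data.frame. The helper names filter_rows() and filter_rows_fwd() are hypothetical and are not part of prt; the sketch only assumes that rlang is installed and illustrates how enquo(), eval_tidy(), the pronouns and the forwarding operator fit together, not how prt implements row selection internally.

```r
library(rlang)

# Hypothetical helper (not part of prt): NSE-based row selection on a data.frame
filter_rows <- function(df, cond) {
  quo <- enquo(cond)                 # capture the expression and its environment
  keep <- eval_tidy(quo, data = df)  # evaluate with the columns acting as a data mask
  df[!is.na(keep) & keep, , drop = FALSE]  # missing values are treated as FALSE
}

thresh <- 6

# Columns are looked up in the data first, other symbols in the calling environment
by_name <- filter_rows(mtcars, cyl == thresh)

# The pronouns make the lookup location explicit
by_pronoun <- filter_rows(mtcars, .data$cyl == .env$thresh)

identical(by_name, by_pronoun)

# Hypothetical wrapper: the double-curly operator forwards an unquoted argument
filter_rows_fwd <- function(df, cond) filter_rows(df, {{ cond }})

identical(filter_rows_fwd(mtcars, cyl == thresh), by_name)
```

Spelling out the pronouns is mainly useful when a column and a variable in the calling environment share a name, as with cyl in the examples further down.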
While the function subset() quotes the arguments passed as subset and
select, the function subset_quo() can be used to operate on already
quoted expressions. A final noteworthy departure from the base R interface
is the part_safe argument: this logical flag indicates whether it is safe
to evaluate the expression on partitions individually or whether
dependencies between partitions prevent this from yielding correct results.
As it is not straightforward to determine from the expression alone whether
such dependencies exist, the default is FALSE, which in many cases results
in a less efficient resolution of the row selection; it is up to the user
to enable this optimization.
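To see why dependencies between partitions matter, the following base R sketch emulates two partitions by splitting mtcars into two row blocks. This is only an illustration under that assumption: the helper select_per_part() is hypothetical and the code does not use prt or reproduce its internals. A purely row-wise condition selects the same rows either way, whereas a condition involving an aggregate such as mean() is computed per block and need not agree with the result on the full data.

```r
# Emulate two partitions with plain data.frames (assumption: two blocks of 16 rows)
parts <- unname(split(mtcars, rep(1:2, each = 16)))

# Hypothetical helper: apply a row selector to each block, then recombine in order
select_per_part <- function(parts, selector) {
  do.call(rbind, lapply(parts, function(p) p[selector(p), , drop = FALSE]))
}

# Row-wise condition: every row is judged independently of all other rows
row_wise <- function(d) d$cyl == 6
rows_full <- rownames(mtcars[row_wise(mtcars), ])
rows_part <- rownames(select_per_part(parts, row_wise))
identical(rows_full, rows_part)

# Aggregate condition: mean(qsec) depends on which rows are visible
agg <- function(d) d$qsec > mean(d$qsec)
agg_full <- rownames(mtcars[agg(mtcars), ])
agg_part <- rownames(select_per_part(parts, agg))

# Rows on which the two strategies disagree
setdiff(union(agg_full, agg_part), intersect(agg_full, agg_part))
```

Expressions of the first kind are the ones for which enabling part_safe is appropriate; for the second kind, the conservative default avoids silently different results.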
dat <- as_prt(mtcars, n_chunks = 2L)

subset(dat, cyl == 6)
subset(dat, cyl == 6 & hp > 110)

colnames(subset(dat, select = mpg:hp))
colnames(subset(dat, select = -c(vs, am)))

sub_6 <- subset(dat, cyl == 6)

thresh <- 6
identical(subset(dat, cyl == thresh), sub_6)
identical(subset(dat, cyl == .env$thresh), sub_6)

cyl <- 6
identical(subset(dat, cyl == cyl), data.table::as.data.table(dat))
identical(subset(dat, cyl == !!cyl), sub_6)
identical(subset(dat, .data$cyl == .env$cyl), sub_6)

expr <- quote(cyl == 6)

# passing a quoted expression to subset() will yield an error
## Not run:
subset(dat, expr)
## End(Not run)

identical(subset_quo(dat, expr), sub_6)

identical(
  subset(dat, qsec > mean(qsec), part_safe = TRUE),
  subset(dat, qsec > mean(qsec), part_safe = FALSE)
)