ys_filter: Subset yspec items using column values

ys_filter R Documentation

Description

The intended use is to subset based on variables defined in the dots list; however, some internal column data is also available to query.

Usage

ys_filter(x, expr, .default = NULL, .enclos = parent.frame())

Arguments

x

A yspec object.

expr

An unquoted expression.

.default

A named list or environment containing defaults for expr; consider using ys_fill_dots() as an alternative to passing .default.

.enclos

An enclosing environment for evaluating expr.

Details

The following fields always exist in the spec and are available for querying in the filter expression:

  • col: column name <character>

  • type: data type <character>; either numeric, character, or integer

  • discrete: discrete data flag <logical>; yspec sets this to TRUE when the values field is populated

  • continuous: continuous data flag <logical>; yspec sets this to TRUE when the range field is populated

  • short: the short name <character>

  • do_lookup: lookup indicator; yspec sets this to TRUE when some or all of the column data is defined by an external lookup file

The following fields will be provided defaults when the filter expression is evaluated:

  • unit: as specified by the user <character>; default value is ""

  • covariate: as specified by the user in dots <logical>; default value is FALSE

In addition to these fields, you can build the filter expression using items in the dots field.

Value

A yspec object containing only the columns for which expr evaluated to TRUE.

Evaluation environment

To determine whether a column is selected, ys_filter() builds an environment and evaluates expr in that environment. A column is selected only if expr evaluates to TRUE (as tested by isTRUE()).

The environment comprises the pre-existing data items in the spec (e.g. col or short; these items are always present), data items in the enclosing environment (.enclos; this defaults to parent.frame()), the .default list (passed by the user at run time), and the dots list associated with each column.
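The evaluation scheme described above can be sketched in base R. This is only an illustration under stated assumptions, not the yspec implementation: the cols list is a hypothetical stand-in for a spec, filter_sketch() is a hypothetical helper, and time_varying is a made-up dots item. The sketch assumes that dots items take precedence over .default, which follows from defaults being used only when a variable isn't available in dots.

```r
# Hypothetical stand-in for a spec: each column is a plain list with a few
# fields plus a `dots` list. This is NOT the real yspec object structure.
cols <- list(
  WT  = list(col = "WT",  type = "numeric",   dots = list(time_varying = TRUE)),
  SEX = list(col = "SEX", type = "character", dots = list(time_varying = FALSE)),
  DV  = list(col = "DV",  type = "numeric",   dots = list())
)

# Hypothetical helper (not part of yspec) that mimics the described scheme.
filter_sketch <- function(cols, expr, .default = list(),
                          .enclos = parent.frame()) {
  expr <- substitute(expr)  # capture the unquoted expression
  force(.enclos)            # resolve the caller's frame before going deeper
  keep <- vapply(cols, function(x) {
    # Start from the defaults, then let column fields and dots override them.
    data <- utils::modifyList(as.list(.default), x[names(x) != "dots"])
    data <- utils::modifyList(data, x$dots)
    # Names not found in `data` are looked up in the enclosing environment.
    # Anything other than a single TRUE means "do not select".
    isTRUE(eval(expr, envir = data, enclos = .enclos))
  }, logical(1))
  cols[keep]
}

names(filter_sketch(cols, time_varying, .default = list(time_varying = FALSE)))
# "WT"

names(filter_sketch(cols, type == "numeric" & !time_varying,
                    .default = list(time_varying = FALSE)))
# "DV"
```

In this sketch, DV has no time_varying item in dots, so the value from .default is what allows the expression to be evaluated for that column.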

Users are encouraged to filter on logical data items in dots that are set through the flags field in SETUP__. When flags are set, every column is given a logical data item, so expr can always be evaluated for every column. This is the safest and simplest approach and should be the target usage.

For more complicated applications, users can filter on other data items in dots, which may or may not be logical. The user could enter a data item into dots for every column to ensure that it is available when expr is evaluated, but that might not be very convenient. In that case, pass a .default list whose values are used when the filter variable isn't available in dots. As an alternative to using .default, the user can run the spec object through ys_fill_dots(), which fills in dots data items with default values only where they don't already exist. This is probably more convenient, but the user is warned that these data items in dots stay with the yspec object for the life of the object.
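The fill-in alternative and its caveat can be illustrated with a small base R sketch. It assumes a hypothetical list-based stand-in for a spec; fill_dots_sketch() is a hypothetical helper written for this illustration, not the ys_fill_dots() implementation, and time_varying and pk are made-up dots items.

```r
# Hypothetical stand-in for a spec; NOT the real yspec object structure.
cols <- list(
  WT = list(col = "WT", dots = list(time_varying = TRUE)),
  DV = list(col = "DV", dots = list())
)

# Hypothetical helper: add default dots items only where they are missing.
fill_dots_sketch <- function(cols, ...) {
  defaults <- list(...)
  lapply(cols, function(x) {
    missing_items <- setdiff(names(defaults), names(x$dots))
    x$dots[missing_items] <- defaults[missing_items]
    x
  })
}

filled <- fill_dots_sketch(cols, time_varying = FALSE, pk = FALSE)

filled$WT$dots$time_varying  # TRUE: the existing value is not overwritten
filled$DV$dots$time_varying  # FALSE: the default was filled in
names(filled$DV$dots)        # "time_varying" "pk"
```

Unlike a list of defaults supplied only at filter time, the filled-in values become part of the returned object, which mirrors the warning that such items stay with the object for its lifetime.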

See Also

ys_rename(), ys_join(), ys_select(), ys_fill_dots()

Examples

spec <- ys_help$spec()

ys_filter(spec, is.character(decode))

ys_filter(spec, unit == "kg" | type == "character")

ys_filter(spec, covariate)


metrumresearchgroup/yspec documentation built on Dec. 22, 2024, 1:37 a.m.