query_list: Data validation queries across a list of data frames

View source: R/query_list.R

query_list    R Documentation

Description

Find observations matching a query that concerns one or more data frames within a list of data frames, and return tidy, stackable output. This works like query, but allows query expressions that reference variables in multiple data frames.

If the query expression references variables from data frames (i.e. list elements) other than the focal element, the relevant variable(s) will be joined to the focal element before the query expression is evaluated; see the join_type and join_by arguments below.
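
The join-then-evaluate behaviour described above can be sketched roughly as follows. This is a minimal illustration in base R, not the package's implementation (which, per the join_type argument, is based on dplyr joins). The data frames, the column names (id, date_visit, date_exit) and the condition are hypothetical.

```r
# Hypothetical list of data frames; "visits" plays the role of the focal element
x <- list(
  visits = data.frame(
    id = c(1, 2, 3),
    date_visit = as.Date(c("2021-03-01", "2021-03-05", "2021-03-09"))
  ),
  exits = data.frame(
    id = c(1, 2),
    date_exit = as.Date(c("2021-02-20", "2021-03-10"))
  )
)

# Left join: keep every row of the focal element, adding the referenced variable
joined <- merge(x$visits, x$exits[, c("id", "date_exit")], by = "id", all.x = TRUE)

# Inner join: keep only focal rows that have a match in the other element
joined_inner <- merge(x$visits, x$exits, by = "id")

# Evaluate the condition on the joined data; which() drops rows where the
# comparison is NA (here, id 3 has no exit date)
matches <- joined[which(joined$date_visit > joined$date_exit), ]
```

With a left join, focal rows that have no match in the other data frame are kept with missing values for the joined variable; with an inner join they are dropped before the condition is evaluated.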

Usage

query_list(
  x,
  cond,
  element,
  cols_base,
  join_type = "left",
  join_by,
  pivot_long = TRUE,
  pivot_var = "variable",
  pivot_val = "value",
  as_chr = TRUE
)

Arguments

x

A list of data frames

cond

Expression to evaluate with respect to one or more variables in one or more of the data frames within x.

element

Name or integer index of the focal list element of x for the given query. If the query expression cond references variables from list elements apart from element, the relevant variable(s) will be joined to x[[element]] before the query expression is evaluated, based on the join_type and join_by arguments described below.

cols_base

(Optional) Tidy-selection of other columns within x[[element]] to retain in the output. Can optionally be set for an entire session using option "queryr_cols_base", e.g. options(queryr_cols_base = quote(id:site)).

join_type

The type of join used to combine the relevant elements when cond references variables within elements of x other than x[[element]]. Options are "left" (the default) and "inner". Based on dplyr join types.

join_by

A character vector of variables to join by. If the join key columns have different names in x[[element]] and x[[other]], use a named vector. For example, join_by = c("a" = "b") will match x[[element]]$a to x[[other]]$b.

pivot_long

Logical indicating whether to pivot the variables referenced within cond to a long (i.e. stackable) format, with default column names "variable1", "value1", "variable2", "value2", ... Defaults to TRUE.

pivot_var

Prefix for pivoted variable column(s). Defaults to "variable". Only used if pivot_long = TRUE.

pivot_val

Prefix for pivoted value column(s). Defaults to "value". Only used if pivot_long = TRUE.

as_chr

Logical indicating whether to coerce the columns referenced in the query expression cond to character prior to returning. This enables row-binding multiple queries with variables of different classes, but is only important if pivot_long = TRUE. Defaults to TRUE.
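
To illustrate why as_chr matters for stacking, the sketch below builds two hypothetical long-format results by hand (they are not actual query_list output) whose value columns have different classes, then coerces them to character before row-binding. Strict row-binding tools such as dplyr::bind_rows() typically refuse to combine a numeric column with a Date column, which is the situation the coercion avoids.

```r
# Hypothetical long-format results from two separate queries
q1 <- data.frame(id = 1, variable1 = "age", value1 = 212)
q2 <- data.frame(id = 2, variable1 = "date_exit", value1 = as.Date("2021-02-20"))

# Coerce the value columns to character so both results share one column class
q1$value1 <- as.character(q1$value1)
q2$value1 <- as.character(q2$value1)

# The results can now be stacked into a single data frame
stacked <- rbind(q1, q2)
```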

Value

A data frame reflecting the rows of x[[element]] that match the given query. Returned columns include:

  • (optional) columns matched by argument cols_base

  • columns referenced within the query expression (pivoted to long form by default)
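
Examples

A hedged usage sketch based on the signature shown under Usage. The example data and column names are hypothetical, and the call assumes that cond and cols_base are passed as bare (unquoted) expressions. The call is only run if the queryr package is installed.

```r
# Hypothetical list of data frames with differently named join keys
x <- list(
  visits = data.frame(
    id = c("P1", "P2", "P3"),
    date_visit = as.Date(c("2021-03-01", "2021-03-05", "2021-03-09"))
  ),
  exits = data.frame(
    patient_id = c("P1", "P2"),
    date_exit = as.Date(c("2021-02-20", "2021-03-10"))
  )
)

if (requireNamespace("queryr", quietly = TRUE)) {
  # Find visits dated after the patient's exit date.
  # Assumption: cond and cols_base accept bare expressions.
  res <- queryr::query_list(
    x,
    cond = date_visit > date_exit,
    element = "visits",
    cols_base = id,
    join_type = "left",
    join_by = c("id" = "patient_id")  # x$visits$id matched to x$exits$patient_id
  )
  print(res)
}
```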


epicentre-msf/queryr documentation built on July 17, 2025, 12:22 a.m.