query2 | R Documentation |
Find observations matching a query that concerns two data frames, and return tidy, stackable output. This entails three steps:
1. Separately query each of the two data frames using query()
2. Combine the resulting query outputs based on a given join type (semi, anti, left, or inner)
3. Execute a third query on the joined output
Each of the query steps is optional: unspecified query expressions are replaced with TRUE, so that all rows of the relevant input are returned.
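The three steps can be sketched in base R as follows. This is only an illustration of the concept, not the package's implementation: the toy data frames d1 and d2 and their columns are hypothetical, and base subsetting and merge() stand in for query() and the dplyr joins.

```r
# Hypothetical toy data (not the package's example datasets)
d1 <- data.frame(id = 1:4, age = c(25, 40, 61, 33))
d2 <- data.frame(
  id = c(2L, 3L, 4L, 5L),
  status = c("confirmed", "probable", "confirmed", "confirmed")
)

# Step 1: query each data frame separately
q1 <- d1[d1$age > 30, , drop = FALSE]          # condition on d1
q2 <- d2[rep(TRUE, nrow(d2)), , drop = FALSE]  # no condition: TRUE keeps all rows

# Step 2: combine the two query outputs (here, an inner join on id)
joined <- merge(q1, q2, by = "id")

# Step 3: run a third query on the joined output
result <- joined[joined$age > 35 & joined$status == "confirmed", , drop = FALSE]
result
```

Because the unspecified condition in step 1 behaves like TRUE, every row of d2 takes part in the join.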
query2(
  data1,
  data2,
  cond1,
  cond2,
  cols_base1,
  cols_base2,
  join_type,
  join_by,
  cond3,
  pivot_long = TRUE,
  pivot_var = "variable",
  pivot_val = "value",
  as_chr = TRUE
)
data1 |
Data frame to query (#1) |
data2 |
Data frame to query (#2) |
cond1 |
(Optional) Expression to evaluate with respect to data1 |
cond2 |
(Optional) Expression to evaluate with respect to data2 |
cols_base1 |
(Optional) Tidy-selection of other columns within data1 to include in the output |
cols_base2 |
(Optional) Tidy-selection of other columns within data2 to include in the output (only if the join type is "left" or "inner") |
join_type |
How to join the output from the two initial queries ("semi", "anti",
"left", or "inner"). Based on the dplyr joins. |
join_by |
A character vector of variables to join by. If the join key
columns have different names in data1 and data2, use a named vector, e.g. c("id" = "tc_id"). |
cond3 |
(Optional) Expression to evaluate with respect to the joined
output of the two initial queries. If missing, it will be set to TRUE. Note that if [...] |
pivot_long |
Logical indicating whether to pivot the variables
referenced within the query expression(s) to a long (i.e. stackable)
format, with default column names "variable1", "value1", "variable2",
"value2", ... Defaults to TRUE. |
pivot_var |
Prefix for pivoted variable column(s). Defaults to
"variable". Only used if pivot_long is TRUE. |
pivot_val |
Prefix for pivoted value column(s). Defaults to "value".
Only used if pivot_long is TRUE. |
as_chr |
Logical indicating whether to coerce the columns referenced in
the query expression(s) to character prior to returning. This enables
row-binding multiple queries with variables of different classes, but is
only important if pivot_long is TRUE. Defaults to TRUE. |
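To illustrate why coercing to character matters for stacking, here is a minimal base R sketch. The two data frames are hypothetical stand-ins for long-format query outputs, not actual output of query2(). Stricter row-binding functions such as dplyr::bind_rows() refuse to combine a numeric column with a character column, so the value column is coerced explicitly before stacking.

```r
# Hypothetical long-format outputs from two different queries
qa <- data.frame(id = 1:2, variable1 = "age", value1 = c(25, 40))
qb <- data.frame(id = 3:4, variable1 = "status",
                 value1 = c("confirmed", "probable"))

# value1 is numeric in qa but character in qb
class(qa$value1)
class(qb$value1)

# Coerce the value column to character so both outputs share one class
qa$value1 <- as.character(qa$value1)

# The outputs can now be stacked into a single data frame
stacked <- rbind(qa, qb)
stacked
```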
A data frame reflecting the rows of data1 that match the given query. Returned columns include:
- Columns matched by argument cols_base1
- Columns matched by argument cols_base2 (only if join type is "left" or "inner")
- Columns referenced within the relevant condition statements (pivoted to long form by default)
If the join type is a mutating join ("left" or "inner"), variables from data1 or data2 referenced in any of the condition statements (cond1, cond2, or cond3) will appear in the output. However, with a filtering join ("anti" or "semi") only variables from data1 will appear in the output.
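The difference between filtering and mutating joins can be sketched with base R equivalents. The toy data frames below are hypothetical, and %in% and merge() are used here in place of the dplyr joins that the function is based on.

```r
# Hypothetical toy data with differently named key columns
d1 <- data.frame(id = 1:3, site = c("A", "B", "C"))
d2 <- data.frame(tc_id = c(2L, 3L, 4L),
                 sll_status = c("confirmed", "probable", "confirmed"))

# Filtering joins: keep or drop rows of d1, never add columns from d2
semi <- d1[d1$id %in% d2$tc_id, , drop = FALSE]
anti <- d1[!d1$id %in% d2$tc_id, , drop = FALSE]

# Mutating joins: columns from d2 are added to the output
inner <- merge(d1, d2, by.x = "id", by.y = "tc_id")
left  <- merge(d1, d2, by.x = "id", by.y = "tc_id", all.x = TRUE)

names(semi)   # only columns of d1
names(inner)  # columns of d1 plus sll_status
```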
# example datasets: two related epidemiological linelists
data(ll)  # ll from treatment center (all cases, confirmed and non-confirmed)
data(sll) # summary linelist (only confirmed/probable cases)

# find patients in ll that don't appear in sll
query2(
  ll,
  sll,
  cols_base1 = c(id, site, status),
  join_type = "anti",
  join_by = c("id" = "tc_id")
)

# find patients with different outcome status in ll vs sll
query2(
  ll,
  sll,
  cols_base1 = id:site,
  join_type = "inner",
  join_by = c("id" = "tc_id"),
  cond3 = status != sll_status
)