phased_left_join | R Documentation |
We use three 'phases' to left-join lhs and rhs:
Exact match: an exact match on both the exact_by and phased_by variables.
Quasi-exact match: transforms the phased_by variables on lhs and rhs using quasi_fun, and then matches on the transformed variables. The default for quasi_fun is efun::normalize_text, which removes spaces, dots, commas and non-ASCII characters to avoid encoding issues.
Fuzzy match using a two-way 'contains' approach: this is powered by fuzzyjoin::fuzzy_left_join, using as matching function
match_fun = ~ stringr::str_detect(.x, .y) | stringr::str_detect(.y, .x)
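The two matching helpers described above can be sketched in base R. This is only an illustration under stated assumptions: normalize_sketch and two_way_contains are hypothetical names, normalize_sketch is a stand-in for efun::normalize_text built solely from the description here (the real function may do more), and the 'contains' check uses fixed-string matching, whereas stringr::str_detect interprets its pattern as a regular expression by default.

```r
# Hypothetical stand-in for efun::normalize_text, based only on its description:
# drop non-ASCII characters, then drop spaces, dots and commas.
normalize_sketch <- function(x) {
  x <- iconv(x, from = "UTF-8", to = "ASCII", sub = "")  # strip non-convertible bytes
  gsub("[ .,]", "", x)
}

# Two-way 'contains': TRUE when either string contains the other.
# Assumption: fixed-string matching instead of regex matching.
two_way_contains <- function(x, y) {
  mapply(
    function(a, b) grepl(b, a, fixed = TRUE) || grepl(a, b, fixed = TRUE),
    x, y,
    USE.NAMES = FALSE
  )
}

normalized <- normalize_sketch(c("A. B, C", "Acme Corp."))
contained  <- two_way_contains(c("acme corp", "acme", "foo"),
                               c("acme", "acme corp holdings", "bar"))
```

The second helper is symmetric by construction, so it does not matter which side of the join holds the longer string.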
phased_left_join(
  lhs,
  rhs,
  phased_by,
  exact_by = NULL,
  drop_join_vars = TRUE,
  quasi_fun = normalize_text,
  suffix = c(".x", ".y")
)
lhs |
A data.frame-like |
rhs |
A data.frame-like |
phased_by |
A character vector of variables to join by, using a named
vector to join by different variables from lhs and rhs. |
exact_by |
An optional character vector of variables to join by, always using an exact match. |
drop_join_vars |
Whether to drop the auxiliary variables created for the match (e.g. transformed variables). Defaults to TRUE; setting it to FALSE may be useful for debugging. |
quasi_fun |
The function to apply to the phased_by variables in the quasi-exact phase. Defaults to normalize_text. |
suffix |
If there are non-joined duplicate variables in lhs and rhs, these suffixes are added to the output to disambiguate them. |
We apply those phases in that order, and every phase works only on the unmatched rows from previous phases.
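The ordering rule can be illustrated with a minimal base R sketch. It does not reproduce the package's implementation: the data, the norm and contains_either helpers, and the phase column are hypothetical, the fuzzy phase is assumed to run on the normalized names, and only the first fuzzy hit is kept, whereas a real fuzzy left join can return several rows per match.

```r
lhs <- data.frame(
  name = c("Acme Corp", "A.C.M.E. Ltd", "Globex International", "Initech"),
  stringsAsFactors = FALSE
)
rhs <- data.frame(
  name = c("Acme Corp", "ACME Ltd", "Globex"),
  id = 1:3,
  stringsAsFactors = FALSE
)

# Hypothetical helpers (not the package's own functions).
norm <- function(x) gsub("[ .,]", "", x)
contains_either <- function(a, b) {
  grepl(b, a, fixed = TRUE) || grepl(a, b, fixed = TRUE)
}

# Phase 1: exact match on the raw names.
idx <- match(lhs$name, rhs$name)
phase <- ifelse(is.na(idx), NA_character_, "exact")

# Phase 2: quasi-exact match, only for rows still unmatched.
todo <- which(is.na(idx))
idx[todo] <- match(norm(lhs$name[todo]), norm(rhs$name))
phase[todo[!is.na(idx[todo])]] <- "quasi"

# Phase 3: two-way 'contains', again only for rows still unmatched.
rhs_norm <- norm(rhs$name)
for (i in which(is.na(idx))) {
  a_i <- norm(lhs$name[i])
  hit <- which(vapply(rhs_norm, function(b) contains_either(a_i, b),
                      logical(1), USE.NAMES = FALSE))
  if (length(hit) > 0) {
    idx[i] <- hit[[1]]  # keep the first hit only (simplification)
    phase[i] <- "fuzzy"
  }
}

# Left join: unmatched lhs rows keep NA in the rhs columns.
joined <- cbind(lhs, id = rhs$id[idx], phase = phase)
```

Because each phase only looks at rows where idx is still NA, a row matched exactly is never re-matched by a looser phase.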
A joined data frame.