check_joins | R Documentation |
Before and after using joins, it's important to check the row counts of the data sets. This function gives you the counts for the data you intend to join and the resulting counts for all forms of joins (left, right, inner, semi, anti, full).
An inner_join() only keeps observations from x that have a matching key in y.
The most important property of an inner join is that unmatched rows in either input are not included in the result. This means that inner joins are usually not appropriate for analysis, because it is too easy to lose observations.
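To make the risk of losing observations concrete, here is a small sketch using the band_members and band_instruments example tibbles that ship with dplyr. It calls dplyr directly rather than check_joins(), and the object names inner and dropped are illustrative only.

```r
library(dplyr)

# band_members and band_instruments are example tibbles shipped with dplyr
inner <- inner_join(band_members, band_instruments, by = "name")

nrow(band_members)      # rows in x
nrow(band_instruments)  # rows in y
nrow(inner)             # rows that survive the inner join

# Names present in x that no longer appear after the inner join
dropped <- band_members$name[!band_members$name %in% inner$name]
dropped
```

Comparing nrow(inner) with the input counts is the kind of check this function is meant to make routine.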
The three outer joins keep observations that appear in at least one of the data frames:
A left_join() keeps all observations in x.
A right_join() keeps all observations in y.
A full_join() keeps all observations in x and y.
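The difference between the three outer joins shows up in their row counts. The sketch below uses dplyr's bundled example tibbles and plain dplyr verbs; the object names left, right and full are illustrative only.

```r
library(dplyr)

left  <- left_join(band_members, band_instruments, by = "name")
right <- right_join(band_members, band_instruments, by = "name")
full  <- full_join(band_members, band_instruments, by = "name")

# Row counts of each outer join
c(left = nrow(left), right = nrow(right), full = nrow(full))

# Unmatched rows of x are kept, with NA in the columns that come from y
filter(left, is.na(plays))
```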
semi_join() returns all rows from x with a match in y.
anti_join() returns all rows from x without a match in y.
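The two filtering joins split x into matched and unmatched rows without adding any columns from y. A minimal sketch with dplyr's example tibbles follows; matched and unmatched are illustrative names.

```r
library(dplyr)

matched   <- semi_join(band_members, band_instruments, by = "name")
unmatched <- anti_join(band_members, band_instruments, by = "name")

# Filtering joins keep only the columns of x
names(matched)

# Every row of x lands in exactly one of the two results
nrow(matched) + nrow(unmatched) == nrow(band_members)
```

An anti join is a convenient way to list the rows that an inner join would drop.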
check_joins(x, y, by = NULL)
x, y |
A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details. |
by |
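A join specification created with join_by(), or a character vector of variables to join by. If NULL, the default, the join uses all variables in common across x and y. To join on different variables between x and y, use a join_by() specification. To join by multiple variables, use a join_by() specification with multiple expressions.
For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("name.x" = "name.y") matches x$name.x to y$name.y. |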
A tibble.
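library(dplyr)

# Join on a shared column, given as a character vector
check_joins(x = band_members,
            y = band_instruments,
            by = "name")

# The same join, specified with join_by()
check_joins(x = band_members,
            y = band_instruments,
            by = join_by(name))

# Add differently named copies of the key column to each table
band_members <- band_members |>
  mutate(name.x = name,
         name.a = name)
band_instruments <- band_instruments |>
  mutate(name.y = name,
         name.b = name)

# Join on columns that have different names in x and y
check_joins(x = band_members,
            y = band_instruments,
            by = c("name.x" = "name.y"))

# Join on several pairs of columns at once
check_joins(x = band_members,
            y = band_instruments,
            by = c("name.x" = "name.y",
                   "name.a" = "name.b"))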