check_join_conflicts: Compare Values of Non-Joined Duplicate Variables After...

View source: R/check_join_conflicts.R

Description

The data frame that results from joining two data frames with a dplyr::*_join function sometimes contains non-joined duplicate variables. For example, df1 and df2 may each have a variable named first_name. If the user does not include first_name as a join key in the dplyr::*_join function, the joined data frame will contain two first name variables, named first_name.x and first_name.y by default. Typically, the user expects the values of first_name.x and first_name.y to match, but that isn't always the case. The check_join_conflicts function checks for values that don't match.
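To make "values that don't match" concrete, the sketch below checks a single duplicated variable by hand. It is only an illustration under stated assumptions (dplyr and tibble installed; the small data frames and the name conflicts are made up here) and is not the implementation of check_join_conflicts:

```r
# Hypothetical illustration; not how check_join_conflicts is implemented.
df1 <- tibble::tibble(id = 1:3, first_name = c("john", "jane", "sally"))
df2 <- tibble::tibble(id = 1:3, first_name = c("jon", "jane", "salle"))

# first_name is not a join key, so the join keeps both copies
# and names them first_name.x and first_name.y.
joined <- dplyr::full_join(df1, df2, by = "id")

# Keep only the rows where the two copies disagree.
# Note: rows where either copy is NA are dropped by this comparison.
conflicts <- dplyr::filter(joined, first_name.x != first_name.y)
conflicts
```

A manual check like this has to be repeated for every duplicated variable, which is the bookkeeping that check_join_conflicts is meant to handle in one call.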

Usage

check_join_conflicts(.data, suffix = c(".x", ".y"),
  show_context = TRUE)

Arguments

.data

The joined data frame that results from a dplyr::*_join function.

suffix

The suffixes that disambiguate non-joined duplicate variables. The default is c(".x", ".y").

show_context

Show the other non-joined duplicate variables from the same row of the joined data frame.
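The default for suffix matches the default suffix of the dplyr join functions, so non-default suffixes typically originate in the join call itself. A minimal sketch, assuming dplyr and tibble are installed; the data frames are illustrative, and the .medstar and .aps suffixes are borrowed from the Examples section:

```r
# Illustrative data; only the suffix argument of the join matters here.
df1 <- tibble::tibble(id = 1:2, first_name = c("john", "jane"))
df2 <- tibble::tibble(id = 1:2, first_name = c("jon", "jane"))

# dplyr joins accept a suffix argument that replaces the default c(".x", ".y").
joined <- dplyr::full_join(df1, df2, by = "id", suffix = c(".medstar", ".aps"))

names(joined)
#> [1] "id"                 "first_name.medstar" "first_name.aps"
```

A data frame joined this way would then be passed to check_join_conflicts together with the corresponding suffix argument.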

Value

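A tibble. In the examples, it contains one row per conflicting value, giving the variable name, the row number in the joined data frame, and the two mismatched values. A variable with no conflicts appears as a single row of NA values.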

Examples

df1 <- tibble::tribble(
  ~id, ~first_name, ~gender,
  1,   "john",      "m",
  2,   "jane",      "f",
  3,   "sally",     "f"
)

df2 <- tibble::tribble(
  ~id, ~first_name, ~gender,
  1,   "jon",       "m",
  2,   "jane",      "f",
  3,   "salle",     "f"
)

df3 <- dplyr::full_join(df1, df2, by = "id")
df3

#>  A tibble: 3 x 5
#>     id first_name.x gender.x first_name.y gender.y
#>  <dbl> <chr>        <chr>    <chr>        <chr>
#>      1 john         m        jon          m
#>      2 jane         f        jane         f
#>      3 sally        f        salle        f

check_join_conflicts(df3)

#>  A tibble: 3 x 4
#>  variable     row .x    .y
#>  <chr>      <int> <chr> <chr>
#>  first_name     1 john  jon
#>  first_name     3 sally salle
#>  gender        NA NA    NA

# Example with different suffix names

df4 <- df3
names(df4) <- stringr::str_replace_all(names(df4), "\\.x", ".medstar")
names(df4) <- stringr::str_replace_all(names(df4), "\\.y", ".aps")

check_join_conflicts(df4, suffix = c("medstar", "aps"))
#>  A tibble: 3 x 4
#>  variable     row .medstar .aps
#>  <chr>      <int> <chr>    <chr>
#>  first_name     1 john     jon
#>  first_name     3 sally    salle
#>  gender        NA NA       NA

brad-cannell/my_functions documentation built on July 25, 2019, 4:29 p.m.