miss_combine_duplicate_vars    R Documentation

Combine duplicate variables from a join back into single variables

Description

Looks for pairs of variables whose names end in ".x" and ".y", as produced by a join, and combines each pair into a single variable. The pair is combined in such a way that no new missing data are introduced. Does not work for second-order duplicates from repeated joins, e.g. var.x.x.
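
The idea described above can be sketched in base R. This is only an illustration of one way to combine a ".x"/".y" pair, assuming the ".x" value is kept where present and the ".y" value fills the gaps; it is not the package's implementation, and the names joined, pairs and combined are made up for this sketch.

```r
# Hypothetical illustration only; not the package's implementation.
# A small data frame shaped like the result of a join with a duplicated "x".
joined = data.frame(
  id = 1:3,
  x.x = c(1, NA, NA),
  x.y = c(NA, 2, NA)
)

# Base names that occur with both a ".x" and a ".y" suffix
x_names = sub("\\.x$", "", grep("\\.x$", names(joined), value = TRUE))
y_names = sub("\\.y$", "", grep("\\.y$", names(joined), value = TRUE))
pairs = intersect(x_names, y_names)

combined = joined
for (v in pairs) {
  vx = joined[[paste0(v, ".x")]]
  vy = joined[[paste0(v, ".y")]]
  # Take the ".x" value where present, otherwise fall back to ".y"
  combined[[v]] = ifelse(is.na(vx), vy, vx)
  # Drop the two suffixed columns now that they are merged
  combined[[paste0(v, ".x")]] = NULL
  combined[[paste0(v, ".y")]] = NULL
}

combined
```

A row stays missing in the combined column only if it was missing in both members of the pair, so the combination itself adds no missing values.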

Usage

miss_combine_duplicate_vars(x, vars = NULL, priority = "x")

Arguments

x

A data frame

vars

A character vector of variable names to consider. If NULL, all variables are considered.

priority

Which variable of each ".x"/".y" pair to prioritize when combining. Default is "x".
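
A priority setting matters when both members of a pair hold a non-missing value in the same row. The sketch below shows one plausible reading of the argument, assuming the prioritized variable wins whenever it is not missing. The helper combine_pair_sketch is hypothetical and not part of the package, and accepting "y" as a value is an assumption of this sketch.

```r
# Hypothetical helper, not part of the package.
# Combines two vectors, preferring the one named by `priority`.
combine_pair_sketch = function(vx, vy, priority = "x") {
  if (priority == "x") {
    # Prefer vx; use vy only where vx is missing
    ifelse(is.na(vx), vy, vx)
  } else {
    # Prefer vy; use vx only where vy is missing
    ifelse(is.na(vy), vx, vy)
  }
}

vx = c(1, NA, 3)
vy = c(9, 2, NA)

# The two vectors disagree only in the first position (1 versus 9)
combine_pair_sketch(vx, vy, priority = "x")
combine_pair_sketch(vx, vy, priority = "y")
```

Under this reading, the choice of priority changes the result only in rows where both values are present and differ.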

Value

A data frame

Examples

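library(dplyr)       # tibble(), left_join() and %>%
library(kirkegaard)  # miss_combine_duplicate_vars()

# Three data frames that each hold a different non-missing value of x
d1 = tibble(
  id = 1:3,
  y = c(1, 2, 3),
  x = c(1, NA, NA)
)

d2 = tibble(
  id = 1:3,
  x = c(NA, 2, NA),
  z = c(1, 2, 3)
)

d3 = tibble(
  id = 1:3,
  x = c(NA, NA, 3),
  a = letters[1:3]
)

# Each join creates x.x and x.y; combine them back into x after every join
# so that no second-order duplicates (x.x.x) arise
d1 %>%
  left_join(d2, by = "id") %>%
  miss_combine_duplicate_vars() %>%
  left_join(d3, by = "id") %>%
  miss_combine_duplicate_vars()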

Deleetdk/kirkegaard documentation built on Feb. 28, 2025, 5:04 p.m.