assert_unique: Simple assertion for jointly unique variables

Description

check_unique checks whether the given variables are jointly unique and returns TRUE or FALSE, along with a helpful message. assert_unique stops with an error if any rows are duplicated on those variables.
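To make "jointly unique" concrete, here is a minimal base R sketch of the idea. The helper `is_jointly_unique` is hypothetical and is not how knightr implements `check_unique`; it only illustrates that a set of columns is jointly unique when no two rows share the same combination of values.

```r
# Hypothetical helper (not part of knightr): TRUE when no two rows share
# the same combination of values in the columns named in `cols`.
is_jointly_unique <- function(df, cols) {
  # anyDuplicated() returns 0 when there are no duplicate rows
  anyDuplicated(df[, cols, drop = FALSE]) == 0
}

# Month alone repeats in airquality, but Month and Day together do not.
is_jointly_unique(airquality, c("Month", "Day"))  # TRUE
is_jointly_unique(airquality, "Month")            # FALSE
```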

Usage

check_unique(df, ...)

assert_unique(df, ...)

Arguments

df

the input data.frame

...

unquoted variable names to check for joint uniqueness. Non-standard evaluation is supported, so names are passed bare, e.g. Month, Day.
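For readers unfamiliar with non-standard evaluation, the following sketch shows one way a function can accept bare column names through `...`. The function `unique_by` is a hypothetical illustration written in base R; knightr may capture its arguments differently.

```r
# Hypothetical illustration (not knightr's code): accept bare column names
# through ... and test them for joint uniqueness.
unique_by <- function(df, ...) {
  # substitute() keeps the arguments unevaluated; deparse() turns each
  # captured symbol into its name as a string.
  cols <- vapply(as.list(substitute(list(...)))[-1], deparse, character(1))
  anyDuplicated(df[, cols, drop = FALSE]) == 0
}

# Column names are written without quotes
unique_by(airquality, Month, Day)
```

The bare names are never evaluated as variables, which is why `Month` and `Day` do not need to exist outside the data.frame.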

See Also

get_dups, which lets you interactively extract the duplicate rows when an assertion fails
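The signature of get_dups is not shown on this page, so the sketch below does not call it. It uses a hypothetical base R helper, `dup_rows`, to show what extracting duplicates typically means: returning every row whose key occurs more than once. The sample data is made up for illustration.

```r
# Hypothetical helper (not knightr's get_dups): return all rows whose
# combination of values in `cols` appears more than once.
dup_rows <- function(df, cols) {
  key <- df[, cols, drop = FALSE]
  # duplicated() flags later repeats; fromLast = TRUE flags earlier ones,
  # so the union marks every member of a duplicated group.
  is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
  df[is_dup, , drop = FALSE]
}

# Made-up data in which article_id 2 appears twice
articles_dup <- data.frame(
  article_id = c(1, 2, 2, 3),
  title = c("a", "b", "c", "d")
)
dup_rows(articles_dup, "article_id")
```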

Examples

library(dplyr)
# Month and day should be jointly unique in the airquality dataset
airquality %>% check_unique(Month, Day)

# As part of an analysis pipeline
airquality %>%
    assert_unique(Month, Day) %>%
    summarise(ave_temp = mean(Temp))

# Example of message on false:
airquality %>% check_unique(Month)

# A very common workflow is:
# 1. Read data in and assert that it has the right unique IDs
# 2. Join and assert that the join result has the right unique IDs

reporters_str <- 'reporter_id,first_name,last_name\n1,"John","Smith"\n2,"Patty","Johnson"'
articles_str <- 'article_id,reporter_id,title\n1,1,"First article by John"\n2,1,"Second article by John"\n3,2,"First article by Patty"'

reporters <- readr::read_csv(reporters_str) %>% assert_unique(reporter_id)
articles <- readr::read_csv(articles_str) %>% assert_unique(article_id)

# Join reporter names into the articles data.frame,
# then check that you understood the relationship correctly and didn't create duplicates
articles_reporterinfo <- inner_join(articles, reporters, by = "reporter_id") %>%
    assert_unique(article_id)
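To see why the assertion after a join matters, the sketch below shows a join that silently creates duplicates. It uses base R `merge` and made-up data rather than knightr functions; the `reporters_bad` table is a hypothetical lookup in which `reporter_id` is not a unique key.

```r
# Made-up data: three articles, each with a unique article_id
articles_demo <- data.frame(
  article_id = 1:3,
  reporter_id = c(1, 1, 2)
)

# Hypothetical faulty lookup table: reporter_id 2 appears twice
reporters_bad <- data.frame(
  reporter_id = c(1, 2, 2),
  last_name = c("Smith", "Johnson", "Johnston")
)

joined <- merge(articles_demo, reporters_bad, by = "reporter_id")

nrow(joined)                          # 4 rows, although there are only 3 articles
anyDuplicated(joined$article_id) > 0  # TRUE: article 3 now appears twice
```

An assertion on `article_id` placed directly after the join would stop the pipeline here instead of letting the extra row flow into later summaries.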

rgknight/knightr documentation built on May 27, 2019, 7:22 a.m.