assert_no_duplicate: Assert that a dataframe has one row per patient
In EDCimport: Import Data from EDC Software

Description

Checks, in a pipeable style, that the column holding the patient ID contains no duplicates.
Mostly useful after joining two datasets, where a one-to-many match can silently repeat patient rows.

Usage

assert_no_duplicate(df, by = NULL, id_col = get_subjid_cols())

Arguments

`df`	a data frame
`by`	(optional) grouping columns; uniqueness of the patient ID is then checked within each group
`id_col`	the name(s) of the column(s) holding the patient ID

Value

The `df` dataset, unchanged, so that the pipeline can continue. An error is thrown if a duplicate is found.

Examples

## Not run: 
library(EDCimport)
library(dplyr) # provides tibble() and %>%

# Without duplicates: no error, the pipeline continues
tibble(subjid=c(1:10)) %>% assert_no_duplicate() %>% nrow()

# With duplicates (subjid 1 and 2 appear twice): throws an error
tibble(subjid=c(1:10, 1:2)) %>% assert_no_duplicate() %>% nrow()

# By groups: each patient has one row per visit and group
df <- tibble(subjid=rep(1:10, 4), visit=rep(c("V1", "V2"), 2, each=10),
             group=rep(c("A", "B"), each=20))
df %>% assert_no_duplicate() # error: each subjid appears 4 times
df %>% assert_no_duplicate(by=c(visit, group)) # no error: unique within each visit and group

## End(Not run)
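
The Description notes that the check is mostly useful after joining two datasets. The base R sketch below illustrates why: a one-to-many join can repeat a patient's row. The data frames `patients` and `visits` and the helper `check_one_row_per_id()` are hypothetical and exist only for illustration. The helper is a simplified stand-in for the idea of the check, not EDCimport's implementation.

```r
# Hypothetical data: one row per patient, and one row per patient visit
patients <- data.frame(subjid = c(1, 2, 3), arm = c("A", "B", "A"))
visits <- data.frame(subjid = c(1, 1, 2, 3), visit = c("V1", "V2", "V1", "V1"))

# One-to-many join: patient 1 has two visits, so its row is repeated
joined <- merge(patients, visits, by = "subjid")

# Hypothetical stand-in for the check (NOT EDCimport's implementation):
# stop if any ID occurs more than once, otherwise return the data unchanged
check_one_row_per_id <- function(df, id_col = "subjid") {
  ids <- df[[id_col]]
  dup_ids <- unique(ids[duplicated(ids)])
  if (length(dup_ids) > 0) {
    stop("Duplicated ", id_col, ": ", paste(dup_ids, collapse = ", "))
  }
  df
}

# Passes on the patient-level table and returns it unchanged
n_patients <- nrow(check_one_row_per_id(patients))

# Fails on the joined table; the error message is captured for inspection
result <- tryCatch(
  check_one_row_per_id(joined),
  error = function(e) conditionMessage(e)
)
```

With EDCimport loaded, the equivalent guard in a pipeline would be `joined %>% assert_no_duplicate()`, which, as in the examples above, throws an error when a patient ID is duplicated.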

EDCimport documentation built on April 4, 2025, 1:18 a.m.