display_duplicates: Display duplicate surveys data


View source: R/dailyLogicChecks.R

Description

This function displays duplicate surveys for a given unique identifier, which can be a single variable or a combination of variables. If the identifier is a combination of variables (one that jointly identifies each row of the dataset in df), the function shows the duplicate surveys for that joint identifier.
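To make the difference between a single and a joint identifier concrete, here is a minimal base R sketch. It does not call display_duplicates and is not the package's implementation. The data frame surveys, its columns, and the helper rows_sharing_key are all hypothetical names introduced for illustration.

```r
# Hypothetical survey data; column names are illustrative only.
surveys <- data.frame(
  ID   = c(1, 1, 2, 2, 3),
  Name = c("a", "b", "c", "c", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of econR): keep every row whose key
# value occurs more than once, for one or more key columns.
rows_sharing_key <- function(df, cols) {
  key <- df[, cols, drop = FALSE]
  # Flag later repeats and, via fromLast, the first occurrence as well.
  is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
  df[is_dup, , drop = FALSE]
}

single_key <- rows_sharing_key(surveys, "ID")             # ID alone repeats in rows 1-4
joint_key  <- rows_sharing_key(surveys, c("ID", "Name"))  # only rows 3-4 share ID and Name
```

With ID alone, four rows share a key value; with ID and Name together, only the two rows that are identical on both columns remain.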

Usage

display_duplicates(df, uniq_identifier_col)

Arguments

df

a dataset (tibble or data.frame) from which the columns in uniq_identifier_col are chosen.

uniq_identifier_col

a character vector of column name(s) that uniquely identify each row of the dataset. tidyselect helpers can also be used to select the columns; see the examples below.

Value

a tibble that displays the duplicates for the given unique identifiers in uniq_identifier_col.

Note

Suppose surveyor_id is the unique identifier column and, after running display_duplicates, the output contains 5 rows with surveyor_id = 122. The output tibble should then be interpreted as follows: 4 of the 5 rows with surveyor_id = 122 are duplicates, and 1 of the 5 is the original.
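The arithmetic in this note can be checked with a short base R sketch. It uses a hypothetical data frame (surveys and its response column are made up for illustration) and does not call display_duplicates itself.

```r
# Hypothetical data: five surveys recorded under surveyor_id 122, one under 130.
surveys <- data.frame(
  surveyor_id = c(122, 122, 122, 122, 122, 130),
  response    = c("a", "b", "c", "d", "e", "f"),
  stringsAsFactors = FALSE
)

# Rows that share the identifier value 122.
shown <- surveys[surveys$surveyor_id == 122, ]

n_rows     <- nrow(shown)                         # all rows with this identifier
n_extra    <- sum(duplicated(shown$surveyor_id))  # repeats after the first occurrence
n_original <- n_rows - n_extra                    # the single original row
```

Here n_rows is 5, n_extra is 4, and n_original is 1, matching the "4 of 5 are duplicates, 1 of 5 is original" reading.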

Examples

display_duplicates(df = dataObj, uniq_identifier_col = c("ID"))

display_duplicates(df = dataObj, uniq_identifier_col = c("ID", "Name"))

# tidyselect helper used to select columns
display_duplicates(df = dataObj,
                   uniq_identifier_col = tidyselect::contains("abc"))

AarshBatra/econR documentation built on Dec. 17, 2021, 6:45 a.m.