find_duplicates: Find duplicates / non-unique values in a variable

View source: R/standard_checks.R

Description

Find duplicates / non-unique values in a variable

Usage

find_duplicates(data, duplicate.column.name)

Arguments

data

a dataframe

duplicate.column.name

the name of the column in the dataframe to be checked for duplicates, as a string (in quotes)

Value

A dataframe with one row per potential issue. It has columns for the corresponding row index in the original data; the suspicious value; the name of the variable in the original dataset in which the suspicious value occurred; and a description of the issue type.
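
To illustrate the shape of such a result, here is a minimal base R sketch. It is not the package's implementation: the helper name sketch_find_duplicates, the output column names (index, value, variable, issue_type) and the issue label are assumptions made for illustration only. The sketch also assumes that only the second and later occurrences of a value are reported, which is how base R's duplicated() behaves; the package itself may report duplicates differently.

```r
# Hypothetical helper for illustration; not part of cleaninginspectoR.
sketch_find_duplicates <- function(data, duplicate.column.name) {
  # Extract the column by its name (passed as a string)
  values <- data[[duplicate.column.name]]
  # duplicated() marks the second and later occurrences of each value
  is_dup <- duplicated(values)
  # One row per flagged value; column names are assumed, not the package's
  data.frame(
    index = which(is_dup),
    value = values[is_dup],
    variable = rep(duplicate.column.name, sum(is_dup)),
    issue_type = rep("duplicate value", sum(is_dup)),
    stringsAsFactors = FALSE
  )
}

# A small dataset in which the ids 1 and 3 each appear twice
testdf <- data.frame(numeric_var = runif(10),
                     unique_ids = c(1, 2, 3, 4, 5, 6, 7, 8, 1, 3))
issues <- sketch_find_duplicates(testdf, "unique_ids")
print(issues)
```

In this sketch, rows 9 and 10 of testdf are flagged because their ids (1 and 3) already appear earlier in the column.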

Examples

# a test dataset with 10 rows; one numeric variable and one id variable
testdf <- data.frame(numeric_var = runif(10), unique_ids = c(1, 2, 3, 4, 5, 6, 7, 8, 1, 3))
# find duplicates in the unique_ids column:
find_duplicates(testdf, "unique_ids")

ellieallien/cleaninginspectoR documentation built on July 18, 2019, 12:30 p.m.