get_dupes: Get rows of a 'data.frame' with identical values for the...


get_dupes R Documentation

Get rows of a data.frame with identical values for the specified variables.

Description

Helps hunt for duplicate records during data cleaning. Specify the data.frame and the combination of variables to check for duplicates, and get back the duplicated rows.

Usage

get_dupes(dat, ...)

Arguments

dat

The input data.frame.

...

Unquoted variable names to search for duplicates. This takes a tidyselect specification.

Value

Returns a data.frame with the full records where the specified variables have duplicated values, as well as a variable dupe_count showing the number of rows sharing that combination of duplicated values. If the input data.frame was of class tbl_df, the output is as well.
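
To make the returned value concrete, here is a minimal sketch on a small toy data.frame. The data.frame df and its columns id and value are hypothetical and exist only for illustration; only get_dupes() and the dupe_count column come from the documentation above.

```r
library(janitor)

# Hypothetical toy data: id 1 and id 3 each appear twice, id 2 appears once
df <- data.frame(
  id = c(1, 2, 1, 3, 3),
  value = c("a", "b", "c", "d", "d")
)

# Search for duplicates on id only; value is carried along as part of the full record
dupes <- get_dupes(df, id)

# Only the rows whose id is repeated are kept, and dupe_count
# reports how many rows share each duplicated id
print(dupes)
```

In this sketch the row with id 2 should be dropped because its id is unique, and each remaining row should carry a dupe_count of 2.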

Examples

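library(janitor)
# dplyr is attached here because it exports both %>% and starts_with()
library(dplyr)

# Rows that share the same combination of mpg and hp:
get_dupes(mtcars, mpg, hp)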

# or called with the magrittr pipe %>% :
mtcars %>% get_dupes(wt)

# You can use tidyselect helpers to specify variables:
mtcars %>% get_dupes(-c(wt, qsec))
mtcars %>% get_dupes(starts_with("cy"))

janitor documentation built on Feb. 16, 2023, 10:16 p.m.