

get_dupes    R Documentation

Get rows of a data.frame with identical values for the specified variables.


Use this function to hunt for duplicate records during data cleaning. Specify the data.frame and the combination of variables to check, and get back the rows that are duplicated on those variables.


get_dupes(dat, ...)



dat    The input data.frame.


...    Unquoted variable names to search for duplicates. This takes a tidyselect specification.


Returns a data.frame with the full records where the specified variables have duplicated values, as well as a variable dupe_count showing the number of rows sharing that combination of duplicated values. If the input data.frame was of class tbl_df, the output is as well.
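To make the shape of the return value concrete, here is a minimal sketch on a small made-up data.frame. The `visits` data and its column names are hypothetical and used only for illustration; the only janitor function called is `get_dupes()` as documented above.

```r
library(janitor)

# Hypothetical toy data: the first two rows share the same (id, visit) combination.
visits <- data.frame(
  id    = c(1, 1, 2, 3),
  visit = c("a", "a", "a", "b"),
  score = c(10, 12, 9, 7)
)

# Search for rows that are duplicated on the combination of id and visit.
dupes <- get_dupes(visits, id, visit)

# The result keeps the full records (including score) for the duplicated
# combination and adds a dupe_count column giving how many rows share it.
print(dupes)
```

Rows whose `id` and `visit` combination appears only once are not returned, so the result can be inspected or filtered on `dupe_count` directly.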


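library(janitor)
library(dplyr)  # provides %>% and tidyselect helpers such as starts_with()

# Find rows of mtcars that share the same combination of mpg and hp:
get_dupes(mtcars, mpg, hp)

# Or call it with the magrittr pipe %>%:
mtcars %>% get_dupes(wt)

# You can use tidyselect helpers to specify variables:
mtcars %>% get_dupes(-c(wt, qsec))
mtcars %>% get_dupes(starts_with("cy"))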

janitor documentation built on Feb. 16, 2023, 10:16 p.m.