duplicates: Identify duplicates and provide a summary

View source: R/data_wrangling.R

duplicates    R Documentation

Identify duplicates and provide a summary

Description

This function counts the duplicate rows in a dataset, where duplicates are identified by the values in one or several columns.

Usage

duplicates(df, ..., output = "summary")

Arguments

df

Data frame or tibble object.

output

Character. One of 'summary', 'flag' or 'all'. 'summary' returns a data frame with the number of duplicates based on the columns provided. 'flag' returns the original df with two new columns: 'dup_flag', the number of times the row appears in the df, and 'rank_flag', the occurrence number of the row. 'all' returns a list with two data frames (see Value). Default is 'summary'.

cols

Column names used to identify duplicates (supplied through ... in the usage above).

Value

A data frame with the summary (output = 'summary') or the original data frame with the flag columns added (output = 'flag'). If output is 'all', the return value is a list with two data frames.
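
The two flag columns described above can be illustrated with a minimal base R sketch. It does not call duplicates() and is not the package's implementation. The toy data, the column names 'id' and 'city', and the shape of the summary table are assumptions made for illustration only.

```r
# Hypothetical toy data; 'id' and 'city' are illustrative column names
df <- data.frame(
  id   = c(1, 1, 2, 3, 3, 3),
  city = c("A", "A", "B", "C", "C", "C")
)

# Build one grouping key from the columns that define a duplicate
key <- interaction(df[c("id", "city")], drop = TRUE)

# Flag view: dup_flag = times the key occurs, rank_flag = occurrence index
flagged <- df
flagged$dup_flag  <- ave(seq_along(key), key, FUN = length)
flagged$rank_flag <- ave(seq_along(key), key, FUN = seq_along)

# Summary view (assumed shape): number of distinct keys per duplication count
first_rows  <- flagged[flagged$rank_flag == 1, ]
dup_summary <- as.data.frame(table(dup_flag = first_rows$dup_flag),
                             responseName = "n_keys")
```

Here 'flagged' keeps every original row and adds the two flag columns, while 'dup_summary' collapses the data to one row per duplication count. Rows with dup_flag greater than 1 are the duplicated ones, and filtering on rank_flag == 1 keeps the first occurrence of each.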


pablocal/pablo documentation built on June 14, 2024, 12:16 p.m.