findPartialDuplicates: Find Partially Duplicated Rows in a Data Frame
In KWB-R/kwb.utils: General Utility Functions Developed at KWB

findPartialDuplicates

R Documentation

Find Partially Duplicated Rows in a Data Frame

Description

Find rows in a data frame that are identical in the key columns but not identical in all columns.

Usage

findPartialDuplicates(data, key_columns, skip_columns = NULL)

Arguments

`data`	data frame
`key_columns`	names of columns in `data` in which to look for duplicated (combined) values
`skip_columns`	names of columns to be skipped when looking for duplicated rows

Value

NULL if no rows in data share identical values in the key_columns, or if every group of rows that shares identical values in the key_columns is also identical in all the other columns (except for those named in skip_columns). Otherwise a list is returned with one element per duplicated combination of values in the key columns. Each list element is a subset of data containing the rows that are identical in the key columns but differ in at least one of the other columns.
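
The return rule described above can be sketched in base R. This is a minimal illustration, not the kwb.utils source: the function name find_partial_duplicates_sketch is hypothetical, and it is an assumption that each list element holds the whole group of rows sharing a key and that the list is named by the key values. The actual function may structure or name its result differently.

```r
# Hypothetical re-implementation for illustration only (not the package code)
find_partial_duplicates_sketch <- function(data, key_columns, skip_columns = NULL) {

  # Columns taken into account when deciding whether rows are fully identical
  compare_columns <- setdiff(names(data), skip_columns)

  # One string per row, combining the values of all key columns
  keys <- do.call(paste, c(unname(as.list(data[key_columns])), sep = "\r"))

  # Row indices grouped by combined key
  groups <- split(seq_len(nrow(data)), keys)

  # A group is a "partial duplicate" if it has more than one row and the
  # rows are not all identical in the compared columns
  is_partial <- vapply(groups, function(rows) {
    length(rows) > 1L &&
      nrow(unique(data[rows, compare_columns, drop = FALSE])) > 1L
  }, logical(1))

  if (!any(is_partial)) {
    return(NULL)
  }

  # Assumption: return all rows of each affected group
  lapply(groups[is_partial], function(rows) data[rows, , drop = FALSE])
}

example_data <- data.frame(
  id = c(1, 2, 2, 3, 3, 3),
  value = c(1, 2, 3, 3, 3, 3.1)
)

result <- find_partial_duplicates_sketch(example_data, key_columns = "id")
```

In this sketch the key columns stay part of the comparison, which is harmless because they are identical within a group by construction. Passing skip_columns = "value" leaves nothing that can differ within a group, so the sketch returns NULL in that case.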

Examples

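# id 1 occurs once; id 2 occurs twice with different values; id 3 occurs
# three times, with two identical rows and one row that differs in "value"
findPartialDuplicates(key_columns = "id", data = rbind(
  data.frame(id = 1, value = 1),
  data.frame(id = 2, value = 2),
  data.frame(id = 2, value = 3),
  data.frame(id = 3, value = 3),
  data.frame(id = 3, value = 3),
  data.frame(id = 3, value = 3.1)
))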

KWB-R/kwb.utils documentation built on April 1, 2024, 7:12 a.m.