View source: R/find_and_remove_duplicates.R
find_duplicates | R Documentation |
Identify and return duplicated rows in a data frame or linelist.
find_duplicates(data, target_columns = NULL)
data |
A data frame or linelist. |
target_columns |
A vector of column names or indices to consider when
looking for duplicates. When the input data is a |
A data frame or linelist of all duplicated rows, with the following two additional columns:
row_id
: the indices of the duplicated rows in the input data.
Users can choose from these indices which rows they consider
redundant in each group of duplicates.
group_id
: a unique identifier associated with each group of
duplicates.
dups <- find_duplicates(
data = readRDS(system.file("extdata", "test_linelist.RDS",
package = "cleanepi")),
target_columns = c("dt_onset", "dt_report", "sex", "outcome")
)
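The meaning of the row_id and group_id columns can be illustrated with a small base R sketch. This is not the cleanepi implementation, only an approximation of the documented output; the toy data frame `df`, its column names, and the helper variables `key`, `is_dup`, and `redundant` are all assumptions made up for illustration, and the package may number its groups differently.

```r
# Toy data (hypothetical; not the package's test_linelist.RDS).
df <- data.frame(
  id      = 1:6,
  sex     = c("F", "M", "F", "M", "F", "M"),
  outcome = c("died", "recovered", "died", "recovered", "alive", "recovered"),
  stringsAsFactors = FALSE
)
target_columns <- c("sex", "outcome")

# Build one comparison key per row from the target columns only.
key <- do.call(paste, c(df[target_columns], sep = "\r"))

# A row belongs to a group of duplicates if its key occurs more than once.
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Mimic the documented output: duplicated rows plus row_id and group_id.
dups <- df[is_dup, , drop = FALSE]
dups$row_id   <- which(is_dup)                            # positions in the input
dups$group_id <- match(key[is_dup], unique(key[is_dup]))  # one id per group

# One possible choice: treat every row after the first in each group as
# redundant, then drop those rows from the input using their row_id.
redundant    <- dups$row_id[duplicated(dups$group_id)]
deduplicated <- df[setdiff(seq_len(nrow(df)), redundant), , drop = FALSE]
```

Keeping the first row of each group is only one option; because row_id points back to the input data, any other rule (for example, keeping the most complete record) can be applied by selecting a different set of indices.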