View source: R/epi_clean_compare_dup_rows.R
epi_clean_compare_dup_rows | R Documentation |
Compare two rows that may contain duplicated information. epi_clean_compare_dup_rows() uses compare::compare() to compare the possibly duplicated rows. compare::compare() can allow all transformations, sorting, etc. (see allowAll), so the comparison can be loose. This function is intended to make manual inspection easier; compare::compare() can miss differences, though, so care is needed.
epi_clean_compare_dup_rows(
df_dups = NULL,
val_id = "1",
col_id = "",
sub_index_1 = 1,
sub_index_2 = 2,
allowAll = TRUE,
...
)
df_dups |
a data frame with duplicated entries to compare |
val_id |
a value thought to be duplicated (e.g. a repeating row ID), passed as a string. grepl() is used to search for it as a literal string (fixed = TRUE), not as a regular expression |
col_id |
a string naming the ID column |
sub_index_1 |
default = 1 |
sub_index_2 |
default = 2 |
allowAll |
compare::compare option |
... |
pass any other options from compare::compare() |
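Because val_id is searched for with fixed = TRUE, it is treated as a literal substring and not as a whole value, so a short ID can also match longer IDs that contain it. A minimal base R sketch of that behaviour follows; the vector ids is made up for illustration and is not part of the package.

```r
# Hypothetical ID column; the values are illustrative only
ids <- c("1", "2", "12", "2", "20")

# Literal (non-regex) substring match, as grepl() does with fixed = TRUE
hits <- which(grepl("2", ids, fixed = TRUE))
hits

# Exact matching, for comparison
exact <- which(ids == "2")
exact
```

If the substring match returns more rows than expected, it may help to check the matched IDs by eye before comparing rows.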
Returns a list with the differing columns ('differing_cols'), their names ('col_names') and the duplicated indices ('duplicate_indices').
Antonio Berlanga-Taylor <https://github.com/AntonioJBT/episcout>
compare, grepl
## Not run:
# Data frame object with rows thought to have duplicated entries:
check_dups
# Specify the row ID to grep where duplicate values are expected:
val_id <- '2'
comp <- epi_clean_compare_dup_rows(check_dups, val_id, 'var_id', 1, 2)
comp
View(t(check_dups[comp$duplicate_indices, ]))
View(t(check_dups[comp$duplicate_indices, comp$differing_cols]))
## End(Not run)
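The example above assumes that check_dups already exists. The sketch below builds a small, made-up data frame and uses only base R to mimic the kind of check described here: locate the rows whose ID matches, then find the columns that differ. It does not call epi_clean_compare_dup_rows() or compare::compare() and is not the package's implementation; the data, the column names and the choice of the first and second matching rows are all assumptions for illustration.

```r
# Made-up data: rows 2 and 3 share var_id "2" but differ in one column
check_dups <- data.frame(
  var_id = c("1", "2", "2", "3"),
  age    = c(34, 51, 51, 47),
  city   = c("Leeds", "York", "Hull", "Bath"),
  stringsAsFactors = FALSE
)

# Rows whose ID contains the literal string "2"
dup_idx <- which(grepl("2", check_dups[["var_id"]], fixed = TRUE))

# Take the first and second matching rows
# (assumed here to be what sub_index_1 = 1 and sub_index_2 = 2 would select)
row_1 <- check_dups[dup_idx[1], ]
row_2 <- check_dups[dup_idx[2], ]

# Strict column-by-column comparison, with no transformations allowed
same <- mapply(identical, row_1, row_2)
differing <- names(same)[!same]
differing

# Side-by-side view of only the differing columns
t(check_dups[dup_idx, differing, drop = FALSE])
```

Unlike a loose comparison, identical() treats any difference in type or value as a mismatch, so this strict check can be a useful cross-check when a loose comparison reports no differences.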