data_unique | R Documentation |
From all rows with at least one duplicated ID, keep only one. The retained row is either the first duplicate, the last duplicate, or the "best" duplicate (default), defined as the duplicate with the smallest number of NA values. In case of ties, it picks the first duplicate, as it is the one most likely to be valid and authentic, given practice effects.
Unlike dplyr::distinct(), data_unique() keeps all columns.
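The "best" rule described above can be sketched in base R. This is only an illustration of the selection logic, not the package's implementation: `keep_best()` is a hypothetical helper, it assumes the ID column contains no missing values, and it returns rows in their original order, which may differ from what `data_unique()` returns.

```r
# Hypothetical helper illustrating the "best" rule; not datawizard's own code.
keep_best <- function(data, id) {
  # Number of missing values in each row
  n_missing <- rowSums(is.na(data))
  # Row indices grouped by ID, with groups in order of first appearance
  groups <- split(
    seq_len(nrow(data)),
    factor(data[[id]], levels = unique(data[[id]]))
  )
  # which.min() returns the first minimum, so ties go to the earliest row
  chosen <- vapply(
    groups,
    function(rows) rows[which.min(n_missing[rows])],
    integer(1)
  )
  # Keep every column, restoring the original row order
  data[sort(chosen), , drop = FALSE]
}

df <- data.frame(
  id = c(1, 2, 3, 1, 3),
  item1 = c(NA, 1, 1, 2, 3),
  item2 = c(NA, 1, 1, 2, 3)
)
result <- keep_best(df, "id")
```

For ID 1 the sketch keeps the second occurrence because the first one has missing items. For ID 3 both occurrences are complete, so the tie goes to the first.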
data_unique(
  data,
  select = NULL,
  keep = "best",
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE
)
data |
A data frame. |
select |
Variables that will be included when performing the required tasks. Can be either
If |
keep |
The method to be used for duplicate selection, either "best" (the default), "first", or "last". |
exclude |
See |
ignore_case |
Logical, if |
regex |
Logical, if |
verbose |
Toggle warnings. |
A data frame, containing only the chosen duplicates.
data_duplicated()
df1 <- data.frame(
  id = c(1, 2, 3, 1, 3),
  item1 = c(NA, 1, 1, 2, 3),
  item2 = c(NA, 1, 1, 2, 3),
  item3 = c(NA, 1, 1, 2, 3)
)
data_unique(df1, select = "id")
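For comparison, the idea behind the "first" and "last" options of `keep` can be approximated in base R with `duplicated()`. This is a rough equivalent under the assumption that duplicates are defined by the `id` column alone. It is not how `data_unique()` is implemented, and row order or row names may differ from the package's output.

```r
df1 <- data.frame(
  id = c(1, 2, 3, 1, 3),
  item1 = c(NA, 1, 1, 2, 3),
  item2 = c(NA, 1, 1, 2, 3),
  item3 = c(NA, 1, 1, 2, 3)
)

# Keep the first occurrence of each ID
first_rows <- df1[!duplicated(df1$id), , drop = FALSE]

# Keep the last occurrence of each ID
last_rows <- df1[!duplicated(df1$id, fromLast = TRUE), , drop = FALSE]
```

Neither variant looks at missing values, which is why the first occurrence of ID 1 is kept in `first_rows` even though all of its items are NA.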