rm_dup() finds all rows in a data frame which share the same entry for a target column and returns a data frame where only the first or last of each set of duplicates is retained.
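The difference between retaining the first and the last row of each set of duplicates can be illustrated with base R's duplicated(). This is only a sketch of the idea on made-up data (the data frame and its column names id and value are assumptions), not a call to rm_dup() itself:

```r
# Toy data: "id" plays the role of the target column (hypothetical example data).
df <- data.frame(id = c("a", "b", "a", "c", "b"),
                 value = 1:5,
                 stringsAsFactors = FALSE)

# Keep the first row of each set of duplicates in "id".
first_kept <- df[!duplicated(df$id), ]

# Keep the last row of each set of duplicates in "id".
last_kept <- df[!duplicated(df$id, fromLast = TRUE), ]
```

Here first_kept holds the rows with value 1, 2 and 4, while last_kept holds the rows with value 3, 4 and 5.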
df | data frame
ind_col | character value giving the name of the column to be searched for duplicate entries
keep_last | logical value indicating if the last, instead of the first, of each set of duplicates should be retained. Defaults to FALSE, i.e. to retaining the first of each set of duplicates.
rm_na | logical value. If set to TRUE, rows with NA in the specified column are removed.
rm_dup finds all rows in a data frame which share the same entry for a target column and returns a data frame where duplicates have been removed.
For each set of duplicates, in a first step, the row with the most non-missing (non-NA) values is retained. In a second step, if more than one row in a set of duplicates has non-NA values for all columns, either the first (keep_last=FALSE) or the last (keep_last=TRUE) of these rows is kept.
Rows with NA entries in the target column are left as they are (even if there are multiple NAs), unless rm_na is set to TRUE, in which case they are removed. If no duplicates are found, the data frame is returned as-is.
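The two-step selection described above can be sketched in base R roughly as follows. This is a hypothetical re-implementation for illustration, not the package source: the name rm_dup_sketch, the default rm_na = FALSE, and the choice to apply the first/last rule to any rows tied on the highest non-NA count (not only fully complete rows) are all assumptions.

```r
# Hypothetical sketch of the described behaviour; NOT the package's own code.
# Assumed: rm_na defaults to FALSE; ties on the highest non-NA count are
# broken by position (first or last), as for fully complete rows.
rm_dup_sketch <- function(df, ind_col, keep_last = FALSE, rm_na = FALSE) {
  key <- df[[ind_col]]

  # Optionally drop rows whose target column is NA.
  if (rm_na) {
    df <- df[!is.na(key), , drop = FALSE]
    key <- df[[ind_col]]
  }

  # If no non-NA entry occurs more than once, return the data as-is.
  if (!any(duplicated(key[!is.na(key)]))) return(df)

  # Number of non-NA values in each row.
  n_obs <- rowSums(!is.na(df))

  # Rows with NA in the target column are left as they are.
  keep <- is.na(key)

  for (k in unique(key[!is.na(key)])) {
    rows <- which(!is.na(key) & key == k)
    # Step 1: restrict to the row(s) with the most non-NA values.
    best <- rows[n_obs[rows] == max(n_obs[rows])]
    # Step 2: among remaining ties, keep the first or the last row.
    chosen <- if (keep_last) best[length(best)] else best[1]
    keep[chosen] <- TRUE
  }

  df[keep, , drop = FALSE]
}

# Made-up example data: "a" has one more complete row, "b" has a tie.
dat <- data.frame(id = c("a", "a", "b", "b", NA, NA),
                  x  = c(NA, 1, 2, 3, 4, 5),
                  y  = c(1, 1, 1, 1, NA, 1),
                  stringsAsFactors = FALSE)

res_first <- rm_dup_sketch(dat, "id")                    # keeps rows 2, 3, 5, 6
res_last  <- rm_dup_sketch(dat, "id", keep_last = TRUE)  # keeps rows 2, 4, 5, 6
res_no_na <- rm_dup_sketch(dat, "id", rm_na = TRUE)      # keeps rows 2, 3
```

For id "a" the second row wins in every call because it has no missing values; keep_last only changes the outcome for id "b", where both rows are complete.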