remove_duplicates: Remove duplicate entries from a data frame


View source: R/import_and_clean_data.R

Description

Calls the deduplicate function from synthesisr to flag and remove duplicate entries from a data frame.

Usage

remove_duplicates(df, field, method = c("string_osa", "fuzzdist", "exact"))

Arguments

df

the data frame to deduplicate

field

the name or index of the column to check for duplicate values

method

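the manner of duplicate detection; "exact" removes exact text duplicates, "string_osa" removes entries whose string distance to another entry falls below a similarity threshold, and "fuzzdist" uses fuzzy matching. The default is "string_osa", the first value listed in Usage.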

Value

a deduplicated data frame

Examples

library(litsearchr)
my_df <- data.frame(title = c("Picoides", "picoides", "Seiurus"), id = c("01", "02", "03"))
# Deduplicate on the "title" column using exact matching
remove_duplicates(my_df, "title", "exact")
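
To make the difference between exact and distance-based detection concrete, the sketch below reproduces the idea in base R only. It is an illustration under stated assumptions, not the code that litsearchr or synthesisr actually runs: duplicated() stands in for exact matching, adist() (Levenshtein distance) stands in for the string-distance methods, and max_dist is a hypothetical threshold chosen for this example.

```r
# Illustration only: base R stand-ins, not litsearchr or synthesisr internals.
my_df <- data.frame(
  title = c("Picoides", "picoides", "Seiurus"),
  id = c("01", "02", "03"),
  stringsAsFactors = FALSE
)

# Exact matching: keep the first row for each distinct title string.
# Comparison here is case-sensitive, so "Picoides" and "picoides" differ.
exact_dedup <- my_df[!duplicated(my_df$title), ]

# Distance-based matching: treat titles within `max_dist` edits of an
# earlier title as duplicates. `max_dist` is a hypothetical threshold.
max_dist <- 1
d <- adist(my_df$title)  # pairwise Levenshtein distances between titles

is_dup <- vapply(seq_len(nrow(my_df)), function(i) {
  # Row i is a duplicate if any earlier row is within the threshold.
  i > 1 && any(d[i, seq_len(i - 1)] <= max_dist)
}, logical(1))

fuzzy_dedup <- my_df[!is_dup, ]
```

In this sketch, exact matching on the raw strings keeps all three rows, while the distance-based pass drops "picoides" because it is one edit away from "Picoides". Whether remove_duplicates normalizes case before comparing is not stated on this page, so its output for the same data may differ.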

elizagrames/litsearchr documentation built on April 14, 2021, 3:42 p.m.