gsmolinski/dedupewider: Deduplication Across Multiple Columns

Duplicated data can appear in different rows and columns, and a user may need to treat observations (rows) connected by duplicated data as a single observation. For example, companies can belong to one family (and thus be treated as one company) because they share some telephone numbers. This package finds connected rows based on the data in chosen columns and collapses them into one row.
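To make the idea of "connected rows" concrete, the following is a minimal base R sketch that labels rows sharing a value in any of the chosen columns, including rows linked only indirectly through a third row. It does not use or reproduce the package's own functions: `find_connected_rows`, the `companies` data frame, and its column names are hypothetical and exist only for illustration.

```r
# Hypothetical helper (not part of dedupewider): give the same group id to
# rows that are connected by sharing a value in any of the chosen columns.
find_connected_rows <- function(df, cols) {
  n <- nrow(df)
  parent <- seq_len(n)  # each row starts as its own group

  # Follow parent links up to the root (representative) of a group.
  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  # Stack the chosen columns into (row, value) pairs and drop missing values.
  long <- data.frame(
    row = rep(seq_len(n), times = length(cols)),
    value = as.character(unlist(df[cols], use.names = FALSE)),
    stringsAsFactors = FALSE
  )
  long <- long[!is.na(long$value), ]

  # All rows holding the same value are merged into one group.
  for (rows in split(long$row, long$value)) {
    root_first <- find_root(rows[1])
    for (r in rows[-1]) {
      root_r <- find_root(r)
      if (root_r != root_first) parent[root_r] <- root_first
    }
  }

  # Resolve every row to the root of its group.
  vapply(seq_len(n), find_root, integer(1))
}

# Illustrative data: A and B share "222"; C and D share "333".
companies <- data.frame(
  name = c("A", "B", "C", "D"),
  tel_1 = c("111", "222", "333", "444"),
  tel_2 = c("222", NA, "555", "333"),
  stringsAsFactors = FALSE
)
companies$group <- find_connected_rows(companies, c("tel_1", "tel_2"))
```

Rows with the same `group` value could then be collapsed into one row. The sketch treats values as strings and ignores `NA`, which is an assumption of this example rather than a statement about how the package handles missing data.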

Package details

Maintainer
License: MIT + file LICENSE
Version: 0.1.1
URL: https://github.com/gsmolinski/dedupewider
Installation

Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("gsmolinski/dedupewider")
gsmolinski/dedupewider documentation built on April 17, 2025, 1:07 p.m.