Functions for merging two unlinked datasets. The central function of this package is "merge_plus", which extends base R merge functionality to include fuzzy string matching, match scoring based on the similarity of common variables between the two datasets, filtering based on a calculated match score or a user-supplied function, match evaluation (see match_evaluate), and safe merge checks. Other functions include:
- match_evaluate, which produces standard matching statistics, including percent matched and duplicate ratios,
- tier_match, a wrapper for merge_plus that lets you match two datasets in sequential tiers with gradually looser parameters,
- calculate_weights, which estimates how well a common variable identifies a match or a non-match, based on the record linkage literature,
- clean_strings, a general string cleaning function optimized for company names.
See "match_template.R" in the "examples" folder for a self-contained tutorial on the functionality of this package and a template for your own matching program.
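The workflow described above (clean the strings, score candidate pairs by similarity, filter on the score, then check the result) can be sketched in base R. This is an illustrative sketch only, not the package's implementation: it does not call merge_plus or clean_strings, and the data, the helper `simple_clean`, the similarity formula, and the 0.8 cutoff are all assumptions made for the example.

```r
# Two small, made-up datasets with no shared key (illustrative data only).
left <- data.frame(
  id_1 = 1:3,
  name = c("Acme Corp", "Globex Inc", "Initech LLC"),
  stringsAsFactors = FALSE
)
right <- data.frame(
  id_2 = 101:103,
  name = c("ACME Corporation", "Globexx Incorporated", "Umbrella Co"),
  stringsAsFactors = FALSE
)

# Hypothetical cleaning helper (NOT the package's clean_strings):
# lower-case, drop punctuation, and strip a few common company suffixes.
simple_clean <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", " ", x)
  x <- gsub("\\b(corp|corporation|inc|incorporated|llc|co)\\b", "", x, perl = TRUE)
  trimws(gsub("\\s+", " ", x))
}

clean_l <- simple_clean(left$name)
clean_r <- simple_clean(right$name)

# Score every candidate pair: 1 minus edit distance over the longer length.
dist_mat <- utils::adist(clean_l, clean_r)
max_len  <- outer(nchar(clean_l), nchar(clean_r), pmax)
sim      <- 1 - dist_mat / max_len

# Keep pairs whose score clears an assumed cutoff of 0.8.
pairs <- expand.grid(i = seq_along(clean_l), j = seq_along(clean_r))
pairs$score <- sim[cbind(pairs$i, pairs$j)]
keep <- pairs[pairs$score >= 0.8, ]

result <- data.frame(
  id_1  = left$id_1[keep$i],
  id_2  = right$id_2[keep$j],
  score = keep$score
)

# A simple safety check: no record on either side is matched twice.
stopifnot(!anyDuplicated(result$id_1), !anyDuplicated(result$id_2))

# A basic evaluation statistic: share of left-hand records that found a match.
pct_matched <- mean(left$id_1 %in% result$id_1)

print(result)
print(pct_matched)
```

Loosening the cutoff in a second pass over the still-unmatched records would give a rough analogue of matching in tiers.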
|License|GNU General Public License|
|Package repository|GitHub|