generate_matches: Generate candidate matches (using fuzzy blocked postal codes)

View source: R/generate_matches.R

Description

Given blocks of postal codes, join two datasets of firm names and addresses and return likely candidate matches.

Usage

generate_matches(tbl_x, tbl_y, name_var = "name", address_var = "address",
  block = NULL)

Arguments

tbl_x

A tbl of firm names, addresses and postal codes; for now, must have variables 'name', 'address', 'postal_code'

tbl_y

A tbl of firm names, addresses and postal codes; for now, must have variables 'name', 'address', 'postal_code'

block

A tbl of (postal code, postal code) blocked pairs on which to merge tbl_x and tbl_y. If not supplied (the default, NULL), block is computed with fuzzy_block() in the first few lines of the function, using only the postal codes in tbl_x and tbl_y.
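
To make the role of block concrete, here is a minimal base-R sketch of a blocked join. It is an illustration only, not the package's implementation: the toy data, the block column names postal_code_x and postal_code_y, and the use of merge() are all assumptions.

```r
# Hypothetical toy inputs with the three required variables.
tbl_x <- data.frame(
  name = c("ACME LTD", "BOLT INC"),
  address = c("1 MAIN ST", "22 OAK AVE"),
  postal_code = c("K1A0B1", "M5V2T6"),
  stringsAsFactors = FALSE
)
tbl_y <- data.frame(
  name = c("ACME LIMITED", "ZED CORP"),
  address = c("1 MAIN STREET", "9 ELM RD"),
  postal_code = c("K1A0B2", "V6B1A1"),
  stringsAsFactors = FALSE
)

# Hypothetical block: pairs of postal codes judged close enough to compare.
# The column names are assumptions, not the format fuzzy_block() returns.
block <- data.frame(
  postal_code_x = "K1A0B1",
  postal_code_y = "K1A0B2",
  stringsAsFactors = FALSE
)

# Attach the blocked partner code to each row of tbl_x ...
left <- merge(tbl_x, block, by.x = "postal_code", by.y = "postal_code_x")

# ... then pull in the rows of tbl_y whose postal code is that partner.
candidates <- merge(left, tbl_y,
                    by.x = "postal_code_y", by.y = "postal_code",
                    suffixes = c("_x", "_y"))

print(candidates[, c("name_x", "name_y")])
```

Only firms whose postal codes appear together in block are compared, which keeps the number of candidate pairs far below the full cross product of tbl_x and tbl_y.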

Value

A tbl of (firm, firm) candidate matches with stringdist measures
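
As a rough illustration of what a distance measure on candidate pairs looks like, the sketch below scores hypothetical name pairs with base R's utils::adist() (Levenshtein distance). This is a stand-in chosen so the example needs no extra packages; the column names name_x, name_y and name_dist are assumptions, and the measures actually returned by generate_matches() may differ.

```r
# Hypothetical candidate pairs, as might result from a blocked join.
candidates <- data.frame(
  name_x = c("ACME LTD", "BOLT INC"),
  name_y = c("ACME LIMITED", "ZED CORP"),
  stringsAsFactors = FALSE
)

# Row-wise Levenshtein distance between the two name columns.
# utils::adist() is used here as a stand-in for a stringdist measure.
candidates$name_dist <- mapply(
  function(a, b) as.integer(utils::adist(a, b)),
  candidates$name_x, candidates$name_y,
  USE.NAMES = FALSE
)

# Smaller distances indicate more similar names, so sort ascending.
ranked <- candidates[order(candidates$name_dist), ]
print(ranked)
```

Sorting or filtering on such a measure is one way to move from candidate matches to likely matches.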

Examples

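library(dplyr)       # %>%, select(), mutate()
library(matchtools)  # assumed to provide standardize() and fix_unit_names()

# br is assumed to be a tbl with name, address and postal_code columns
brx <- br %>%
        select(name, address, postal_code) %>%
        mutate(name = standardize(name, dictionary = company_dictionary),
               address = address %>% standardize(dictionary = address_dictionary) %>% fix_unit_names())

# Match the standardized data against itself; block is computed internally
generate_matches(brx, brx)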

tweed1e/matchtools documentation built on May 29, 2019, 10:51 a.m.