match_name: Match company names


Description

Match company names between two tables, using the matching type given in .type.

Usage

match_name(
  .tab0,
  .tab1,
  .col_match,
  .type = c("full", "sub", "approx"),
  .min_char = 0.25,
  .max_dist = 0.1,
  .method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw",
    "soundex"),
  .workers = 1,
  .progress = FALSE
)

Arguments

.tab0

Company Table

.tab1

Matching Table

.col_match

Column name used for matching

.type

Matching type, one of "full", "sub", or "approx"

.min_char

only used if .type == "sub"

.max_dist

Maximum string distance allowed for a match (only used if .type == "approx")

.method

String distance method (only used if .type == "approx")

.workers

Number of parallel workers (only used for .type == "sub")

.progress

Show progress bar?
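
As a rough illustration of what a distance threshold such as .max_dist can mean, the sketch below compares two small name vectors with base R's utils::adist (plain Levenshtein distance) and keeps pairs under a cutoff. It does not use match_name and is not the package's implementation: the toy names, the variables names0, names1 and max_dist, and the choice to normalise by the longer string are all assumptions made for this example.

```r
# Hypothetical toy data; these names are not from the package's test tables.
names0 <- c("acme holding", "globex", "initech")
names1 <- c("acme holdings", "globex", "umbrella")

# Levenshtein distance for every pair (rows: names0, columns: names1).
dist_raw <- utils::adist(names0, names1)

# Assumption: divide by the length of the longer string so that
# distances fall between 0 and 1.
len_max <- outer(nchar(names0), nchar(names1), pmax)
dist_norm <- dist_raw / len_max

# Keep the pairs whose normalised distance does not exceed the cutoff.
max_dist <- 0.1
hits <- which(dist_norm <= max_dist, arr.ind = TRUE)

matches <- data.frame(
  name0 = names0[hits[, 1]],
  name1 = names1[hits[, 2]],
  dist  = dist_norm[hits]
)
matches
```

With a cutoff of 0.1, "acme holding" and "acme holdings" still pair up (one edit over 13 characters), while unrelated names do not. Whether match_name normalises distances in this way is not stated on this page.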

Value

A data frame

Examples

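library(RFirmMatch)
library(magrittr)  # provides the %>% pipe used below

# Locate the example workbook shipped with the package
.path <- system.file("extdata", "test_tables.xlsx", package = "RFirmMatch")

# First sheet: read, prepare, and extract legal forms
.tab1 <- openxlsx::read.xlsx(.path, 1)
.tab1 <- .tab1 %>% prepare_tables() %>% extract_legal_form(make_legal_form_table())

# Second sheet: same preparation steps
.tab2 <- openxlsx::read.xlsx(.path, 2)
.tab2 <- .tab2 %>% prepare_tables() %>% extract_legal_form(make_legal_form_table())

# Match on the name_clean column with each .type in turn
match_name(.tab1, .tab2, name_clean, "full")
match_name(.tab1, .tab2, name_clean, "sub")
match_name(.tab1, .tab2, name_clean, "approx")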


## DEBUG
.col_match <- quote(name_clean)
.min_char = 0.8
.max_dist = .1
.method = "osa"
.workers = 1

MatthiasUckert/RFirmMatch documentation built on Dec. 17, 2021, 3:18 a.m.