View source: R/pair_blocking.R
| pair_blocking | R Documentation |
Generates all combinations of records from x and y where the
blocking variables are equal.
pair_blocking(x, y, on, deduplication = FALSE, add_xy = TRUE)
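x | first data.frame |
y | second data.frame |
on | the variables defining the blocks or strata for which all pairs of x and y will be generated |
deduplication | generate pairs from only x (y is ignored) |
add_xy | add x and y as attributes to the returned pairs |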
Generating (all) pairs of the records of two data sets is usually the first step when linking the two data sets. However, this often results in too many pairs to process. Therefore, blocking is usually applied: only pairs of records that agree on the blocking variables are generated.
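The effect of blocking on the number of candidate pairs can be sketched in base R. This is only an illustration of the idea, not the implementation used by pair_blocking; the data frames, their values, and the helper names (all_pairs, blocked_pairs, xi, yi) are made up for this example.

```r
# Two small hypothetical data sets (values invented for illustration)
x <- data.frame(name = c("Ann", "Bob", "Cas", "Dee"),
                postcode = c("1234", "1234", "5678", "9999"),
                stringsAsFactors = FALSE)
y <- data.frame(name = c("Anne", "Kas", "Eve"),
                postcode = c("1234", "5678", "5678"),
                stringsAsFactors = FALSE)

# Without blocking: every record of x is paired with every record of y
all_pairs <- expand.grid(.x = seq_len(nrow(x)), .y = seq_len(nrow(y)))

# With blocking on postcode: keep only pairs whose postcodes are equal.
# Row numbers are attached first so they survive the merge.
xi <- data.frame(.x = seq_len(nrow(x)), postcode = x$postcode)
yi <- data.frame(.y = seq_len(nrow(y)), postcode = y$postcode)
blocked_pairs <- merge(xi, yi, by = "postcode")[, c(".x", ".y")]

nrow(all_pairs)      # 4 * 3 = 12 candidate pairs
nrow(blocked_pairs)  # 2 * 1 + 1 * 2 = 4 candidate pairs
```

The record with postcode "9999" has no partner in y and therefore drops out entirely, which is the trade-off of blocking: fewer pairs to compare, but true matches that disagree on a blocking variable are never considered.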
A data.table with two columns,
.x and .y, is returned. Columns .x and .y are
row numbers from data.frames x and y respectively.
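Because .x and .y hold row numbers, the original records can be looked up by ordinary indexing. The sketch below uses a hand-written data.frame as a stand-in for the returned pairs; the data and the column names x_id and y_id are hypothetical.

```r
# Hypothetical input data
x <- data.frame(id = c("a", "b", "c"), stringsAsFactors = FALSE)
y <- data.frame(id = c("p", "q"), stringsAsFactors = FALSE)

# Hand-written stand-in for a pairs table with row-number columns
pairs <- data.frame(.x = c(1L, 3L), .y = c(2L, 2L))

# Look up the records behind each pair by row number
res <- cbind(pairs, x_id = x$id[pairs$.x], y_id = y$id[pairs$.y])
res
```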
pair and pair_minsim are other methods
to generate pairs.
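library(reclin2)  # assumed: the package that provides this signature and the example data
data("linkexample1", "linkexample2")
# Generate only those pairs of records that share the same postcode
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")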