View source: R/pair_blocking.R
pair_blocking | R Documentation |
Generates all combinations of records from x and y where the blocking variables are equal.
pair_blocking(x, y, on, deduplication = FALSE, add_xy = TRUE)
x |
first |
y |
second |
on |
the variables defining the blocks or strata for which all pairs of |
deduplication |
generate pairs from only |
add_xy |
add |
Generating (all) pairs of the records of two data sets is usually the first step when linking the two data sets. However, this often results in too large a number of pairs. Therefore, blocking is usually applied: only pairs of records that agree on the blocking variables are generated.
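The effect of blocking can be illustrated with a minimal base R sketch. This is not the package's implementation; the data frames and the `postcode` values are made up for illustration, and only base R functions are used.

```r
# Two small hypothetical data sets (not part of the package).
x <- data.frame(id = 1:4, postcode = c("A", "A", "B", "C"),
                stringsAsFactors = FALSE)
y <- data.frame(id = 1:3, postcode = c("A", "B", "B"),
                stringsAsFactors = FALSE)

# Without blocking: every record of x is paired with every record of y.
all_pairs <- expand.grid(.x = seq_len(nrow(x)), .y = seq_len(nrow(y)))
nrow(all_pairs)  # 4 * 3 = 12 pairs

# With blocking on postcode: keep only pairs whose postcode is equal.
same_block <- x$postcode[all_pairs$.x] == y$postcode[all_pairs$.y]
blocked <- all_pairs[same_block, ]
nrow(blocked)    # 4 pairs
```

The full cross product grows with the product of the two row counts, whereas the blocked set only grows with the sizes of the matching blocks. A real implementation would avoid building the cross product first, which this sketch does only for clarity.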
A data.table with two columns, .x and .y, is returned. Columns .x and .y are row numbers from data.frames x and y respectively.
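Because .x and .y are row numbers, they can be used to look up the paired records. The sketch below uses a hand-written stand-in for the returned pairs (a plain data.frame rather than the data.table the function returns) and made-up records, so that it runs with base R only.

```r
# Hypothetical input data sets.
x <- data.frame(name = c("anna", "bob", "carl"),
                postcode = c("A", "A", "B"),
                stringsAsFactors = FALSE)
y <- data.frame(name = c("ana", "karl"),
                postcode = c("A", "B"),
                stringsAsFactors = FALSE)

# Hand-written stand-in for the pairs: row numbers into x and y.
pairs <- data.frame(.x = c(1L, 2L, 3L), .y = c(1L, 1L, 2L))

# Look up the records behind each pair by indexing with the row numbers.
side_by_side <- data.frame(
  x_name   = x$name[pairs$.x],
  y_name   = y$name[pairs$.y],
  postcode = x$postcode[pairs$.x],
  stringsAsFactors = FALSE
)
side_by_side
```

Each row of `side_by_side` shows one candidate pair, which makes it easy to inspect which records ended up in the same block.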
pair and pair_minsim are other methods to generate pairs.
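library(reclin2)  # assumed package providing pair_blocking and the example data
data("linkexample1", "linkexample2")
# Generate only the pairs whose postcode is equal in both data sets
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")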