View source: R/select_n_to_m.R
Select matching pairs enforcing one-to-one linkage
pairs |
a |
threshold |
the threshold to apply. Pairs with a score above the threshold are selected. |
weight |
name of the score/weight variable of the pairs. When not given
and |
var |
the name of the new variable to create in pairs. This will be a
logical variable with a value of |
preselect |
a logical variable with the same length as |
id_x |
an integer vector with the same length as the number of rows in
|
id_y |
an integer vector with the same length as the number of rows in
|
... |
passed on to other methods. |
n |
the number of records from |
m |
the number of records from |
Both methods force one-to-one matching. select_greedy uses a greedy algorithm that selects the first pair with the highest weight. select_n_to_m tries to optimise the total weight of all of the selected pairs. In general this will result in a better selection. However, select_n_to_m uses much more memory and is much slower and, therefore, can only be used when the number of possible pairs is not too large.
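
The difference between the two strategies can be illustrated with a small base R sketch. It does not use the package itself: greedy_select is a hypothetical helper written for this illustration, and the toy data and the hand-picked alternative selection are assumptions. It only shows why picking the highest-weight pair first can give a lower total weight than another one-to-one selection.

```r
# Toy candidate pairs: two records in x, two in y, with a score per pair.
pairs <- data.frame(
  x = c(1, 1, 2, 2),
  y = c(1, 2, 1, 2),
  weight = c(10, 9, 9, 1)
)

# Hypothetical helper (not part of the package): greedy one-to-one selection.
greedy_select <- function(pairs, threshold) {
  selected <- logical(nrow(pairs))
  used_x <- c()
  used_y <- c()
  # Visit pairs from highest to lowest weight.
  for (i in order(pairs$weight, decreasing = TRUE)) {
    # Keep a pair only if it is above the threshold and neither
    # of its records has already been matched.
    if (pairs$weight[i] > threshold &&
        !(pairs$x[i] %in% used_x) && !(pairs$y[i] %in% used_y)) {
      selected[i] <- TRUE
      used_x <- c(used_x, pairs$x[i])
      used_y <- c(used_y, pairs$y[i])
    }
  }
  selected
}

greedy <- greedy_select(pairs, threshold = 0)
greedy_total <- sum(pairs$weight[greedy])

# Another valid one-to-one selection for this toy data: (1,2) and (2,1).
alternative <- c(FALSE, TRUE, TRUE, FALSE)
alternative_total <- sum(pairs$weight[alternative])

print(c(greedy = greedy_total, alternative = alternative_total))
```

In this toy case the greedy pass takes the pair with weight 10 first, which leaves only the pair with weight 1 for the remaining records, while the alternative selection keeps both pairs with weight 9. A method that optimises the total weight would be expected to prefer the latter.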
Returns the pairs with the variable given by var added. This is a logical variable indicating which pairs are selected as matches.
data("linkexample1", "linkexample2")
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")
pairs <- compare_pairs(pairs, c("lastname", "firstname", "address", "sex"))
pairs <- score_simsum(pairs)
# Select pairs with a simsum > 0 and force one-to-one linkage
pairs <- select_n_to_m(pairs, 0, var = "ntom")
pairs <- select_greedy(pairs, 0, var = "greedy")
table(pairs[c("ntom", "greedy")])