sub_criteria | R Documentation |
Match criteria for record linkage with links and episodes
sub_criteria(
  ...,
  match_funcs = c(exact = diyar::exact_match),
  equal_funcs = c(exact = diyar::exact_match),
  operator = "or"
)
attrs(..., .obj = NULL)
eval_sub_criteria(x, ...)
## S3 method for class 'sub_criteria'
print(x, ...)
## S3 method for class 'sub_criteria'
format(x, show_levels = FALSE, ...)
## S3 method for class 'sub_criteria'
eval_sub_criteria(
  x,
  x_pos = seq_len(max(attr_eval(x))),
  y_pos = rep(1L, length(x_pos)),
  check_duplicates = TRUE,
  depth = 0,
  ...
)
... |
Arguments passed to methods for |
match_funcs |
|
equal_funcs |
|
operator |
|
.obj |
|
x |
|
show_levels |
|
x_pos |
|
y_pos |
|
check_duplicates |
|
depth |
|
sub_criteria() - Create match criteria as a sub_criteria object.
A sub_criteria object contains the attributes to be compared, logical tests for the comparisons (see predefined_tests for examples) and another set of logical tests to determine identical records.
attrs() - Create a d_attribute object, a collection of atomic objects that can be passed to sub_criteria() as a single attribute.
eval_sub_criteria() - Evaluate a sub_criteria object.
At each iteration of links or episodes, record-pairs are created from each attribute of a sub_criteria object. eval_sub_criteria() evaluates each record-pair using the match_funcs and equal_funcs functions of the sub_criteria object.
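The pairing described above can be sketched in base R. This is only an illustration of the idea and not diyar's internal implementation: the `within_3` test and the variable names are assumptions made for the example. Every record (`x`) is paired with record 1 (`y`), mirroring the default `x_pos` and `y_pos` shown in the usage.

```r
# Base R illustration only; this is not diyar's internal code.
attr_1 <- c(30, 28, 40, 25, 25, 29, 27)

# Default-style positions: every record (x) is paired with record 1 (y).
x_pos <- seq_along(attr_1)
y_pos <- rep(1L, length(x_pos))

# Hypothetical logical test: values within 3 units of each other match.
within_3 <- function(x, y) abs(x - y) <= 3

# One logical result per record-pair.
pair_result <- within_3(x = attr_1[x_pos], y = attr_1[y_pos])
pair_result
```

Changing `y_pos` to `rep(2L, length(x_pos))` would instead pair every record with record 2.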
See predefined_tests for examples of match_funcs and equal_funcs.
User-defined functions are also permitted as match_funcs and equal_funcs.
Each such function must meet three requirements:
1. It must be able to compare the attributes.
2. It must have two arguments named `x` and `y`, where `y` is the value for one observation being compared against all other observations (`x`).
3. It must return a logical object, i.e. TRUE or FALSE.
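As a minimal sketch of a function that meets these three requirements, the hypothetical `ci_match()` below performs a case-insensitive comparison of character values. The function name and the sample data are assumptions for illustration; the function is called directly in base R, so the example runs without diyar.

```r
# Hypothetical user-defined test: case-insensitive equality.
# 1. It can compare the attribute (character values).
# 2. It has two arguments named `x` and `y`.
# 3. It returns a logical vector.
ci_match <- function(x, y) {
  toupper(x) == toupper(y)
}

sex <- c("M", "f", "U", "m", "F")

# Compare every value (x) against the first value (y),
# repeated so that both arguments have the same length.
ci_result <- ci_match(x = sex, y = rep(sex[1], length(sex)))
ci_result
```

By analogy with `match_funcs = range_match` in the examples, such a function could presumably be supplied as `match_funcs = ci_match` when calling sub_criteria().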
attrs() is useful when the match criteria require an interaction between multiple attributes, for example, attribute 1 + attribute 2 > attribute 3.
Every attribute, including those in attrs(), must have the same length or a length of 1.
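The interaction "attribute 1 + attribute 2 > attribute 3" might be written as a single logical test along the following lines. This is a hedged sketch in base R: plain named lists stand in for what the function would receive from attrs(), and the names `a1`, `a2`, `a3` and `sum_gt` are hypothetical. Which side of the comparison comes from `x` and which from `y` is also an assumption made for the example.

```r
# Plain named lists stand in for a d_attribute; names are hypothetical.
x <- list(a1 = c(5, 10, 2),
          a2 = c(1, 1, 1),
          a3 = c(4, 20, 3))

# y: the values of record 1, repeated to the length of x.
y <- lapply(x, function(v) rep(v[1], length(v)))

# Hypothetical test combining three attributes in one comparison.
sum_gt <- function(x, y) (x$a1 + x$a2) > y$a3

interaction_result <- sum_gt(x, y)
interaction_result
```

All three vectors have the same length here, in line with the length rule stated above.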
sub_criteria
predefined_tests; links; episodes; eval_sub_criteria
# Attributes
attr_1 <- c(30, 28, 40, 25, 25, 29, 27)
attr_2 <- c("M", "F", "U", "M", "F", "U", "M")
# A match criteria
## Example 1 - A maximum difference of 10 in attribute 1
s_cri1 <- sub_criteria(attr_1, match_funcs = range_match)
s_cri1
# Evaluate the match criteria
## Compare the first element of 'attr_1' against all other elements
eval_sub_criteria(s_cri1)
## Compare the second element of 'attr_1' against all other elements
x_pos_val <- seq_len(max(attr_eval(s_cri1)))
eval_sub_criteria(s_cri1,
                  x_pos = x_pos_val,
                  y_pos = rep(2, length(x_pos_val)))
## Example 2 - `s_cri1` AND an exact match on attribute 2
s_cri2 <- sub_criteria(
  s_cri1,
  sub_criteria(attr_2, match_funcs = exact_match),
  operator = "and")
s_cri2
## Example 3 - `s_cri1` OR an exact match on attribute 2
s_cri3 <- sub_criteria(
  s_cri1,
  sub_criteria(attr_2, match_funcs = exact_match),
  operator = "or")
s_cri3
# Evaluate the match criteria
eval_sub_criteria(s_cri2)
eval_sub_criteria(s_cri3)
# Alternatively, using `attrs()`
AND_func <- function(x, y) range_match(x$a1, y$a1) & x$a2 == y$a2
OR_func <- function(x, y) range_match(x$a1, y$a1) | x$a2 == y$a2
## Create a match criteria
s_cri2b <- sub_criteria(attrs(.obj = list(a1 = attr_1, a2 = attr_2)),
                        match_funcs = AND_func)
s_cri3b <- sub_criteria(attrs(.obj = list(a1 = attr_1, a2 = attr_2)),
                        match_funcs = OR_func)
# Evaluate the match criteria
eval_sub_criteria(s_cri2b)
eval_sub_criteria(s_cri3b)