em_winkler_big implements the same method as em_winkler for data that are too big for the agreement matrix to be computed and held in memory. Agreement is instead recomputed on the fly each time it is needed, and the EM steps are done entirely in C++. This decreases RAM usage (which remains substantial), at the cost of increased computation time.
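To make the memory trade-off concrete, here is a minimal base R sketch contrasting a fully materialized agreement matrix with agreement computed per pair on demand. The helpers `full_agreement` and `pair_agreement` are hypothetical names used for illustration only; they are not part of the package, whose on-the-fly computation is done in C++.

```r
# Illustrative sketch only: these helpers are hypothetical, not package functions.
set.seed(1)
K <- 4
data1 <- matrix(rbinom(3 * K, 1, 0.5), nrow = 3)  # n1 = 3 records
data2 <- matrix(rbinom(5 * K, 1, 0.5), nrow = 5)  # n2 = 5 records

# Materialized: one row per (i, j) pair, so n1 * n2 rows are held in memory at once.
full_agreement <- function(d1, d2) {
  idx <- expand.grid(i = seq_len(nrow(d1)), j = seq_len(nrow(d2)))
  agree <- (d1[idx$i, , drop = FALSE] == d2[idx$j, , drop = FALSE]) * 1L
  list(idx = idx, agree = agree)
}

# On the fly: only the length-K agreement vector of the requested pair exists.
pair_agreement <- function(d1, d2, i, j) {
  as.integer(d1[i, ] == d2[j, ])
}

full <- full_agreement(data1, data2)
dim(full$agree)                       # n1 * n2 rows by K columns
pair_agreement(data1, data2, 2, 4)    # agreement vector for a single pair
```

The materialized version grows with n1 * n2, while the per-pair version trades that memory for repeated computation at every EM iteration.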
em_winkler(
  data1,
  data2,
  tol = 0.001,
  maxit = 500,
  do_plot = TRUE,
  oneone = FALSE,
  verbose = FALSE
)

em_winkler_big(
  data1,
  data2,
  tol = 0.001,
  maxit = 500,
  do_plot = TRUE,
  oneone = FALSE,
  verbose = FALSE
)
data1
either a binary (…
data2
either a binary (…
tol
tolerance for the EM algorithm convergence. Default is 0.001.
maxit
maximum number of iterations for the EM algorithm. Default is 500.
do_plot
a logical flag indicating whether a plot should be drawn for the EM convergence. Default is TRUE.
oneone
a logical flag indicating whether 1-1 matching should be enforced. If …
verbose
a logical flag indicating whether intermediate values from the EM algorithm should be printed. Useful for debugging. Default is FALSE.
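As a rough illustration of how tol and maxit govern an EM loop of this kind, the following base R sketch fits a two-class Fellegi-Sunter mixture to simulated agreement vectors. It is an assumption-laden sketch, not the package's implementation: the function name `em_sketch`, the starting values, the conditional-independence model, and the stopping rule (largest absolute parameter change below tol) are all chosen here for illustration.

```r
# Hypothetical sketch of an EM loop; not the package's C++ implementation.
em_sketch <- function(agree, tol = 0.001, maxit = 500) {
  K <- ncol(agree)
  p <- 0.1            # assumed starting proportion of matches
  m <- rep(0.9, K)    # assumed start: P(agree on field k | match)
  u <- rep(0.1, K)    # assumed start: P(agree on field k | non-match)
  eps <- 1e-6         # keeps probabilities away from 0 and 1
  converged <- FALSE
  it <- 0
  for (it in seq_len(maxit)) {
    # E-step: posterior probability that each pair is a match
    log_m <- agree %*% log(m) + (1 - agree) %*% log(1 - m)
    log_u <- agree %*% log(u) + (1 - agree) %*% log(1 - u)
    g <- as.vector(1 / (1 + (1 - p) / p * exp(log_u - log_m)))
    # M-step: weighted updates of the parameters
    p_new <- min(max(mean(g), eps), 1 - eps)
    m_new <- pmin(pmax(colSums(g * agree) / sum(g), eps), 1 - eps)
    u_new <- pmin(pmax(colSums((1 - g) * agree) / sum(1 - g), eps), 1 - eps)
    # Stopping rule (assumed): largest parameter change below tol
    delta <- max(abs(c(p_new - p, m_new - m, u_new - u)))
    p <- p_new; m <- m_new; u <- u_new
    if (delta < tol) {
      converged <- TRUE
      break
    }
  }
  list(p = p, m = m, u = u, iterations = it, converged = converged)
}

# Simulated agreement vectors: about 20% true matches, 5 fields
set.seed(42)
N <- 2000
is_match <- rbinom(N, 1, 0.2)
agree <- sapply(seq_len(5), function(k) {
  rbinom(N, 1, ifelse(is_match == 1, 0.9, 0.1))
})
fit <- em_sketch(agree, tol = 0.001, maxit = 500)
fit$converged
```

A smaller tol demands more iterations before the loop stops; maxit caps the work when the parameters keep moving, in which case `converged` stays FALSE.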
a list containing:
matchingScore
a matrix of size n1 x n2 with the matching score for each of the n1*n2 pairs.
threshold_ms
threshold value for the matching scores above which pairs are considered true matches.
estim_nbmatch
an estimate of the number of true matches (N pairs considered multiplied by p, the estimated proportion of true matches from the EM algorithm).
convergence_status
a logical flag indicating whether the EM algorithm converged
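The returned list can be post-processed with base R. The sketch below uses a mock object carrying the four documented fields, with made-up values, to show one way of extracting the pairs whose matching score lies above threshold_ms. The column names `row_data1` and `row_data2` are introduced here for readability and are not produced by the package.

```r
# Mock result with the documented fields; all values are invented for illustration.
res <- list(
  matchingScore = matrix(c(0.95, 0.10, 0.20,
                           0.05, 0.80, 0.30), nrow = 2, byrow = TRUE),
  threshold_ms = 0.5,
  estim_nbmatch = 2,
  convergence_status = TRUE
)

# Only trust the scores if the EM algorithm reports convergence.
if (!res$convergence_status) {
  warning("EM did not converge; matching scores may be unreliable")
}

# Row i indexes data1 and column j indexes data2 (matchingScore is n1 x n2).
matches <- which(res$matchingScore > res$threshold_ms, arr.ind = TRUE)
colnames(matches) <- c("row_data1", "row_data2")
matches
```

Comparing `nrow(matches)` with `res$estim_nbmatch` gives a quick sanity check on the chosen threshold.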
Winkler WE. Using the EM Algorithm for Weight Computation in the Fellegi-Sunter Model of Record Linkage. Proc Sect Surv Res Methods, Am Stat Assoc 1988: 667-71.
Grannis SJ, Overhage JM, Hui S, et al. Analysis of a probabilistic record linkage technique without human review. AMIA 2003 Symp Proc 2003: 259-63.