matchesLink | R Documentation |
matchesLink produces two dataframes that store all the pairs whose agreement pattern falls within a given interval of the Fellegi-Sunter weights.
matchesLink(gammalist, nobs.a, nobs.b, em, thresh, n.cores = NULL)
gammalist |
A list of objects produced by either gammaKpar or gammaCKpar. |
nobs.a |
Number of observations in dataset 1 (dataset A). |
nobs.b |
Number of observations in dataset 2 (dataset B). |
em |
Parameters obtained from the Expectation-Maximization algorithm under the missing-at-random (MAR) assumption. These estimates are produced by emlinkMARmov. |
thresh |
The interval of posterior zeta values for the agreement patterns to examine more closely. Values range between 0 and 1. Can be a vector of length 1 (from the specified value to 1) or of length 2 (from the first specified value to the second). |
n.cores |
Number of cores to parallelize over. Default is NULL. |
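The two accepted shapes of thresh can be illustrated with a minimal sketch. The helper thresh_bounds below is hypothetical and not part of the package, the zeta values are made up, and treating both interval endpoints as inclusive is an assumption rather than documented behavior.

```r
# Hypothetical helper, not part of the package: expands `thresh` into
# explicit lower and upper bounds on the posterior zeta values.
thresh_bounds <- function(thresh) {
  stopifnot(is.numeric(thresh), length(thresh) %in% c(1, 2),
            all(thresh >= 0 & thresh <= 1))
  if (length(thresh) == 1) {
    c(lower = thresh, upper = 1)          # single value: from thresh up to 1
  } else {
    c(lower = thresh[1], upper = thresh[2])  # two values: explicit interval
  }
}

# Made-up posterior zeta values for five agreement patterns.
zeta <- c(0.10, 0.72, 0.88, 0.96, 0.99)

# Length-1 threshold: keep patterns with zeta in [0.95, 1].
b1 <- thresh_bounds(0.95)
keep1 <- zeta >= b1[["lower"]] & zeta <= b1[["upper"]]

# Length-2 threshold: keep patterns with zeta in [0.70, 0.90]
# (inclusive endpoints are an assumption of this sketch).
b2 <- thresh_bounds(c(0.70, 0.90))
keep2 <- zeta >= b2[["lower"]] & zeta <= b2[["upper"]]
```

With thresh = .95 only the high-confidence patterns are kept, while a two-value threshold such as c(.70, .90) isolates a middle band that may deserve manual review.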
matchesLink returns an nmatches X 2 matrix with the indices of the matched rows in dataset A and dataset B.
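A rough sketch of how such an index matrix might be used to pull the matched records side by side. The object ml here is a hand-built stand-in, not real matchesLink output; the column order (dataset A first, dataset B second), the column names inds.a and inds.b, and the toy data frames are all assumptions for illustration.

```r
# Toy datasets (made-up records for illustration only).
dfA <- data.frame(firstname = c("ann", "bob", "cara"), stringsAsFactors = FALSE)
dfB <- data.frame(firstname = c("bob", "anne", "dave"), stringsAsFactors = FALSE)

# Hypothetical stand-in for the returned index matrix:
# column 1 indexes rows of dataset A, column 2 rows of dataset B (assumed order).
ml <- cbind(inds.a = c(1L, 2L), inds.b = c(2L, 1L))

# Align the matched rows from both datasets using the two index columns.
matched <- data.frame(
  name.a = dfA$firstname[ml[, 1]],
  name.b = dfB$firstname[ml[, 2]],
  stringsAsFactors = FALSE
)
```

Each row of matched then pairs one record from dataset A with its counterpart in dataset B.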
Ted Enamorado <ted.enamorado@gmail.com>, Ben Fifield <benfifield@gmail.com>, and Kosuke Imai
## Not run:
## Calculate gammas
g1 <- gammaCKpar(dfA$firstname, dfB$firstname)
g2 <- gammaCKpar(dfA$middlename, dfB$middlename)
g3 <- gammaCKpar(dfA$lastname, dfB$lastname)
g4 <- gammaKpar(dfA$birthyear, dfB$birthyear)
## Run tableCounts
tc <- tableCounts(list(g1, g2, g3, g4), nobs.a = nrow(dfA), nobs.b = nrow(dfB))
## Run EM
em <- emlinkMAR(tc)
## Get matches
ml <- matchesLink(list(g1, g2, g3, g4), nobs.a = nrow(dfA), nobs.b = nrow(dfB),
                  em = em, thresh = .95)
## End(Not run)