matchesLink: matchesLink

View source: R/matchesLink.R

matchesLink    R Documentation

matchesLink

Description

matchesLink produces two dataframes that store all record pairs whose agreement pattern falls within a given interval of the Fellegi-Sunter weights.

Usage

matchesLink(gammalist, nobs.a, nobs.b, em, thresh, n.cores = NULL)

Arguments

gammalist

A list of objects produced by either gammaKpar or gammaCKpar.

nobs.a

Number of observations in dataset A.

nobs.b

Number of observations in dataset B.

em

Parameters obtained from the Expectation-Maximization algorithm under the missing at random (MAR) assumption. These estimates are produced by emlinkMARmov.

thresh

Interval of posterior zeta values (posterior match probabilities) for the agreement patterns to be examined more closely. Values range between 0 and 1. Can be a vector of length 1 (from the specified value to 1) or of length 2 (from the first specified value to the second).

n.cores

Number of cores to parallelize over. Default is NULL.
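
The two accepted forms of thresh can be illustrated with a small base R sketch. It does not call fastLink; thresh_interval is a hypothetical helper, the zeta values are made up, and treating both endpoints as inclusive is an assumption rather than documented behavior.

```r
# Hypothetical helper (not part of fastLink): expand `thresh` into a
# [lower, upper] interval, following the length-1 / length-2 rule above.
thresh_interval <- function(thresh) {
  stopifnot(is.numeric(thresh), length(thresh) %in% c(1, 2),
            all(thresh >= 0 & thresh <= 1))
  if (length(thresh) == 1) c(thresh, 1) else thresh
}

# Made-up posterior match probabilities, one per agreement pattern.
zeta <- c(0.02, 0.40, 0.87, 0.96, 0.999)

# Length-1 threshold: patterns with zeta in [0.95, 1].
iv1 <- thresh_interval(0.95)
keep1 <- which(zeta >= iv1[1] & zeta <= iv1[2])

# Length-2 threshold: patterns with zeta in [0.85, 0.95].
# Assumption: endpoints are treated as inclusive in this sketch.
iv2 <- thresh_interval(c(0.85, 0.95))
keep2 <- which(zeta >= iv2[1] & zeta <= iv2[2])
```

A length-2 interval such as c(0.85, 0.95) is a natural choice for pulling out borderline pairs for manual review, while a single value such as .95 selects the high-confidence matches.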

Value

matchesLink returns an nmatches X 2 matrix with the indices of the matched rows in dataset A and dataset B.
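
As a rough sketch of how such an index structure might be used, the base R code below subsets two toy data frames with a hand-built two-column index matrix. The data, the object idx, and its column names (inds.a, inds.b) are assumptions for illustration only; inspect the actual return value with str() before relying on column positions or names.

```r
# Toy datasets standing in for dfA and dfB (made-up values).
dfA <- data.frame(firstname = c("ana", "bob", "cy"),
                  stringsAsFactors = FALSE)
dfB <- data.frame(firstname = c("bob", "dee", "ana", "cy"),
                  stringsAsFactors = FALSE)

# Hypothetical index matrix: row i pairs dfA[idx[i, 1], ] with dfB[idx[i, 2], ].
# The column names are an assumption for this sketch.
idx <- cbind(inds.a = c(1, 2, 3), inds.b = c(3, 1, 4))

# Pull the matched rows from each dataset, keeping data frame structure.
matchedA <- dfA[idx[, 1], , drop = FALSE]
matchedB <- dfB[idx[, 2], , drop = FALSE]

# Put the matched records side by side for inspection.
pairs <- data.frame(a = matchedA$firstname, b = matchedB$firstname,
                    stringsAsFactors = FALSE)
```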

Author(s)

Ted Enamorado <ted.enamorado@gmail.com>, Ben Fifield <benfifield@gmail.com>, and Kosuke Imai

Examples

## Not run: 
## Calculate gammas
g1 <- gammaCKpar(dfA$firstname, dfB$firstname)
g2 <- gammaCKpar(dfA$middlename, dfB$middlename)
g3 <- gammaCKpar(dfA$lastname, dfB$lastname)
g4 <- gammaKpar(dfA$birthyear, dfB$birthyear)

## Run tableCounts
tc <- tableCounts(list(g1, g2, g3, g4), nobs.a = nrow(dfA), nobs.b = nrow(dfB))

## Run EM
em <- emlinkMARmov(tc, nobs.a = nrow(dfA), nobs.b = nrow(dfB))

## Get matches
ml <- matchesLink(list(g1, g2, g3, g4), nobs.a = nrow(dfA), nobs.b = nrow(dfB),
                  em = em, thresh = .95)

## End(Not run)


kosukeimai/fastLink documentation built on Nov. 17, 2023, 8:11 p.m.