gammaCKpar | R Documentation |
Field comparisons for string variables. Three possible agreement patterns are considered: 0 (total disagreement), 1 (partial agreement), and 2 (agreement). By default, the distance between strings is calculated using the Jaro-Winkler distance; other string distance measures can be selected with the method argument.
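The mapping from a similarity score to the three agreement patterns can be sketched in base R as follows. This is an illustration only, not the package's internal code: `agreement_pattern` is a hypothetical helper, the similarity scores are made up, and treating each cutoff as an inclusive lower bound is an assumption based on the descriptions of cut.a and cut.p.

```r
# Hypothetical helper (not part of the package): classify similarity
# scores in [0, 1] into the three agreement patterns.
# Assumption: a score equal to a cutoff counts as meeting it.
agreement_pattern <- function(sim, cut.a = 0.92, cut.p = 0.88) {
  ifelse(sim >= cut.a, 2L,          # 2 = agreement
         ifelse(sim >= cut.p, 1L,   # 1 = partial agreement
                0L))                # 0 = total disagreement
}

# Made-up similarity scores for four record pairs
sim <- c(1.00, 0.95, 0.90, 0.50)
pattern <- agreement_pattern(sim)
print(pattern)
```

Raising cut.a or cut.p makes the corresponding pattern stricter, so fewer pairs are classified as full or partial agreements.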
gammaCKpar(matAp, matBp, n.cores = NULL, cut.a = 0.92, cut.p = 0.88, method = "jw", w = 0.10)
matAp |
vector storing the comparison field in data set 1 |
matBp |
vector storing the comparison field in data set 2 |
n.cores |
Number of cores to parallelize over. Default is NULL. |
cut.a |
Lower bound for full match, ranging between 0 and 1. Default is 0.92 |
cut.p |
Lower bound for partial match, ranging between 0 and 1. Default is 0.88 |
method |
String distance method. Options are: "jw" Jaro-Winkler (default), "dl" Damerau-Levenshtein, "jaro" Jaro, and "lv" Levenshtein (edit distance). |
w |
Parameter that describes the importance of the first characters of a string (only needed if method = "jw"). Default is 0.10. |
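To make the role of w concrete, the sketch below applies the standard Winkler prefix adjustment to a given Jaro similarity: the score is raised in proportion to the length of the common prefix (capped at four characters) times w. The helper names `common_prefix_len` and `jw_from_jaro` are hypothetical, and the Jaro similarity is supplied by hand rather than computed; this is the textbook formula, not necessarily how the package evaluates it internally.

```r
# Hypothetical helper: length of the common prefix of two strings,
# capped at max_len characters (4 in the standard Winkler adjustment).
common_prefix_len <- function(a, b, max_len = 4L) {
  ca <- strsplit(a, "")[[1]]
  cb <- strsplit(b, "")[[1]]
  n <- min(length(ca), length(cb), max_len)
  if (n == 0L) return(0L)
  same <- ca[seq_len(n)] == cb[seq_len(n)]
  if (all(same)) n else which(!same)[1] - 1L
}

# Hypothetical helper: Jaro-Winkler similarity from a Jaro similarity.
# sim_jw = sim_jaro + l * w * (1 - sim_jaro), with l the prefix length.
jw_from_jaro <- function(sim_jaro, a, b, w = 0.10) {
  l <- common_prefix_len(a, b)
  sim_jaro + l * w * (1 - sim_jaro)
}

# Classic example: the Jaro similarity of "MARTHA" and "MARHTA" is 17/18,
# and the two strings share the three-character prefix "MAR".
jw <- jw_from_jaro(17 / 18, "MARTHA", "MARHTA", w = 0.10)
print(round(jw, 4))
```

With w = 0 the adjustment vanishes and the score reduces to the plain Jaro similarity; larger values of w reward agreement in the leading characters more strongly.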
gammaCKpar returns a list with the indices corresponding to each matching pattern, which can be fed directly into tableCounts and matchesLink.
Ted Enamorado <ted.enamorado@gmail.com>, Ben Fifield <benfifield@gmail.com>, and Kosuke Imai
## Not run:
g1 <- gammaCKpar(dfA$firstname, dfB$firstname)
## End(Not run)