jaroWinkler: Calculate Jaro-Winkler distance

View source: R/jaroWinkler.R

Description

Calculate Jaro-Winkler distance for two strings str_1 and str_2

Usage

jaroWinkler(str_1, str_2, weight_threshold = 0.7, num_chars = 4)

Arguments

str_1

first string for calculating the distance

str_2

second string for calculating the distance

weight_threshold

Jaro score threshold, on a 0 to 1 scale, above which the Winkler modification is applied

num_chars

size of the prefix to be considered by the Winkler modification

Details

The Jaro-Winkler distance is a string comparison metric designed for, and best suited to, short strings such as person names. The score is normalized so that 0 means no similarity and 1 means an exact match; despite the name "distance", higher values indicate more similar strings.
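
For orientation, the commonly cited textbook form of the metric is sketched below. It is not taken from this package's source, so the implementation may differ in details. Here m is the number of matching characters, t is half the number of matched characters that appear in a different order, and l is the length of the common prefix. Mapping the cap on l to num_chars, mapping the cutoff to weight_threshold, and using the scaling factor p = 0.1 are all assumptions.

```latex
d_j =
\begin{cases}
0 & \text{if } m = 0, \\[4pt]
\dfrac{1}{3}\left( \dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m} \right) & \text{otherwise,}
\end{cases}
\qquad
d_w =
\begin{cases}
d_j + \ell \, p \, (1 - d_j) & \text{if } d_j > \text{weight\_threshold}, \\[4pt]
d_j & \text{otherwise,}
\end{cases}
\qquad 0 \le \ell \le \text{num\_chars}.
```

Two characters count as matching only if they are equal and their positions differ by at most floor(max(|s_1|, |s_2|) / 2) - 1.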

Value

The Jaro-Winkler distance between the specified strings, a number between 0 and 1.

Author(s)

Daniel Rodriguez Perez

References

M. A. Jaro, "Advances in record linkage methodology as applied to the 1985 census of Tampa Florida." Journal of the American Statistical Association, vol. 84, no. 406, pp. 414-420, Jun. 1989.

W. E. Winkler, "String Comparator Metrics and Enhanced Decision Rules in the Fellegi-Sunter Model of Record Linkage." Proceedings of the Section on Survey Research Methods (American Statistical Association), pp. 354-359, 1990.

M. A. Jaro, "Probabilistic linkage of large public health data files." Statistics in Medicine, vol. 14, no. 5-7, pp. 491-498, March - April 1995.

Examples

jaroWinkler('saturday', 'sunday')
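
To make the roles of weight_threshold and num_chars more concrete, here is a minimal, self-contained sketch of the textbook Jaro-Winkler computation in base R. It is not the package's implementation: the function name jaro_winkler_sketch, the scaling argument (fixed at the conventional 0.1), and the handling of empty strings are all hypothetical choices made for illustration, and the package's jaroWinkler may return slightly different values.

```r
# Hypothetical reference sketch of textbook Jaro-Winkler; NOT the package source.
jaro_winkler_sketch <- function(str_1, str_2, weight_threshold = 0.7,
                                num_chars = 4, scaling = 0.1) {
  a <- strsplit(str_1, "")[[1]]
  b <- strsplit(str_2, "")[[1]]
  len_a <- length(a)
  len_b <- length(b)
  # Assumption: two empty strings are an exact match, one empty string is not
  if (len_a == 0 || len_b == 0) return(as.numeric(len_a == len_b))

  # Equal characters match only if their positions differ by at most `window`
  window <- max(floor(max(len_a, len_b) / 2) - 1, 0)
  matched_a <- logical(len_a)
  matched_b <- logical(len_b)
  for (i in seq_len(len_a)) {
    lo <- max(1, i - window)
    hi <- min(len_b, i + window)
    if (lo > hi) next
    for (j in lo:hi) {
      if (!matched_b[j] && a[i] == b[j]) {
        matched_a[i] <- TRUE
        matched_b[j] <- TRUE
        break
      }
    }
  }
  m <- sum(matched_a)
  if (m == 0) return(0)

  # Half the number of matched characters that appear in a different order
  half_transpositions <- sum(a[matched_a] != b[matched_b]) / 2
  jaro <- (m / len_a + m / len_b + (m - half_transpositions) / m) / 3

  # The Winkler prefix bonus is applied only above the threshold
  if (jaro <= weight_threshold) return(jaro)

  # Length of the common prefix, capped at num_chars
  prefix <- 0
  for (k in seq_len(min(num_chars, len_a, len_b))) {
    if (a[k] != b[k]) break
    prefix <- prefix + 1
  }
  jaro + prefix * scaling * (1 - jaro)
}

jaro_winkler_sketch("martha", "marhta")  # classic textbook pair, about 0.961
jaro_winkler_sketch("saturday", "sunday")
```

In this sketch, setting num_chars = 0 or raising weight_threshold to 1 disables the prefix bonus and leaves the plain Jaro score. Some implementations round the transposition count down to an integer, which changes the result for pairs such as 'saturday' and 'sunday'.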

drodriguezperez/financialsanctions documentation built on May 17, 2019, 2:42 p.m.