fuzzyMerge: Fuzzy Matching for Merging Data Frames
In mnblonsky/REMI: REMI: EMI Consulting's Internal R Package


Merges two data frames on a single shared column. Only left merges are supported. Exact matches are checked first, followed by successive rounds of fuzzy matching. If multiple values match, one is chosen at random.
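The matching order described above can be sketched in base R. This is only an illustration under stated assumptions, not the REMI implementation: `sketchFuzzyMatch` is a hypothetical helper, and the way it loops over `distance` and breaks ties is a guess at the general idea rather than the package's actual code.

```r
# Hypothetical helper (not part of REMI): returns, for each element of x,
# the index of the matching element in y, or NA if nothing matches.
sketchFuzzyMatch <- function(x, y, distance = c(1, 2, 3),
                             costs = list(ins = 2, del = 1, sub = 3)) {
  # Pass 0: exact matches; match() gives NA where there is none
  idx <- match(x, y)
  # Later passes: retry only the unmatched values, loosening the limit
  for (d in distance) {
    for (i in which(is.na(idx))) {
      if (is.na(x[i])) next
      hits <- agrep(x[i], y, max.distance = d, costs = costs, fixed = TRUE)
      if (length(hits) > 0) {
        # Assumed tie-break: pick one candidate at random
        idx[i] <- hits[sample.int(length(hits), 1)]
      }
    }
  }
  idx
}

dfX <- data.frame(name = c("Acme Corp", "Globex", "Initech LLC"),
                  sales = c(10, 20, 30), stringsAsFactors = FALSE)
dfY <- data.frame(name = c("Globex", "Acme Corp.", "Umbrella"),
                  region = c("West", "East", "North"),
                  stringsAsFactors = FALSE)

idx <- sketchFuzzyMatch(dfX$name, dfY$name)
# Left merge: every row of dfX is kept; unmatched rows get NA
merged <- cbind(dfX, region = dfY$region[idx])
```

Here "Globex" matches exactly, "Acme Corp" is picked up in the first fuzzy pass, and "Initech LLC" stays unmatched, so `merged` keeps all three rows of `dfX`.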

fuzzyMerge(dfX, dfY, by = intersect(names(dfX), names(dfY))[1], byX = by,
  byY = by, costs = list(ins = 2, del = 1, sub = 3), distance = c(0, 1, 2,
  3, 5, 7, 10, 15, 20), keepOriginal = FALSE, ...)
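A call might look like the following sketch. The data frames are made up for illustration, and the call is guarded because it assumes the REMI package is installed and that `fuzzyMerge` is exported from it; neither is confirmed here.

```r
# Illustrative data (hypothetical): one shared column, "name"
dfX <- data.frame(name = c("Acme Corp", "Globex"),
                  sales = c(10, 20), stringsAsFactors = FALSE)
dfY <- data.frame(name = c("Globex", "Acme Corp."),
                  region = c("West", "East"), stringsAsFactors = FALSE)

# The matching column of dfY should contain no duplicates
stopifnot(!anyDuplicated(dfY$name))

# Assumption: REMI is installed and exports fuzzyMerge()
if (requireNamespace("REMI", quietly = TRUE)) {
  merged <- REMI::fuzzyMerge(dfX, dfY, by = "name", keepOriginal = TRUE)
}
```

Per the argument descriptions, the result should have the same number of rows as `dfX`.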

`dfX`	first data frame to match. The returned data frame will have the same number of rows as this data frame.
`dfY`	second data frame to match. Note: there should be no duplicates in the matching column in this data frame!
`by`	column name (or number) in the data frames to use for matching. Can only be one column! By default, it is the first column name shared by dfX and dfY.
`byX`	column name in dfX if column names are different
`byY`	column name in dfY if column names are different
`costs`	The costs associated with string changes (insertions, deletions and substitutions). See agrep for details.
`distance`	vector of maximum distances for fuzzy matching. See agrep for details. Length corresponds to the number of matching iterations.
`keepOriginal`	if TRUE, adds the column "Original" to the final data frame which contains vector.
`...`	parameters sent to agrep for fuzzy matching
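To see how `costs` and a single `distance` value behave, it can help to call `agrep` directly. The sketch below uses base R only; the candidate strings are made up, and it assumes each `distance` value is handed to `agrep` as `max.distance`, which the argument descriptions suggest but do not state outright.

```r
# Default costs from the usage above: a substitution is the most expensive edit
costs <- list(ins = 2, del = 1, sub = 3)
candidates <- c("Acme Corp.", "Globex", "Umbrella")

# A limit of 0 allows no edits, so the misspelled pattern finds nothing
strict <- agrep("Globox", candidates, max.distance = 0,
                costs = costs, value = TRUE)

# A limit of 3 is enough to pay for one substitution (cost 3)
loose <- agrep("Globox", candidates, max.distance = 3,
               costs = costs, value = TRUE)

# agrep() matches substrings, so a pattern contained in a longer
# candidate matches even when no edits are allowed
partial <- agrep("Acme Corp", candidates, max.distance = 0,
                 costs = costs, value = TRUE)
```

Because `agrep` matches approximately within each string rather than against the whole string, short values in the matching column can match more candidates than expected.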