View source: R/match.data.frame.R
For each row of x[, by.x], find the best matching row of
y[, by.y], with the best match defined by grep. and
split.
grep. and split must either be missing or have the same length as
by.x and by.y.  If grep.[i] and split[i] are both NA, do a complete
match of x[, by.x[i]] and y[, by.y[i]].  Otherwise, for each row j,
look for a match for strsplit(x[j, by.x[i]], split[i])[[1]][1] among
strsplit(y[, by.y[i]], split[i]).
See details.
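As a rough illustration of the split-based comparison described above, the following base R sketch takes the first piece of a split x value and reports which split y values contain it. The helper name firstPieceMatches and the sample strings are hypothetical, and this is only a sketch of the idea, not the package's implementation.

```r
# Hypothetical helper (not part of the package): which elements of yCol
# contain, after splitting, the first piece of xVal?
firstPieceMatches <- function(xVal, yCol, split) {
  # First piece of the x value, e.g. "Rogers" from "Rogers Jr."
  key <- strsplit(xVal, split)[[1]][1]
  # Split every y value into its pieces
  yPieces <- strsplit(yCol, split)
  # Indices of the y values that have a piece equal to the key
  which(vapply(yPieces, function(p) key %in% p, logical(1)))
}

# Hypothetical sample data
ySurname <- c("Smith", "Rogers", "Rogers (MI)", "Rogers (AL)")
hits <- firstPieceMatches("Rogers Jr.", ySurname, " ")
print(hits)
```

Here several rows of y share the same first piece, so a comparison on this column alone would not give a unique match.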
match.data.frame(x, y, by, by.x=by, by.y=by, grep., split, sep=':')
x, y | data.frames
by, by.x, by.y | names of columns of
grep. | a character vector of the type of match for each element of
 Alternatives are
 NOTE:  These alternatives are not examined if a unique match is found between x[, by.x[is.na(grep.) & is.na(split)]] and the corresponding columns of
split | A character vector of
sep | a
1.  Check by.x, by.y, grep. and split.  If ((missing(by.x) |
missing(by.y)) && missing(by)), set by <- names(x).
2.  fullMatch <- (is.na(grep.) & is.na(split)).  Create keyfx and
keyfy by pasting columns of x[, by.x[fullMatch]] and y[,
by.y[fullMatch]].  Also create x. and y. as the strsplit of
x[, by.x[!fullMatch]] and y[, by.y[!fullMatch]], respectively.
3.  Iterate over rows of x looking for the best match.  This
includes an inner loop over columns of x[, by.x[!fullMatch]], stopping
on the first unique match.  Return (-1) if no unique match is found.
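Steps 2 and 3 above can be sketched in base R roughly as follows. This is a simplified illustration under stated assumptions, not the package's code: the function name sketchMatch and the sample data are hypothetical, x and y are assumed to share column names, at least one column is assumed to use an exact match, every non-exact column is compared with agrep, and rows without a unique match are reported as NA.

```r
# Hypothetical sketch of steps 2 and 3 (not the package's implementation).
# Assumes at least one element of grep. is NA (an exact-match column).
sketchMatch <- function(x, y, by, grep., sep = ":") {
  fullMatch <- is.na(grep.)
  # Step 2: one pasted key per row, built from the exact-match columns
  keyfx <- do.call(paste, c(x[by[fullMatch]], sep = sep))
  keyfy <- do.call(paste, c(y[by[fullMatch]], sep = sep))
  out <- rep(NA_integer_, nrow(x))
  # Step 3: for each row of x, narrow down the candidate rows of y
  for (j in seq_len(nrow(x))) {
    cand <- which(keyfy == keyfx[j])
    # Inner loop over the approximate-match columns,
    # stopping as soon as at most one candidate remains
    for (col in by[!fullMatch]) {
      if (length(cand) <= 1) break
      hit <- agrep(x[j, col], y[cand, col])
      # Keep only candidates that also match on this column
      if (length(hit) > 0) cand <- cand[hit]
    }
    # Record the row of y only when the match is unique
    if (length(cand) == 1) out[j] <- cand
  }
  out
}

# Hypothetical sample data
x <- data.frame(state = c("AL", "NY"), surname = c("Rogers", "Smith"),
                stringsAsFactors = FALSE)
y <- data.frame(state = c("NY", "AL", "NY"),
                surname = c("Smith (NY)", "Rogers (AL)", "Jones"),
                stringsAsFactors = FALSE)
idx <- sketchMatch(x, y, by = c("state", "surname"), grep. = c(NA, "agrep"))
print(idx)
```

The first row of x is resolved by the pasted key alone, while the second needs the approximate comparison on surname to choose between two candidate rows of y.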
an integer vector of length nrow(x) containing the index of the best
matching row of y or NA if no adequate match was found.
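A result of this shape can be used directly as a row index into y. The short sketch below uses hypothetical data and a hypothetical, hand-written index vector to show how NA entries carry through when columns of y are attached to x.

```r
# Hypothetical data and a hand-written index vector of the kind described above
x <- data.frame(id = c("a", "b", "c"), stringsAsFactors = FALSE)
y <- data.frame(score = c(10, 20))
idx <- c(2L, NA, 1L)  # row of y matched to each row of x; NA = no match

# Indexing with NA yields NA, so unmatched rows of x are kept and marked missing
x$score <- y$score[idx]
# Rows of x for which no adequate match was found
unmatched <- x$id[is.na(idx)]
print(x)
```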
Spencer Graves
strsplit, is.na
grep, agrep
match, row.match,
join, match_df
classify
newdata <- data.frame(state=c("AL", "MI","NY"),
                      surname=c("Rogers", "Rogers", "Smith"),
                      givenName=c("Mike R.", "Mike K.", "Al"),
                      stringsAsFactors=FALSE)
reference <- data.frame(state=c("NY", "NY", "MI", "AL", "NY", "MI"),
                      surname=c("Smith", "Rogers", "Rogers (MI)",
                                "Rogers (AL)", "Smith", 'Jones'),
                      givenName=c("John", "Mike", "Mike", "Mike",
                                "T. Albert", 'Al Thomas'),
                      stringsAsFactors=FALSE)
newInRef <- match.data.frame(newdata, reference,
       grep.=c(NA, 'agrep', 'agrep'))
all.equal(newInRef, c(4, 3, 5))