knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(knitr) library(dplyr) library(tidyr) library(lychee) elec94 <- filter(greens3, year==1994) %>% select(-year,-city_clean) elec09 <- filter(greens3, year==2009) %>% select(-year,-city_clean)
The package lychee
helps to link and join data frames with key variables that are are similiar but not identical (e.g., a variable with geographic names spelled slightly different or nearby geographic coordinates). Different from the fuzzyjoin
package, the package lychee
does not output all matches given some definition of sufficient similarity, but constructs optimal one-to-one matches minimizing the total difference across all matches.
The function linkr()
stacks two data frames and finds an optimal one-to-one pairing of rows in one data frame with rows in the other data frame. The output is a data frame with as many rows as there are in the two datasets and a common identifier for the pairs. The complementary function is joinr()
which, instead of stacking and assigning a common identifier, joins two data frames similar to the base::merge()
or dplyr::full_join()
function.
# Install development version from GitHub remotes::install_github("sumtxt/lychee")
For more details and to learn how to use this package: Getting Started with lychee.
The example below shows how joinr
finds optimal matches in two data frames (elec94
and elec09
) within two groups. The first data frame (elec94
) lists the strongholds of Germany's green party in the 1994 Federal election (election=BTW
) and the 1994 European election (election=EP
). The second data frame lists such strongholds for the 2009 elections. joinr
merges the two data frames correctly even though the city names are spelled slightly different in the two data frames preventing to merge them via base::merge()
or dplyr::full_join()
.
elec94 elec09 joinr(elec94,elec09, strata="election", by="city", suffix=c("94","09"), add_distance=TRUE, caliper=12, method='lcs', full=TRUE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.