| multi_join | R Documentation |
Join two or more data frames together in one operation. multi_join() can handle
multiple different join methods and can join on differently named variables.
multi_join(
  data_frames,
  on,
  how = "left",
  keep_indicators = FALSE,
  monitor = FALSE
)
| data_frames | A list of data frames to join. The second and all following data frames are joined onto the first one. |
| on | The key variables on which the data frames are joined. If a character vector is provided, the function assumes that these variables exist in every data frame. To join on differently named variables, provide a list of character vectors instead. |
| how | A character vector containing the join method names. Available methods are: left, right, inner, full, outer, left_inner and right_inner. |
| keep_indicators | FALSE by default. If TRUE, one variable is created per data frame, indicating whether that data frame provides values. |
| monitor | FALSE by default. If TRUE, outputs two charts that visualize the function's time consumption. |
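To make the idea behind keep_indicators more concrete, here is a minimal base R sketch of what per-data-frame indicator variables can look like. It uses merge() rather than multi_join(), and the column names in_df1 and in_df2 are hypothetical; the names and value coding that multi_join() actually produces may differ.

```r
# Illustration only: base R merge(), not multi_join() itself.
df1 <- data.frame(key = c(1, 2), a = "a")
df2 <- data.frame(key = c(2, 3), b = "b")

# Hypothetical indicator columns, one per data frame
df1$in_df1 <- TRUE
df2$in_df2 <- TRUE

# Full join: keep every key from both data frames
full <- merge(df1, df2, by = "key", all = TRUE)

# Rows a data frame did not contribute to have NA in its indicator;
# turn that into a clean TRUE/FALSE flag
full$in_df1 <- !is.na(full$in_df1)
full$in_df2 <- !is.na(full$in_df2)

# The indicators allow narrowing the full join afterwards,
# for example to the keys present in both data frames
both <- full[full$in_df1 & full$in_df2, ]
```

Filtering on such flags after a full join is one way to pick which parts of the join remain in the final data set.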
multi_join() is based on the Merge statement of the 'SAS' Data Step. Merge can
join multiple data sets at once with a very basic syntax: the user provides the
data set names and the variables on which they should be joined. After the full
join is complete, the user can decide which parts of the join should remain in
the final data set.
multi_join() tries to keep this simplicity while letting the user perform
several joins at the same time. In addition to what Merge can do, this function
also adopts the Proc SQL ability to join data sets on differently named
variables.
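For comparison, the two ideas described above (joining several data frames onto the first one, and joining on differently named key variables) can be sketched in base R with merge(). This is only an illustration of the concepts under the assumption of plain left joins; it is not how multi_join() is implemented, and the object names left_all and left_diff are made up for this sketch.

```r
# Illustration only: base R merge(), not multi_join() itself.
df1 <- data.frame(key = c(1, 1, 1, 2, 2, 2), a = "a")
df2 <- data.frame(key = c(2, 3), b = "b")
df3 <- data.frame(key = c(1, 2), c = "c")

# Left-join each following data frame onto the running result,
# starting from the first data frame
left_all <- Reduce(
  function(x, y) merge(x, y, by = "key", all.x = TRUE),
  list(df1, df2, df3)
)

# Joining on differently named key variables via by.x / by.y
df1c <- data.frame(key1 = c(1, 2), key2 = c("a", "a"), a = "a")
df2c <- data.frame(var1 = c(2, 3), var2 = c("a", "a"), b = "b")

left_diff <- merge(df1c, df2c,
                   by.x = c("key1", "key2"),
                   by.y = c("var1", "var2"),
                   all.x = TRUE)
```

In the base R version every join needs its own merge() call; multi_join() takes the whole list and the key specification in a single call.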
Returns a single data frame with joined variables from all given data frames.
# Example data frames
df1 <- data.frame(key = c(1, 1, 1, 2, 2, 2),
                  a   = c("a", "a", "a", "a", "a", "a"))
df2 <- data.frame(key = c(2, 3),
                  b   = c("b", "b"))
# See all different joins in action
join_methods <- c("left", "right", "inner", "full", "outer", "left_inner", "right_inner")
joined_data <- list()
for (method in seq_along(join_methods)) {
    # 'method' is the position of the join method within join_methods
    joined_data[[method]] <- multi_join(list(df1, df2),
                                        on  = "key",
                                        how = join_methods[[method]])
}
# Left join on more than one key
df1b <- data.frame(key1 = c(1, 1, 1, 2, 2, 2),
                   key2 = c("a", "a", "a", "a", "a", "a"),
                   a    = c("a", "a", "a", "a", "a", "a"))
df2b <- data.frame(key1 = c(2, 3),
                   key2 = c("a", "a"),
                   b    = c("b", "b"))
left_joined <- multi_join(list(df1b, df2b), on = c("key1", "key2"))
# Join more than two data frames
df3 <- data.frame(key = c(1, 2),
                  c   = c("c", "c"))
multiple_joined <- multi_join(list(df1, df2, df3), on = "key")
# You can also use different methods for each join
multiple_joined2 <- multi_join(list(df1, df3, df2),
                               on  = "key",
                               how = c("left", "right"))
# Joining on different variable names
df1c <- data.frame(key1 = c(1, 1, 1, 2, 2, 2),
                   key2 = c("a", "a", "a", "a", "a", "a"),
                   a    = c("a", "a", "a", "a", "a", "a"))
df2c <- data.frame(var1 = c(2, 3),
                   var2 = c("a", "a"),
                   b    = c("b", "b"))
df3c <- data.frame(any  = c(1, 2),
                   name = c("a", "a"),
                   c    = c("c", "c"))
multiple_joined3 <- multi_join(list(df1c, df2c, df3c),
                               on = list(df1c = c("key1", "key2"),
                                         df2c = c("var1", "var2"),
                                         df3c = c("any", "name")))