Joins and Relates Toolset {#join}

For this chapter, you will need the following R Packages:

library(dplyr)

In ArcGIS, joining two datasets on a common attribute is done with tools from the "Join" Toolset. Similar as described with the Selection tools in chapter \@ref(chapter-selection), a join has a temporary / floating nature and does not automatically produce an output. In R, joining two datasets is only persistent if the output is assigned to a new variable^[or piped into a new function].

In R we have two main functions for joining two datasets. On the one hand there is the base R function merge, on the other hand there is the *_join family that lie within the dplyr package. Since the latter family of functions are very close to how Joins are done in SQL, we will use the latter case for our examples below.

Before we begin with our examples, we have to make clear the differences among the various forms of join operations.

knitr::include_graphics("images/joins.png")

Inner join {#join-inner}

Inner Join in R is the most common type of join. It is an operation that returns the rows when the matching condition is fulfilled. Below we demonstrate it with an example.

df1 <- data.frame(TeamID = c(1,4,6,11),
                  TeamName = c("new york knicks","los angeles lakers","milwaukee bucks","boston celtics"),
                  Championships = c(2,17,1,17))

df2 <- data.frame(TeamID = c(1,2,11,8),
                  TeamName = c("new york knicks","philadelphia 76ers","boston celtics","los angeles clippers"),
                  Championships = c(2,3,17,0))

df1

df2
df1 %>% inner_join(df2)

Outer join {#join-outer}

Outer join in R using simply returns all rows from both data frames. This is very well depicted in figure \@ref(fig:joins).

full_join(df1,df2)

Left / Right join {#join-left}

The left join in R returns all records from the data frame on the left, as well as and the matched records from the one at the right.

left_join(df1,df2)

Similarly works also the right join.

right_join(df1,df2)


arc2r/book documentation built on March 5, 2021, 2:10 p.m.