Combining datasets {.build}

There are many times when you have two or more overlapping datasets that you would like to combine

The dplyr package has a number of *_join functions for this purpose

url <- "https://github.com/matthewhirschey/tidybiology-plusds/raw/master/media/dplyr-joins.png"
knitr::include_graphics(url)

left_join {.build}

Returns all rows from a, and all columns from a and b

Rows in a with no match in b will have NA values in the new columns

If there are multiple matches between a and b, all combinations of the matches are returned

left_join example {.build}

Take a look at the variables in each dataset - r dataframe_name and r dataframe_join_name

You will notice that both datasets contain common variable - r df_id_name. This can therefore serve as a common variable to join on. Let's join on this:

left_join r dataframe_name with r dataframe_join_name and assign the output to a new object called `r dataframe_name_join_left`

Go to code/
Open 08_import_and_join.Rmd
Complete the exercise to join the two datasets.

Now you have one dataset with additional useful information

right_join

url <- "https://github.com/matthewhirschey/tidybiology-plusds/raw/master/media/dplyr-joins.png"
knitr::include_graphics(url)

right_join {.build}

Returns all rows from b, and all columns from a and b

Rows in b with no match in a will have NA values in the new columns

If there are multiple matches between a and b, all combinations of the matches are returned

This is conceptually equivalent to a left_join, but can be useful when stringing together multiple steps using %>%

inner_join

url <- "https://github.com/matthewhirschey/tidybiology-plusds/raw/master/media/dplyr-joins.png"
knitr::include_graphics(url)

inner_join {.build}

Returns all rows from a where there are matching values in b, and all columns from a and b

If there are multiple matches between a and b, all combination of the matches are returned

full_join

url <- "https://github.com/matthewhirschey/tidybiology-plusds/raw/master/media/dplyr-joins.png"
knitr::include_graphics(url)

full_join {.build}

Returns all rows and all columns from both a and b

Where there are no matching values, returns NA for the one missing



matthewhirschey/bespokelearnr documentation built on Oct. 11, 2020, 12:57 a.m.