knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(joyn) library(data.table) x <- data.table(id = c(1, 4, 2, 3, NA), t = c(1L, 2L, 1L, 2L, NA), country = c(16, 12, 3, NA, 15)) y <- data.table(id = c(1, 2, 5, 6, 3), gdp = c(11L, 15L, 20L, 13L, 10L), country = 16:20)
This vignette will let you explore some additional features available in joyn
, through an example use case.
Suppose you want to join tables x
and y
, where the variable country is available in both. You could do one of five things:
If you don't use the argument by
, joyn
will consider country and id as key variables by default given that they are common between x
and y
.
# The variables with the same name, `id` and `country`, are used as key # variables. joyn(x = x, y = y)
Alternatively, you can specify to join by country
# Joining by country joyn(x = x, y = y, by = "country")
y
and don't bring it into the resulting tableThis the default if you did not include country as part of the key variables in argument by
.
joyn(x = x, y = y, by = "id")
Another possibility is to make use of the update_NAs
argument of joyn()
. This allows you to update the NAs values in variable country in table x
with the actual values of the matching observations in country from table y. In this case, actual values in country from table x will remain unchanged.
joyn(x = x, y = y, by = "id", update_NAs = TRUE)
You can also update all the values - both NAs and actual - in variable country of table x
with the actual values of the matching observations in country from y
. This is done by setting update_values = TRUE
.
Notice that the reportvar
allows you keep track of how the update worked. In this case, value update means that only the values that are different between country from x
and country from y
are updated.
However, let's consider other possible cases:
If, for the same matching observations, the values between the two country variables were the same, the reporting variable would report x & y instead (so you know that there is no update to make).
if there are NAs in country from y
, the actual values in x
will be unchanged, and you would see a not updated status in the reporting variable. Nevertheless, notice there is another way for you to bring country from y
to x
. This is done through the argument keep_y_in_x
(see 5. below ⬇️)
# Notice that only the value that are joyn(x = x, y = y, by = "id", update_values = TRUE)
Another available option is that of bringing the original variable country from y
into the resulting table, without using it to update the values in x
. In order to distinguish country from x
and country from y
, joyn
will assign a suffix to the variable's name: so that you will get country.y and country.x. All of this can be done specifying keep_common_vars = TRUE.
joyn(x = x, y = y, by = "id", keep_common_vars = TRUE)
In joyn
, you can also bring non common variables from y
into the resulting table. In fact you can specify them in y_vars_to_keep
, as shown in the example below:
# Keeping variable gdp joyn(x = x, y = y, by = "id", y_vars_to_keep = "gdp")
Notice that if you set y_vars_to_keep = FALSE
or y_vars_to_keep = NULL
, then joyn
won't bring any variable into the returning table.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.