knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>"
)
library(joyn)
library(data.table)

x <- data.table(id      = c(1, 4, 2, 3, NA),
                t       = c(1L, 2L, 1L, 2L, NA),
                country = c(16, 12, 3, NA, 15))

y <- data.table(id      = c(1, 2, 5, 6, 3),
                gdp     = c(11L, 15L, 20L, 13L, 10L),
                country = 16:20)
This vignette explores some additional features available in joyn through an example use case.
Suppose you want to join tables x and y, where the variable country is available in both. You could do one of five things:
If you don't use the argument by, joyn will consider country and id as key variables by default, given that they are common to x and y.
# The variables with the same name, `id` and `country`, are used as key
# variables.
joyn(x = x, y = y)
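To check which variables would be picked up when by is omitted, you can list the column names shared by the two tables. This is a minimal sketch that assumes the default key is simply the set of names common to x and y, as described above; the name common_vars is introduced here only for illustration.

```r
library(data.table)

x <- data.table(id      = c(1, 4, 2, 3, NA),
                t       = c(1L, 2L, 1L, 2L, NA),
                country = c(16, 12, 3, NA, 15))
y <- data.table(id      = c(1, 2, 5, 6, 3),
                gdp     = c(11L, 15L, 20L, 13L, 10L),
                country = 16:20)

# Column names present in both tables, in the order they appear in x
# (common_vars is a hypothetical helper name, not part of joyn)
common_vars <- intersect(names(x), names(y))
print(common_vars)
```

Running this check before a join is a cheap way to avoid joining on a variable you did not intend to use as a key.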
Alternatively, you can join by country only:
# Joining by country
joyn(x = x, y = y, by = "country")
You can also ignore country from y and not bring it into the resulting table. This is the default if you do not include country as part of the key variables in argument by.
joyn(x = x, y = y, by = "id")
Another possibility is to use the update_NAs argument of joyn(). This lets you update the NA values of country in table x with the actual values of the matching observations in country from table y. In this case, the actual (non-NA) values of country in table x remain unchanged.
joyn(x = x, y = y, by = "id", update_NAs = TRUE)
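The logic behind update_NAs can be reproduced by hand, which may help when checking a result. The sketch below is not joyn's implementation: it uses data.table's merge() and base ifelse(), restricts itself to a left join on id for brevity, and the column names country_y and country_filled are made up for illustration.

```r
library(data.table)

x <- data.table(id      = c(1, 4, 2, 3, NA),
                t       = c(1L, 2L, 1L, 2L, NA),
                country = c(16, 12, 3, NA, 15))
y <- data.table(id      = c(1, 2, 5, 6, 3),
                gdp     = c(11L, 15L, 20L, 13L, 10L),
                country = 16:20)

# Left join on id; country from y is renamed to avoid a name clash
# (country_y is a hypothetical name chosen for this sketch)
m <- merge(x, y[, .(id, country_y = country)], by = "id", all.x = TRUE)

# Fill only the missing values of country; existing values are kept as they are
m[, country_filled := ifelse(is.na(country), country_y, country)]

print(m[, .(id, country, country_y, country_filled)])
```

In this data only the observation with id equal to 3 has a missing country in x and a match in y, so it is the only one whose value changes.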
You can also update all the values, both NAs and actual values, of country in table x with the actual values of the matching observations in country from y. This is done by setting update_values = TRUE.
Notice that the reporting variable (argument reportvar) lets you keep track of how the update worked. In this case, value updated means that only the values that differ between country from x and country from y are updated.
However, consider other possible cases:
If, for the same matching observations, the values of the two country variables were the same, the reporting variable would show x & y instead, so you know that there is no update to make.
If there are NAs in country from y, the actual values in x will be unchanged, and you will see a not updated status in the reporting variable. Nevertheless, there is another way to bring country from y into x: the argument keep_common_vars (see the fifth option below).
# Notice that only the values that differ between x and y are updated
joyn(x = x, y = y, by = "id", update_values = TRUE)
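The cases described above can be summarized as a small decision rule. The function below is a hypothetical helper written in base R, not part of joyn; it only mimics how the status of a matched observation could be derived from the two country values. The label "NA updated" for a missing value in x replaced by a value from y is an assumption about how joyn names that case.

```r
# classify_update() is a hypothetical helper, not a joyn function.
# x_val and y_val are the values of the common variable for matched rows.
classify_update <- function(x_val, y_val) {
  ifelse(
    is.na(y_val), "not updated",          # nothing to update from in y
    ifelse(
      is.na(x_val), "NA updated",         # assumed label: NA in x is filled from y
      ifelse(x_val == y_val,
             "x & y",                     # same value in both tables
             "value updated")             # values differ, y replaces x
    )
  )
}

x_country <- c(16, 3, NA, 12)
y_country <- c(16, 17, 20, NA)

status <- classify_update(x_country, y_country)
print(data.frame(x_country, y_country, status))
```

Checking for a missing value in y first keeps the comparison x_val == y_val from ever deciding a row where either side is NA.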
Another option is to bring the original variable country from y into the resulting table without using it to update the values in x. To distinguish country from x and country from y, joyn assigns a suffix to the variable's name, so that you get country.x and country.y. All of this can be done by specifying keep_common_vars = TRUE.
joyn(x = x, y = y, by = "id", keep_common_vars = TRUE)
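The suffix convention is the same one used by data.table's merge(), so a plain merge can serve as a point of comparison. This is only an analogy under that assumption, not a description of how joyn works internally, and it produces no reporting variable.

```r
library(data.table)

x <- data.table(id      = c(1, 4, 2, 3, NA),
                t       = c(1L, 2L, 1L, 2L, NA),
                country = c(16, 12, 3, NA, 15))
y <- data.table(id      = c(1, 2, 5, 6, 3),
                gdp     = c(11L, 15L, 20L, 13L, 10L),
                country = 16:20)

# Full join on id; the non-key variable country exists in both tables,
# so merge() disambiguates it with its default suffixes .x and .y
m <- merge(x, y, by = "id", all = TRUE)

print(names(m))
```

Keeping both columns side by side makes it easy to inspect where the two sources disagree before deciding whether to update anything.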
In joyn, you can also bring non-common variables from y into the resulting table by specifying them in y_vars_to_keep, as shown in the example below:
# Keeping variable gdp
joyn(x = x, y = y, by = "id", y_vars_to_keep = "gdp")
Notice that if you set y_vars_to_keep = FALSE or y_vars_to_keep = NULL, joyn won't bring any variable from y into the resulting table.
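A quick way to confirm this is to inspect the column names of the result. The sketch below assumes the joyn package is installed and that verbose = FALSE suppresses the printed report; the object name res is arbitrary.

```r
library(joyn)
library(data.table)

x <- data.table(id      = c(1, 4, 2, 3, NA),
                t       = c(1L, 2L, 1L, 2L, NA),
                country = c(16, 12, 3, NA, 15))
y <- data.table(id      = c(1, 2, 5, 6, 3),
                gdp     = c(11L, 15L, 20L, 13L, 10L),
                country = 16:20)

# Join on id without keeping any variable from y
res <- joyn(x = x, y = y, by = "id",
            y_vars_to_keep = FALSE,
            verbose = FALSE)

# gdp should be absent; country is the one that comes from x
print(names(res))
```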