library(dplyr) knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Mapping values in one column to specific values in another (new) column
of a data frame is a common task in data science. Doing so safely is
often a struggle. There are some existing methods in the tidyverse
that are useful, but in my opinion come with some drawbacks:
dplyr::recode()
dplyr::case_when()
Below is what I see is a safe way to map (re-code) values in an existing column to a new column.
# wish to map values of 'x' df <- withr::with_seed(101, { data.frame(id = 1:10L, value = rnorm(10), x = sample(letters[1:3L], 10, replace = TRUE) ) }) df # create a [n x 2] lookup-table (aka hash map) # n = no. values to map # x = existing values to map # new_x = new mapped values for each `x` map <- data.frame(x = letters[1:4L], new_x = c("cat", "dog", "bird", "turtle")) map # use `dplyr::left_join()` # note: 'turtle' is absent because `d` is not in `df$x` (thus ignored) dplyr::left_join(df, map)
NAs
Notice that b
maps to NA
. This is because the mapping object now
lacks a mapping for b
(compare to row 2 above).
Using a slightly different syntax via tibble::enframe()
.
# note: `b` is missing in the map map_vec <- c(a = "cat", c = "bird", d = "turtle") map2 <- tibble::enframe(map_vec, name = "x", value = "new_x") map2 # note: un-mapped values generate NAs: `b -> NA` dplyr::left_join(df, map2, by = "x")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.