knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Using msafer to locate errors when applying map()
to a vector.
library(msafer)
For demonstration purposes, let's create a sample list of dataframes from the starwars
and mtcars
datasets.
sample_a <- dplyr::sample_n(dplyr::starwars, 34)
sample_a <- subset(sample_a, select = -c(height, hair_color))
sample_b <- dplyr::sample_n(dplyr::starwars, 35)
sample_c <- dplyr::sample_n(mtcars, 20)
sample_c <- subset(sample_c, select = -c(hp))
sample_list <- list(dplyr::starwars, sample_a, sample_b, sample_c, mtcars)
Let's attempt to use map()
to apply a function to all the dataframes in this list.
purrr::map(sample_list, dplyr::select, height)
Uh oh! Something didn't work, but what exactly? And where exactly?
map_safe_merge()
to the rescue! Pass map_safe_merge()
the same arguments as map()
: a vector, a function, and parameters that the function needs (if any). map_safe_merge()
will return a tibble with the file numbers and any errors that may have occurred while trying to apply map()
.
map_safe_merge(sample_list, dplyr::select, height)
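For intuition about what a per-element error report involves, here is a minimal base R sketch. It is not msafer's implementation: `try_each()` is a hypothetical helper, and the column names `id` and `error` are assumptions chosen for illustration. It calls the function on each element inside `tryCatch()` and records the error message, or `NA` when the call succeeds.

```r
# Hypothetical helper (not part of msafer): apply `f` to every element of `x`
# and record the error message for each element, or NA if the call succeeded.
try_each <- function(x, f, ...) {
  messages <- vapply(
    seq_along(x),
    function(i) {
      tryCatch(
        {
          f(x[[i]], ...)
          NA_character_
        },
        error = function(e) conditionMessage(e)
      )
    },
    character(1)
  )
  data.frame(id = seq_along(x), error = messages, stringsAsFactors = FALSE)
}

# Two data frames with an `mpg` column and one (iris) without it.
frames <- list(mtcars, iris, head(mtcars))
pick_mpg <- function(df) df[, "mpg", drop = FALSE]

report <- try_each(frames, pick_mpg)
print(report)
```

Unlike a plain `lapply()` or `map()` call, which stops at the first failure, this kind of wrapper keeps going and shows which positions failed.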
map_safe()
is an even better option. Since map_safe_merge()
only outputs a tibble in the order the results were generated, it is hard to spot errors quickly in a large vector. map_safe()
nests the results, returning a tibble that lists each unique error message once, along with the indices locating where it occurs within the vector.
df <- map_safe(sample_list, dplyr::select, height)
df
The column which_id
in the tibble generated by map_safe()
is a list of tibbles that contains the indices of the elements related to the result. To show it or compute with it, use the following method:
which_id is always the third column in the tibble generated by map_safe()
.
df[[3]][[1]]
The result shows that, in a list of 5 datasets, the first and third contain the column height
, whereas the second, fourth and fifth do not, which causes the error "Error in .f(.x[[i]], ...): object 'height' not found".
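The idea of grouping element indices by unique error message can be sketched in base R. The `errors` vector below is made-up input that mirrors the example above, and all names are assumptions for illustration, not msafer internals.

```r
# Assumed input: one error message per element, NA where the call succeeded.
errors <- c(
  NA,
  "object 'height' not found",
  NA,
  "object 'height' not found",
  "object 'height' not found"
)

# Label successes explicitly so they form their own group.
status <- ifelse(is.na(errors), "no error", errors)

# split() groups the element indices by their unique status.
indices_by_status <- split(seq_along(status), status)
print(indices_by_status)
```

Each entry of `indices_by_status` plays a role similar to one element of which_id: it lists the positions that share the same outcome.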
You can also pass in just one dataframe. map_safe_merge()
will return whether or not the specified function can be applied to each column in the dataframe.
map_safe_merge(iris, log)
And map_safe()
will combine the errors together.
map_safe(iris, log)
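A data frame is a list of columns, which is why mapping over one visits each column in turn. As a rough base R illustration (a sketch, not how msafer works internally), the snippet below tries `log()` on every column of `iris` and flags the columns where the call raises an error. The name `fails` is made up for this example.

```r
# Try log() on each column of iris; TRUE means the call raised an error.
fails <- vapply(
  iris,
  function(col) inherits(tryCatch(log(col), error = function(e) e), "error"),
  logical(1)
)
print(fails)
```

Only the factor column `Species` is flagged, because `log()` is not defined for factors, while the four numeric columns succeed.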
If you pass in one vector, map_safe()
will return whether or not the specified function can be applied to each element.
map_safe(iris$Sepal.Length, log)
map_safe()
and map_safe_merge()
can be used based on the user's preference and how they want to use the output.
Another function within the msafer package is the check_match()
function, which checks whether a condition supplied by the user is met anywhere in the dataset. If at least one row satisfies the condition, the function returns TRUE; otherwise it returns FALSE.
# when working with one dataframe
check_match(dplyr::starwars, hair_color == "brown")
check_match(dplyr::starwars, height == 0)
We can see that starwars
does contain a character with brown hair color, but there's no character with a height of 0.
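A check of this kind can be approximated in base R. The helper below, `has_match()`, is hypothetical and only sketches the idea under the assumption that the condition is an unquoted expression evaluated against the data frame's columns. It is not the source of check_match(), and it uses `mtcars` so that it runs without dplyr.

```r
# Hypothetical helper (not part of msafer): does any row satisfy `condition`?
has_match <- function(data, condition) {
  # Evaluate the unquoted condition using the data frame's columns.
  rows <- eval(substitute(condition), data, parent.frame())
  # TRUE if at least one row satisfies the condition; NA values are ignored.
  any(rows, na.rm = TRUE)
}

has_match(mtcars, cyl == 4)  # TRUE: mtcars has four-cylinder cars
has_match(mtcars, hp == 0)   # FALSE: no car has zero horsepower
```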
You can also use map_safe()
in conjunction with check_match()
.
map_safe(sample_list, check_match, height == 0)
The flagship function of msafer, map_safe()
, can identify on which files errors occur when applying map()
to a vector.