knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
There are already a couple of packages to parse JSON (e.g. rjson
or jsonlite
). But for me they all had some drawbacks when working with JSON columns from a database:
RJSONIO
and rjson
silently use only the first element of the input; jsonlite
collapses the input, producing invalid JSON and then complains about trailing garbage.NA
and NULL
input produce unclear errors and one cannot control how to handle them.jsontools
relies on jsonlite
for parsing from JSON and conversion to JSON. For parsing it has the following functions:
parse_json()
to parse a character vector of length 1 (i.e. a scalar).parse_json_vector()
to parse a character vector of any length.As an example assume we got the following JSON vector
library(jsontools) customer_infos <- json2(c( '{"name": "Peter", "age": 19, "premium": true}', '{"name": "Daniel", "age": 41}', NA, '{"name": "Pablo", "age": 27, "premium": false}' )) customer_infos
By default the NA
produces an error
parse_json_vector(customer_infos)
but with the .na
argument we can specify which value should be used instead:
parse_json_vector(customer_infos, .na = NULL) %>% str()
Note that the default for simplifyDataFrame
and simplifyMatrix
has been changed to FALSE
.
While they can be quite nice for interactive parsing of JSON they do not allow you to control how the result should look like.
For a more controlled conversion of the resulting list to a data frame have a look at the tibblify package
.
For conversion there are the following functions:
format_json()
: convert an R object to JSON.format_json_vector()
: convert each element of an R object to JSON and return them as a JSON vector.format_json_rowwise()
: convert each row of a data frame to JSON.write_json()
: convert an R object to JSON and write it to disk.Let's look at our customer informations from above. Assume we have converted it into a tibble:
customer_infos_df <- tibble::tibble( name = c("Peter", "Daniel", NA, "Pablo"), age = c(19L, 41L, NA, 27L), premium = c(TRUE, NA, NA, FALSE) )
and filled the missing fields:
customer_infos_df[3, ] <- tibble::tibble( name = "Michael", age = 38, premium = FALSE ) customer_infos_df$premium[2] <- TRUE customer_infos_df
We can now easily convert it back into a JSON vector
format_json_rowwise(customer_infos_df)
Compare this to format_json()
which turns the data frame into an array of
objects:
format_json(customer_infos_df)
To format every element of a list to JSON instead of the whole list at once use
format_json_list()
:
customer_infos_list <- parse_json_vector(customer_infos, .na = NULL) str(customer_infos_list) format_json_list(customer_infos_list)
Notice that each value is wrapped in a JSON array? If you do not want this you have three possibilities:
json_u()
on the elements you do not want in an array.auto_unbox = TRUE
to only wrap elements of length > 1 in an array.While auto_unbox = TRUE
is quite handy you should only use it in non-interactive scripts as otherwise the output JSON might change the format.
For example we a value customers
in a list. Usually, customers is a vector of length > 1.
x <- list(customers = 1:2) format_json(x, auto_unbox = TRUE)
But in some rare cases there is only a single customer
x <- list(customers = 1) format_json(x, auto_unbox = TRUE)
and instead of an array we get a scalar integer at the key "customers".
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.