knitr::opts_chunk$set( comment = "#>", collapse = TRUE, warning = FALSE, message = FALSE )
R interface to jq, a JSON processor http://jqlang.github.io/jq/
`jqr` makes it easy to process large amounts of JSON without having to convert from JSON to R or resort to regular expressions. This means that the eventual loading into R can be quicker.
The `jq` command line examples from the jq tutorial work exactly the same in R!

```r
library(curl)
library(jqr)
curl('https://api.github.com/repos/ropensci/jqr/commits?per_page=5') %>%
  jq('.[] | {message: .commit.message, name: .commit.committer.name}')
```

Try running some of the other examples.
Binary packages for macOS or Windows can be installed directly from CRAN:

```r
install.packages("jqr")
```

Installation from source on Linux or macOS requires `libjq`. On Ubuntu 16.04 and lower (such as 14.04), use libjq-dev from Launchpad:

```
sudo add-apt-repository -y ppa:cran/jq
sudo apt-get update -q
sudo apt-get install -y libjq-dev
```

On more recent Debian or Ubuntu releases, install libjq-dev directly from Universe:

```
sudo apt-get install -y libjq-dev
```

On Fedora we need jq-devel:

```
sudo yum install jq-devel
```

On __CentOS / RHEL__ we install [jq-devel](https://apps.fedoraproject.org/packages/jq-devel) via EPEL:

```
sudo yum install epel-release
sudo yum install jq-devel
```

On __macOS__ use [jq](https://github.com/Homebrew/homebrew-core/blob/master/Formula/jq.rb) from Homebrew:

```
brew install jq
```

On __Solaris__ we can install [libjq_dev](https://www.opencsw.org/packages/libjq_dev) from [OpenCSW](https://www.opencsw.org/):

```
pkgadd -d http://get.opencsw.org/now
/opt/csw/bin/pkgutil -U
/opt/csw/bin/pkgutil -y -i libjq_dev
```

```r
library(jqr)
```

There's a low level interface in which you can execute `jq` code just as you would on the command line:

```r
str <- '[{ "foo": 1, "bar": 2 }, { "foo": 3, "bar": 4 }, { "foo": 5, "bar": 6 }]'
jq(str, ".[]")
jq(str, "[.[] | {name: .foo} | keys]")
```

Note that the output is printed to look like a valid JSON object so that it is easier to read; underneath, it is a plain character string or a vector of strings. A useful trick is to wrap your jq program in brackets, `[.[]]` instead of `.[]`, so that the results come back as a single JSON array, e.g.,

```r
jq(str, ".[]") %>% unclass
# vs.
jq(str, "[.[]]") %>% unclass
```

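To make the difference concrete, the following minimal sketch compares how many strings each form returns. It relies only on `jq()` returning one string per jq output, as noted above; the variable names are illustrative.

```r
library(jqr)

# Same sample input as in the low level examples
json <- '[{ "foo": 1, "bar": 2 }, { "foo": 3, "bar": 4 }, { "foo": 5, "bar": 6 }]'

# `.[]` emits each array element separately: one string per element
stream <- unclass(jq(json, ".[]"))
length(stream)
#> [1] 3

# `[.[]]` collects the elements back into one array: a single string
wrapped <- unclass(jq(json, "[.[]]"))
length(wrapped)
#> [1] 1
```

The wrapped form is usually the more convenient one to hand to a JSON parser, since it is a single valid JSON document.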
You can pass many jq arguments; they are internally combined with a pipe `|` (note how these two calls are identical):

```r
jq(str, ".[] | {name: .foo} | keys")
jq(str, ".[]", "{name: .foo}", "keys")
```

`jq()` also accepts multiple JSON inputs:

```r
jq("[123, 456] [77, 88, 99]", ".[]")
jq('{"foo": 77} {"bar": 45}', ".[]")
jq('[{"foo": 77, "stuff": "things"}] [{"bar": 45}] [{"n": 5}]', ".[] | keys")

# if you have jsons in a vector
jsons <- c('[{"foo": 77, "stuff": "things"}]', '[{"bar": 45}]', '[{"n": 5}]')
jq(paste0(jsons, collapse = " "), ".[]")
```

The other interface is higher level, and uses a suite of functions to construct queries. Queries are constructed, then executed internally with `jq()` after the last piped command.
You don't have to use pipes though. See examples below.
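As a minimal sketch of the pipe-free style, the high level functions can be nested as ordinary calls. This assumes that each function accepts the JSON string as its first argument and that passing the resulting query object to `jq()` executes it; treat it as an illustration rather than the definitive usage.

```r
library(jqr)

# Build the query with a plain function call, then run it explicitly
# (assumption: jq() executes a query object built by reverse())
res <- jq(reverse('[1,2,3,4]'))
res

# Calls can be nested instead of chained: index the array first,
# then convert each element to a string
jq(tostring(index('[1, "foo", ["foo"]]')))
```

Calling `jq()` explicitly matters here because the automatic execution described above is tied to the end of a pipe chain.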
Examples:
Index

```r
x <- '[{"message": "hello", "name": "jenn"}, {"message": "world", "name": "beth"}]'
x %>% index()
```

Sort

```r
'[8,3,null,6]' %>% sortj
```

Reverse order

```r
'[1,2,3,4]' %>% reverse
```

Show the query to be used using `peek()`

```r
'[1,2,3,4]' %>% reverse %>% peek
```


```r
x <- '{"user":"jqlang","titles":["JQ Primer", "More JQ"]}'
jq(x, '{user, title: .titles[]}')
x %>% index()
x %>% build_object(user, title = `.titles[]`)
jq(x, '{user, title: .titles[]}') %>% jsonlite::toJSON() %>% jsonlite::validate()
```

join

```r
'["a","b,c,d","e"]' %>% join
'["a","b,c,d","e"]' %>% join(`;`)
```

ltrimstr

```r
'["fo", "foo", "barfoo", "foobar", "afoo"]' %>% index() %>% ltrimstr(foo)
```

rtrimstr

```r
'["fo", "foo", "barfoo", "foobar", "foob"]' %>% index() %>% rtrimstr(foo)
```

startswith

```r
'["fo", "foo", "barfoo", "foobar", "barfoob"]' %>% index %>% startswith(foo)
'["fo", "foo"] ["barfoo", "foobar", "barfoob"]' %>% index %>% startswith(foo)
```

endswith

```r
'["fo", "foo", "barfoo", "foobar", "barfoob"]' %>% index %>% endswith(foo)
```

tojson, fromjson, tostring

```r
'[1, "foo", ["foo"]]' %>% index
'[1, "foo", ["foo"]]' %>% index %>% tostring
'[1, "foo", ["foo"]]' %>% index %>% tojson
'[1, "foo", ["foo"]]' %>% index %>% tojson %>% fromjson
```

contains

```r
'"foobar"' %>% contains("bar")
```

unique

```r
'[1,2,5,3,5,3,1,3]' %>% uniquej
```

With filtering via `select()` you can use various operators, like `==`, `&&`, and `||`. We translate these internally for you to what `jq` wants to see (`==`, `and`, `or`).
Simple, one condition

```r
'{"foo": 4, "bar": 7}' %>% select(.foo == 4)
```

More complicated: combine more than one condition, wrapping each individual condition in parentheses

```r
x <- '{"foo": 4, "bar": 2} {"foo": 5, "bar": 4} {"foo": 8, "bar": 12}'
x %>% select((.foo < 6) && (.bar > 3))
x %>% select((.foo < 6) || (.bar > 3))
```

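For comparison, the same two filters can be written directly against the low level `jq()` interface using jq's own `and` / `or` keywords. This is a sketch of the equivalent jq programs, not a dump of the query that `select()` generates internally; the names `both` and `either` are illustrative.

```r
library(jqr)

x <- '{"foo": 4, "bar": 2} {"foo": 5, "bar": 4} {"foo": 8, "bar": 12}'

# Both conditions must hold: only the second object passes
both <- jq(x, 'select((.foo < 6) and (.bar > 3))')
both

# Either condition may hold: all three objects pass
either <- jq(x, 'select((.foo < 6) or (.bar > 3))')
either
```
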
Get type information for each element

```r
'[0, false, [], {}, null, "hello"]' %>% types
'[0, false, [], {}, null, "hello", true, [1,2,3]]' %>% types
```

Select elements by type

```r
'[0, false, [], {}, null, "hello"]' %>% index() %>% type(booleans)
```

Get keys

```r
str <- '{"foo": 5, "bar": 7}'
str %>% keys()
```

Delete by key name

```r
str %>% del(bar)
```

Check for key existence

```r
str3 <- '[[0,1], ["a","b","c"]]'
str3 %>% haskey(2)
str3 %>% haskey(1,2)
```

Build an object, selecting variables by name, and rename

```r
'{"foo": 5, "bar": 7}' %>% build_object(a = .foo)
```

More complicated `build_object()`, using the included dataset `commits`

```r
commits %>% index() %>% build_object(sha = .sha, name = .commit.committer.name)
```


```r
'{"a": 7}' %>% do(.a + 1)
'{"a": [1,2], "b": [3,4]}' %>% do(.a + .b)
'{"a": [1,2], "b": [3,4]}' %>% do(.a - .b)
'{"a": 3}' %>% do(4 - .a)
'["xml", "yaml", "json"]' %>% do('. - ["xml", "yaml"]')
'5' %>% do(10 / . * 3)
```

Comparisons

```r
'[5,4,2,7]' %>% index() %>% do(. < 4)
'[5,4,2,7]' %>% index() %>% do(. > 4)
'[5,4,2,7]' %>% index() %>% do(. <= 4)
'[5,4,2,7]' %>% index() %>% do(. >= 4)
'[5,4,2,7]' %>% index() %>% do(. == 4)
'[5,4,2,7]' %>% index() %>% do(. != 4)
```

Length

```r
'[[1,2], "string", {"a":2}, null]' %>% index %>% lengthj
```

sqrt

```r
'9' %>% sqrtj
```

floor

```r
'3.14159' %>% floorj
```

Find minimum

```r
'[5,4,2,7]' %>% minj
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% minj
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% minj(foo)
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% minj(bar)
```

Find maximum

```r
'[5,4,2,7]' %>% maxj
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% maxj
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% maxj(foo)
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% maxj(bar)
```

`jq` sometimes creates pieces of JSON that are valid in themselves, but together are not. `combine()` is a way to make valid JSON.
This outputs a few pieces of JSON

```r
(x <- commits %>% index() %>% build_object(sha = .sha, name = .commit.committer.name))
```

Use `combine()` to put them together.

```r
combine(x)
```

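Once the pieces are combined into one valid JSON document, they can be loaded into R in a single step. The sketch below uses a small inline input instead of `commits`, and assumes that `jsonlite` is installed and that `combine()` accepts the output of `jq()`; `as.character()` is used to hand `fromJSON()` a plain string.

```r
library(jqr)
library(jsonlite)

json <- '[{"foo": 1, "bar": 2}, {"foo": 3, "bar": 4}, {"foo": 5, "bar": 6}]'

# Three separate JSON objects, which are not valid as one document
pieces <- jq(json, ".[] | {name: .foo}")

# One JSON array containing the three objects
combined <- combine(pieces)

# Parse the combined JSON into a data frame with one row per object
df <- fromJSON(as.character(combined))
df
```
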
To get citation information for `jqr`, run `citation(package = 'jqr')` in R.