jqr

knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE,
  warning = FALSE,
  message = FALSE
)

R-CMD-check codecov cran checks rstudio mirror downloads cran version

R interface to jq, a JSON processor http://jqlang.github.io/jq/

jqr makes it easy to process large amounts of json without having to convert from json to R, or without using regular expressions. This means that the eventual loading into R can be quicker.

Quickstart Tutorial

The jq command line examples from the jq tutorial work exactly the same in R!

library(curl)
library(jqr)
curl('https://api.github.com/repos/ropensci/jqr/commits?per_page=5') %>%
  jq('.[] | {message: .commit.message, name: .commit.committer.name}')

Try running some of the other examples.

Installation

Binary packages for OS-X or Windows can be installed directly from CRAN:

install.packages("jqr")

Installation from source on Linux or OSX requires libjq. On Ubuntu 14.04 and 16.04 lower use libjq-dev from Launchpad:

sudo add-apt-repository -y ppa:cran/jq
sudo apt-get update -q
sudo apt-get install -y libjq-dev

More recent Debian or Ubuntu install libjq-dev directly from Universe:

sudo apt-get install -y libjq-dev

On Fedora we need jq-devel:

sudo yum install jq-devel
````

On __CentOS / RHEL__ we install [jq-devel](https://apps.fedoraproject.org/packages/jq-devel) via EPEL:

sudo yum install epel-release sudo yum install jq-devel

On __OS-X__ use [jq](https://github.com/Homebrew/homebrew-core/blob/master/Formula/jq.rb) from Homebrew:

brew install jq

On __Solaris__ we can have [libjq_dev](https://www.opencsw.org/packages/libjq_dev) from [OpenCSW](https://www.opencsw.org/):

pkgadd -d http://get.opencsw.org/now /opt/csw/bin/pkgutil -U /opt/csw/bin/pkgutil -y -i libjq_dev

```r
library(jqr)

Interfaces

low level

There's a low level interface in which you can execute jq code just as you would on the command line:

str <- '[{
    "foo": 1,
    "bar": 2
  },
  {
    "foo": 3,
    "bar": 4
  },
  {
    "foo": 5,
    "bar": 6
}]'
jq(str, ".[]")
jq(str, "[.[] | {name: .foo} | keys]")

Note that we print the output to look like a valid JSON object to make it easier to look at. However, it's a simple character string or vector of strings. A trick you can do is to wrap your jq program in brackets like [.[]] instead of .[], e.g.,

jq(str, ".[]") %>% unclass
# vs.
jq(str, "[.[]]") %>% unclass

Combine many jq arguments - they are internally combined with a pipe |

(note how these are identical)

jq(str, ".[] | {name: .foo} | keys")
jq(str, ".[]", "{name: .foo}", "keys")

Also accepts many JSON inputs now

jq("[123, 456]   [77, 88, 99]", ".[]")
jq('{"foo": 77} {"bar": 45}', ".[]")
jq('[{"foo": 77, "stuff": "things"}] [{"bar": 45}] [{"n": 5}]', ".[] | keys")

# if you have jsons in a vector
jsons <- c('[{"foo": 77, "stuff": "things"}]', '[{"bar": 45}]', '[{"n": 5}]')
jq(paste0(jsons, collapse = " "), ".[]")

high level

The other is higher level, and uses a suite of functions to construct queries. Queries are constucted, then excuted internally with jq() after the last piped command.

You don't have to use pipes though. See examples below.

Examples:

Index

x <- '[{"message": "hello", "name": "jenn"}, {"message": "world", "name": "beth"}]'
x %>% index()

Sort

'[8,3,null,6]' %>% sortj

reverse order

'[1,2,3,4]' %>% reverse

Show the query to be used using peek()

'[1,2,3,4]' %>% reverse %>% peek

get multiple outputs for array w/ > 1 element

x <- '{"user":"jqlang","titles":["JQ Primer", "More JQ"]}'
jq(x, '{user, title: .titles[]}')
x %>% index()
x %>% build_object(user, title = `.titles[]`)
jq(x, '{user, title: .titles[]}') %>% jsonlite::toJSON() %>% jsonlite::validate()

string operations

join

'["a","b,c,d","e"]' %>% join
'["a","b,c,d","e"]' %>% join(`;`)

ltrimstr

'["fo", "foo", "barfoo", "foobar", "afoo"]' %>% index() %>% ltrimstr(foo)

rtrimstr

'["fo", "foo", "barfoo", "foobar", "foob"]' %>% index() %>% rtrimstr(foo)

startswith

'["fo", "foo", "barfoo", "foobar", "barfoob"]' %>% index %>% startswith(foo)
'["fo", "foo"] ["barfoo", "foobar", "barfoob"]' %>% index %>% startswith(foo)

endswith

'["fo", "foo", "barfoo", "foobar", "barfoob"]' %>% index %>% endswith(foo)

tojson, fromjson, tostring

'[1, "foo", ["foo"]]' %>% index
'[1, "foo", ["foo"]]' %>% index %>% tostring
'[1, "foo", ["foo"]]' %>% index %>% tojson
'[1, "foo", ["foo"]]' %>% index %>% tojson %>% fromjson

contains

'"foobar"' %>% contains("bar")

unique

'[1,2,5,3,5,3,1,3]' %>% uniquej

filter

With filtering via select() you can use various operators, like ==, &&, ||. We translate these internally for you to what jq wants to see (==, and, or).

Simple, one condition

'{"foo": 4, "bar": 7}' %>% select(.foo == 4)

More complicated. Combine more than one condition; combine each individual filtering task in parentheses

x <- '{"foo": 4, "bar": 2} {"foo": 5, "bar": 4} {"foo": 8, "bar": 12}'
x %>% select((.foo < 6) && (.bar > 3))
x %>% select((.foo < 6) || (.bar > 3))

types

get type information for each element

'[0, false, [], {}, null, "hello"]' %>% types
'[0, false, [], {}, null, "hello", true, [1,2,3]]' %>% types

select elements by type

'[0, false, [], {}, null, "hello"]' %>% index() %>% type(booleans)

key operations

get keys

str <- '{"foo": 5, "bar": 7}'
str %>% keys()

delete by key name

str %>% del(bar)

check for key existence

str3 <- '[[0,1], ["a","b","c"]]'
str3 %>% haskey(2)
str3 %>% haskey(1,2)

Build an object, selecting variables by name, and rename

'{"foo": 5, "bar": 7}' %>% build_object(a = .foo)

More complicated build_object(), using the included dataset commits

commits %>%
  index() %>%
  build_object(sha = .sha, name = .commit.committer.name)

Maths

'{"a": 7}' %>%  do(.a + 1)
'{"a": [1,2], "b": [3,4]}' %>%  do(.a + .b)
'{"a": [1,2], "b": [3,4]}' %>%  do(.a - .b)
'{"a": 3}' %>%  do(4 - .a)
'["xml", "yaml", "json"]' %>%  do('. - ["xml", "yaml"]')
'5' %>%  do(10 / . * 3)

comparisons

'[5,4,2,7]' %>% index() %>% do(. < 4)
'[5,4,2,7]' %>% index() %>% do(. > 4)
'[5,4,2,7]' %>% index() %>% do(. <= 4)
'[5,4,2,7]' %>% index() %>% do(. >= 4)
'[5,4,2,7]' %>% index() %>% do(. == 4)
'[5,4,2,7]' %>% index() %>% do(. != 4)

length

'[[1,2], "string", {"a":2}, null]' %>% index %>% lengthj

sqrt

'9' %>% sqrtj

floor

'3.14159' %>% floorj

find minimum

'[5,4,2,7]' %>% minj
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% minj
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% minj(foo)
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% minj(bar)

find maximum

'[5,4,2,7]' %>% maxj
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% maxj
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% maxj(foo)
'[{"foo":1, "bar":14}, {"foo":2, "bar":3}]' %>% maxj(bar)

Combine into valid JSON

jq sometimes creates pieces of JSON that are valid in themselves, but together are not. combine() is a way to make valid JSON.

This outputs a few pieces of JSON

(x <- commits %>%
  index() %>%
  build_object(sha = .sha, name = .commit.committer.name))

Use combine() to put them together.

combine(x)

Meta

rofooter



ropensci/jqr documentation built on Jan. 19, 2024, 8:33 p.m.