Overview supported structures

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(tibblify)

Supported input for tibblify()

The idea of tibblify() is to make it easier and more robust to convert lists of lists into tibbles. This is a typical task after receiving API responses in JSON format. The following provides an overview which kind of R objects are supported and the JSON they correspond to.

Scalars

There are 4 basic types of scalars coming from JSON: boolean, integer, float, string. In R there are not really scalars but only vectors of length 1.

:::: {style="display: grid; grid-template-columns: 1fr 1fr; grid-column-gap: 10px; width: 100%;"}

::: {}

true
1
1.5
"a"

:::

::: {}

TRUE
1
1.5
"a"

:::

::::

Other R vectors without JSON equivalent are also supported as long as they:

Examples are Date or POSIXct.

In general a scalar can be parsed with tib_scalar(). There are some special functions for common types:

Vectors

A homogeneous JSON array is an array of scalar where each scalar has the same type. In R they correspond to a logical(), integer(), double() or character() vector:

:::: {style="display: grid; grid-template-columns: 1fr 1fr; grid-column-gap: 10px; width: 100%;"}

::: {}

[true, null, false]
[1, null, 3]
[1.5, null, 3.5]
["a", null, "c"]

:::

::: {}

c(TRUE, NA, FALSE)
c(1L, NA, 2L)
c(1.5, NA, 2.5)
c("a", NA, "c")

:::

::::

As for scalars other types are also supported as long as they are a vector in the vctrs definition.

They can be parsed with tib_vector(). As for scalars there are shortcuts for some common types, e.g. tib_lgl_vec().

Empty lists

A special case are empty lists list(). They might appear when parsing an empty JSON array:

x_json <- '[
  {"a": [1, 2]},
  {"a": []}
]'

x <- jsonlite::fromJSON(x_json, simplifyDataFrame = FALSE)
str(x)

By default they are not supported but produce an error:

tibblify(x, tspec_df(tib_int_vec("a")))

Use vector_allows_empty_list = TRUE in tspec_*() so that they are converted to an empty vector instead:

tibblify(x, tspec_df(tib_int_vec("a"), vector_allows_empty_list = TRUE))$a

Homogeneous R lists of scalars

When using jsonlite::fromJSON(simplifyVector = FALSE) to parse JSON to an R object one does not get R vectors but homogeneous lists of scalars:

x_json <- '[
  {"a": [1, 2]},
  {"a": [1, 2, 3]}
]'

x <- jsonlite::fromJSON(x_json, simplifyVector = FALSE)
str(x)

By default they cannot be parsed with tib_vector()

tibblify(x, tspec_df(tib_int_vec("a")))

Use input_form = "scalar_list" in tib_vector() to parse them:

tibblify(x, tspec_df(tib_int_vec("a", input_form = "scalar_list")))$a

Homogeneous JSON objects of scalars

Sometimes vectors are encoded as objects in JSON:

x_json <- '[
  {"a": {"x": 1, "y": 2}},
  {"a": {"a": 1, "b": 2, "b": 3}}
]'

x <- jsonlite::fromJSON(x_json, simplifyVector = FALSE)
str(x)

Use input_form = "object" in tib_vector() to parse them. To actually store the names use the names_to and values_to argument:

spec <- tspec_df(
  tib_int_vec(
    "a",
    input_form = "object",
    names_to = "name",
    values_to = "value"
  )
)

tibblify(x, spec)$a

Varying

Lists where elements do not have a common type but vary. For example:

:::: {style="display: grid; grid-template-columns: 1fr 1fr; grid-column-gap: 10px; width: 100%;"}

::: {}

[1, "a", true]

:::

::: {}

list(1, "a", TRUE)

:::

::::

can be parsed with tib_variant().

Object

The R equivalent to a JSON object is a named list where the names fulfill the requirements of vctrs::vec_as_names(repair = "check_unique").

:::: {style="display: grid; grid-template-columns: 1fr 1fr; grid-column-gap: 10px; width: 100%;"}

::: {}

{
  "a": 1,
  "b": true
}

:::

::: {}

x <- list(
  a = 1,
  b = TRUE
)

:::

::::

They can be parsed with tib_row(). For example

x <- list(
  list(row = list(a = 1, b = TRUE)),
  list(row = list(a = 2, b = FALSE))
)

spec <- tspec_df(
  tib_row(
    "row",
    tib_int("a"),
    tib_lgl("b")
  )
)

tibblify(x, spec)

Data Frames

List of objects

:::: {style="display: grid; grid-template-columns: 1fr 1fr; grid-column-gap: 10px; width: 100%;"}

::: {}

[
  {"a": 1, "b": true},
  {"b": 2, "b": false}
]

:::

::: {}

x <- list(
  list(a = 1, b = TRUE),
  list(a = 2, b = FALSE)
)

:::

::::

They can be parsed with tib_df().

Object of objects

A special form are named lists of object. In JSON they are represented as objects where each element is an object.

:::: {style="display: grid; grid-template-columns: 1fr 1fr; grid-column-gap: 10px; width: 100%;"}

::: {}

{
  "object1": {"a": 1, "b": true},
  "object2": {"b": 2, "b": false}
}

:::

::: {}

x <- list(
  object1 = list(a = 1, b = TRUE),
  object2 = list(a = 2, b = FALSE)
)

:::

::::

They are also parsed with tib_df() but you can parse the names into an extra column via the .names_to argument:

x_json <- '[
{
  "df": {
    "object1": {"a": 1, "b": true},
    "object2": {"a": 2, "b": false}
  }
}]'

x <- jsonlite::fromJSON(x_json, simplifyDataFrame = FALSE)

spec <- tspec_df(
  tib_df(
    "df",
    tib_int("a"),
    tib_lgl("b"),
    .names_to = "name"
  )
)

tibblify(x, spec)$df

Column major format

The column major format is also supported

:::: {style="display: grid; grid-template-columns: 1fr 1fr; grid-column-gap: 10px; width: 100%;"}

::: {}

{
  "a": [1, 2],
  "b": [true, false]
}

:::

::: {}

x <- list(
  a = c(1, 2),
  b = c(TRUE, FALSE)
)

:::

::::

via .input_form = "colmajor" in tspec_*():

df_spec <- tspec_df(
  tib_int("a"),
  tib_lgl("b"),
  .input_form = "colmajor"
)

tibblify(x, df_spec)


Try the tibblify package in your browser

Any scripts or data that you put into this service are public.

tibblify documentation built on Nov. 16, 2022, 5:07 p.m.