Read and manipulate a tabular-data-resource

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(fr)

The {fr} package comes with an example frictionless tabular-data-resource (tdr) named hamilton_poverty_2020. On disk, a tdr is a folder containing a CSV data file (the folder and the file are both named after the tdr) and a tabular-data-resource.yaml file, which contains the metadata descriptors:

fs::dir_tree(fs::path_package("fr", "hamilton_poverty_2020"), recurse = TRUE)

Read the hamilton_poverty_2020 tdr into R by specifying the location of either the tabular-data-resource.yaml file or a folder containing one:

d_fr <- read_fr_tdr(fs::path_package("fr", "hamilton_poverty_2020"))

Print the returned fr_tdr (frictionless tabular-data-resource) object to view all of the table-specific metadata descriptors and the underlying data:

d_fr

Print the schema property to view the table-specific metadata:

S7::prop(d_fr, "schema")

fr_tdr objects can be used almost anywhere the underlying data frame can be used. Most functions coerce their inputs into data frames with as.data.frame(), which works with fr_tdr objects:

lm(fraction_poverty ~ year, data = d_fr)
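The coercion described above can be illustrated without {fr} at all. The following is a minimal base R sketch, assuming a made-up S3 class named toy_wrapper (hypothetical, not part of {fr}) that stores a data frame alongside some metadata. It is meant only to show why a non-data-frame object with an as.data.frame() method can be passed to lm(); it is not how fr_tdr objects are implemented:

```r
# Toy stand-in for a metadata-carrying object; "toy_wrapper" is a hypothetical
# class invented for this sketch and is not part of {fr}.
wrapped <- structure(
  list(
    data = data.frame(x = 1:10, y = 2 * (1:10) + 1),
    description = "toy metadata"
  ),
  class = "toy_wrapper"
)

# Register an as.data.frame() method that drops the metadata and returns the data.
registerS3method(
  "as.data.frame", "toy_wrapper",
  function(x, ...) x$data
)

# lm() is not handed a data frame here, but model.frame() coerces the classed
# object with as.data.frame() before evaluating the formula.
fit <- lm(y ~ x, data = wrapped)
coef(fit)
```

Functions that never call as.data.frame() on their inputs do not get this coercion for free, which is why some operations need an explicit conversion.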

Accessor functions ([, [[, $) work as they do with data frames and tibbles:

head(d_fr$fraction_poverty)

In some cases, an fr_tdr object must be split into data and metadata so that the data can be manipulated and the metadata reattached afterward. For example, calling dplyr::mutate() directly on an fr_tdr object results in an error:

#| error: true
d_fr |>
  dplyr::mutate(high_poverty = fraction_poverty > median(fraction_poverty))

In this case, explicitly convert the fr_tdr object to a tibble or data frame with as_tibble(), as_data_frame(), or as.data.frame(), which drops the metadata attributes. After manipulating the data, call as_fr_tdr() with the original fr_tdr object as a template to convert the result back to an fr_tdr object:

d_fr |>
  tibble::as_tibble() |>
  dplyr::mutate(high_poverty = fraction_poverty > median(fraction_poverty)) |>
  as_fr_tdr(.template = d_fr)

Shortcuts are provided for some functions from {dplyr} (see dplyr_methods() for a full list).

d_fr |>
  fr_mutate(high_poverty = fraction_poverty > median(fraction_poverty)) |>
  fr_select(-year) |>
  fr_arrange(desc(fraction_poverty))

More complicated dplyr functions (e.g., group_by() and friends) as well as functions from other packages that do not coerce their inputs to data.frame objects will need to use the pattern above. Below is a simple example for dplyr::left_join():

library(dplyr, warn.conflicts = FALSE)

d_fr <- update_field(d_fr, "fraction_poverty", description = "the poverty fraction")

d_extant <-
  d_fr |>
  fr_mutate(score = 1 + fraction_poverty) |>
  fr_select(-fraction_poverty, -year) |>
  as_tibble()

d_fr_new <-
  left_join(
    as_tibble(d_fr),
    d_extant,
    by = join_by(census_tract_id_2020 == census_tract_id_2020)
  ) |>
  as_fr_tdr(.template = d_fr) |>
  update_field("score", description = "the score")

d_fr_new

S7::prop(d_fr_new, "schema")
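The same convert, manipulate, and convert-back pattern should also cover the grouped operations mentioned earlier. The sketch below assumes that as_fr_tdr() accepts the ungrouped result in the same way it accepts the output of dplyr::mutate() in the earlier example; the column name above_year_median is made up for illustration:

```r
library(fr)

d_fr <- read_fr_tdr(fs::path_package("fr", "hamilton_poverty_2020"))

d_grouped <-
  d_fr |>
  # drop the metadata so that dplyr verbs can operate on a plain tibble
  tibble::as_tibble() |>
  dplyr::group_by(year) |>
  # hypothetical new column: compare each tract to the median within its year
  dplyr::mutate(above_year_median = fraction_poverty > median(fraction_poverty)) |>
  # remove the grouping before reattaching the metadata
  dplyr::ungroup() |>
  as_fr_tdr(.template = d_fr)

d_grouped
```

Ungrouping before calling as_fr_tdr() keeps the result a plain tibble, matching the input used in the earlier examples.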

fr documentation built on May 29, 2024, 8:35 a.m.