knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(ldf)

Linked data uses the Resource Description Framework (RDF) to identify resources with Uniform Resource Identifiers (URIs) and to describe them with a set of statements, each specifying the value of a given property for the resource. We can represent this in R using a character vector of URIs together with a data frame of descriptions. That data frame should include a uri column identifying the resource described in each row.

uris <- c("http://example.net/id/apple",
          "http://example.net/id/banana",
          "http://example.net/id/carrot")
labels <- c("Apple","Banana","Carrot")
descriptions <- data.frame(uri=uris, label=labels)

food <- resource(uris, descriptions)

The resource() constructor returns an ldf_resource object that has a variety of methods defined on it, including one for the format() generic, which lets us show the labels instead of the URIs when printing to the console.

(picnic <- data.frame(food=food, quantity=c(3,2,0)))

The contents of the vector itself can differ from the attached descriptions. This allows you to repeat values without needing to duplicate descriptions:

(kitchen <- data.frame(
  dish=c("Fruit Salad", "Fruit Salad",
         "Carrot Salad", "Carrot Salad"),
  food=resource(c("http://example.net/id/apple", "http://example.net/id/banana",
                  "http://example.net/id/apple", "http://example.net/id/carrot"),
                descriptions),
  quantity=c(2,2,1,3)))
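The idea of repeating values against a single set of descriptions can be sketched in base R. This is only an illustration of the lookup and does not use ldf or claim to show how the package works internally; the variable names (values, rows, looked_up) are made up for the example.

```r
# Base-R illustration only: ldf's internals may differ.
descriptions <- data.frame(
  uri = c("http://example.net/id/apple",
          "http://example.net/id/banana",
          "http://example.net/id/carrot"),
  label = c("Apple", "Banana", "Carrot"),
  stringsAsFactors = FALSE
)

# A vector of URIs in which the apple URI appears twice
values <- c("http://example.net/id/apple", "http://example.net/id/banana",
            "http://example.net/id/apple", "http://example.net/id/carrot")

# match() finds the description row for each element of the vector,
# so each description is stored once however often its URI occurs
rows <- match(values, descriptions$uri)
looked_up <- descriptions$label[rows]
looked_up
```

The description table keeps three rows even though the vector has four elements, which is the property the kitchen example relies on.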

The underlying identity of each resource in the vector can be retrieved with uri():

uri(food)

There's also the curie() function for retrieving URIs compacted with prefixes (see also: default_prefixes()):

curie(food, prefixes=c(food="http://example.net/id/"))
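For readers unfamiliar with compact URIs, the general idea is to replace a known namespace with a short prefix and a colon. The following is a rough base-R sketch of that idea; compact_uri() is a hypothetical helper written for this example, not part of ldf, and curie() may handle edge cases differently.

```r
# Hypothetical helper, not part of ldf: compact URIs using named prefixes.
compact_uri <- function(uris, prefixes) {
  out <- uris
  for (name in names(prefixes)) {
    namespace <- prefixes[[name]]
    # Which URIs begin with this namespace?
    hit <- startsWith(uris, namespace)
    if (any(hit)) {
      # Keep only the part after the namespace and put "name:" in front
      local_part <- substring(uris[hit], nchar(namespace) + 1)
      out[hit] <- paste0(name, ":", local_part)
    }
  }
  out
}

compacted <- compact_uri(
  c("http://example.net/id/apple", "http://other.example/x"),
  prefixes = c(food = "http://example.net/id/")
)
compacted
```

URIs that match no prefix are returned unchanged in this sketch.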

You can retrieve the descriptions with description():

description(food)

In order to access individual properties from the resource's description, you can use property():

property(food, "label")

The second argument is the name of a column from the description.

Since label is such a commonly used property, there's also a function provided for it: label(). You can use these functions to perform operations on resources in terms of their descriptions:

food[label(food) == "Apple"]

We use the label to pretty-print linked data frames with format.ldf_resource():

format(kitchen)

Because the base type of resources is character, R will tend to dispatch on this basis. To prevent functions from base R (or other packages that aren't expecting any novel S3 vectors) from misinterpreting resources, we have as.character() return the URI and not the resource's label:

as.character(food)
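The split between what is displayed and what is stored can be shown with a toy S3 class in base R. This is a sketch under assumptions: toy_resource and its labels attribute are invented for the example and are not how ldf_resource is defined.

```r
# Toy class for illustration only; "toy_resource" is not an ldf class.
toy <- structure(
  c("http://example.net/id/apple", "http://example.net/id/banana"),
  labels = c("Apple", "Banana"),
  class = "toy_resource"
)

# A format() method that shows labels instead of the stored URIs
format.toy_resource <- function(x, ...) attr(x, "labels")

shown <- format(toy)         # dispatches to the method above
stored <- as.character(toy)  # no method defined: the underlying URIs

shown
stored
```

Code that only knows about character vectors sees the stored values, which is why functions that are unaware of the class end up working with URIs.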

This can be unexpected. The table() function, for example, returns counts by URI:

table(kitchen$food)

To count by label instead, call table() on the labels:

table(label(kitchen$food))

You can convert a linked data frame back into a "normal" data frame (i.e. one not containing vectors of RDF resources) using the as_dataframe_of_labels() function. This converts RDF resources into their labels:

kitchen_labels <- as_dataframe_of_labels(kitchen)

str(kitchen_labels)


Swirrl/linked-data-frames documentation built on Sept. 14, 2022, 6:15 p.m.