knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Overview

This article shows how to use, from R, the data provided by the Vega datasets Python package.

Here's the short version:

Importing

In the Altair documentation, you will see this code used often:

from vega_datasets import data

cars = data.cars()

The Altair convention is to use the name data to refer to the data object in the vega_datasets package. The altair R package offers a similar convention:

library("altair")

vega_data <- import_vega_data()

cars <- vega_data$cars()

The difference is the name: our convention is to call this object vega_data rather than data.

Accessing

Our vega_data object has a method to list all its datasets:

vega_data$list_datasets() %>% head()

Each dataset can be accessed using a method whose name is an element returned from vega_data$list_datasets().

library("tibble")

vega_data$anscombe() %>% as_tibble()

Keep in mind that a dataset name can contain a hyphen, which is not valid in a Python attribute name. Where you see a - in a dataset name, such as in the output of vega_data$list_datasets(), use a _ in the name of the method. For example, the dataset listed as sf-temps is accessed in R using:

vega_data$sf_temps() %>% as_tibble()
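
As a minimal sketch of this naming rule, the base R helper below converts dataset names, as listed, into the corresponding method names. The helper dataset_method_name() and the hard-coded vector of names are illustrative assumptions, not part of the altair package; in practice the names would come from vega_data$list_datasets().

```r
# Hypothetical helper (not part of altair): map a dataset name, as listed,
# to the name of the method used to access it from R.
dataset_method_name <- function(name) {
  # fixed = TRUE treats "-" as a literal character, not a regular expression
  gsub("-", "_", name, fixed = TRUE)
}

# Illustrative names (assumption); in practice, use vega_data$list_datasets()
listed_names <- c("anscombe", "cars", "sf-temps")

method_names <- dataset_method_name(listed_names)
print(method_names)
```

Names without a hyphen pass through unchanged, so the helper can be applied to the whole list at once.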

Metadata

Each dataset has some metadata, such as a description and references.

wrapcat <- function(x) {
  x %>% strwrap() %>% cat(sep = "\n")
}

vega_data$anscombe$description %>% wrapcat()
vega_data$anscombe$references %>% wrapcat()

Some of the datasets are stored locally as part of the Vega datasets Python package; others are retrieved from a remote URL. The method that returns the data, e.g. vega_data$anscombe(), handles either case for you. You can use the is_local property to find out which case applies to a dataset.

vega_data$anscombe$is_local

Each dataset has a remote URL, which you can use instead of a data frame in any Altair data argument.

vega_data$anscombe$url

Using a URL

You can specify data using a URL that points to a dataset, rather than using a data frame explicitly.

cars_url <- vega_data$cars$url

chart_cars <- 
  alt$Chart(cars_url)$
  encode(
    x = "Weight_in_lbs:Q",
    y = "Miles_per_Gallon:Q",
    color = "Cylinders:N"
  )$
  mark_point()

chart_cars

This works in your browser, but might not work in the RStudio IDE. For security reasons, the RStudio IDE may not let you refer to external URLs unless they are on its allow-list (which includes sites such as YouTube and Vimeo). If you open the chart in a browser, it works just fine, as long as you have access to the internet.
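
To see why the chart needs internet access, it can help to look at what a URL-based chart specification contains. The sketch below builds a simplified, hand-written R list that mirrors the general shape of a Vega-Lite specification: the data entry holds only a URL, so the renderer has to fetch the dataset itself when it draws the chart. The URL is a placeholder, the helper expand_shorthand() is hypothetical, and the list is an illustration rather than the exact output of chart_cars; the real specification contains additional fields.

```r
# Placeholder URL (assumption); in practice, use vega_data$cars$url
cars_url <- "https://example.com/cars.json"

# Hypothetical helper: expand encoding shorthand such as "Cylinders:N"
# into a field name and a full type name.
expand_shorthand <- function(shorthand) {
  type_names <- c(
    Q = "quantitative",
    N = "nominal",
    O = "ordinal",
    T = "temporal"
  )
  parts <- strsplit(shorthand, ":", fixed = TRUE)[[1]]
  list(field = parts[1], type = unname(type_names[parts[2]]))
}

# Simplified stand-in for a chart specification: note that the data entry
# carries a URL only, not the rows of the dataset.
spec_sketch <- list(
  data = list(url = cars_url),
  mark = "point",
  encoding = list(
    x = expand_shorthand("Weight_in_lbs:Q"),
    y = expand_shorthand("Miles_per_Gallon:Q"),
    color = expand_shorthand("Cylinders:N")
  )
)

str(spec_sketch)
```

Because the specification carries a reference instead of the data, it stays small, at the cost of requiring that the environment displaying the chart is allowed to reach the URL.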



vegawidget/altair documentation built on Feb. 3, 2024, 7:47 p.m.