library(rmarkdown)
The vizlab package was created to facilitate the rapid assembly of web-ready visualizations. It provides a framework for the steps common to most visualizations: fetching, processing, and visualizing data, then publishing the result in a format such as HTML. Other features such as Google Analytics are integrated as well. The package itself consists of R functions and scripts that carry out two distinct tasks. First, they create a series of make files, which in turn execute other pieces of R code that carry out the fetch, process, visualize, and publish steps. Code for each step can be supplied by the user and integrated into the visualization through a central YAML control file (viz.yaml). Default methods are included for some common variants of these steps, such as retrieving data from USGS ScienceBase. Ultimately, the package aims to eliminate the technical tasks common to the construction of most visualizations, leaving more time for content creation and other creative tasks.
This vignette outlines all the steps necessary to use the package, using an example visualization that can be found on GitHub at https://github.com/USGS-VIZLAB/example.
Vizlab can be downloaded from GitHub via devtools:

    install.packages('devtools')
    devtools::install_github('USGS-VIZLAB/vizlab')
The example visualization ('viz') is located at https://github.com/USGS-VIZLAB/example and should be cloned into a separate directory.
To build the example viz, set your working directory to the main example directory and load the vizlab package.
The viz.yaml file manages all the different components of the viz and is used to generate the make files that build the final published product. viz.yaml contains sections for each of the fetch, process, visualize, and publish steps, as well as general information about the viz. For each individual step, information is read from viz.yaml and passed to the appropriate internal or external function as a viz object. This is simply a list with an element for each field of that yaml section. For example, this yaml section:
    id: cars_data
    location: cache/fetch/cars.csv
    fetcher: cars
    mimetype: text/csv
    scripts: scripts/fetch/cars.R
becomes this R list object:
    as.viz('cars_data')

    $id
    [1] "cars_data"

    $location
    [1] "cache/fetch/cars.csv"

    $fetcher
    [1] "cars"

    $mimetype
    [1] "text/csv"

    $scripts
    [1] "scripts/fetch/cars.R"

    $block
    [1] "fetch"

    $export
    [1] FALSE

    attr(,"class")
    [1] "viz"
The viz object will inherit additional classes as it passes through different functions. Most importantly, defaults are assigned for some items if you leave them unspecified (e.g., export above); see viz.defaults.yaml for all defined defaults.
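The idea of a viz object as a plain list with defaults filled in can be sketched in base R. This is only an illustration, not vizlab's implementation: the function name as_viz_sketch and the fetch_defaults list are hypothetical, and the real defaults live in viz.defaults.yaml.

```r
# Hypothetical defaults for a fetch item (the real ones are defined by vizlab
# in viz.defaults.yaml; only `export` is taken from the example above)
fetch_defaults <- list(export = FALSE)

# Build a viz-like object: a plain list plus a class attribute.
# Fields the user supplied win; missing ones are filled from the defaults.
as_viz_sketch <- function(fields, block, defaults = fetch_defaults) {
  fields$block <- block
  missing_names <- setdiff(names(defaults), names(fields))
  fields[missing_names] <- defaults[missing_names]
  class(fields) <- "viz"
  fields
}

cars_data <- as_viz_sketch(
  list(
    id = "cars_data",
    location = "cache/fetch/cars.csv",
    fetcher = "cars",
    mimetype = "text/csv",
    scripts = "scripts/fetch/cars.R"
  ),
  block = "fetch"
)

# Inspect the result without relying on a print method for class "viz"
str(unclass(cars_data))
```

Because export was not supplied, it is filled in from the defaults, mirroring the $export element in the printed object above.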
Here we will examine the viz.yaml included in the example viz:
    vizlab: "0.1.5"
    info:
      id: example
      name: Simple but complete vizualization
      date: 2016-07-19
      publish-date: 2016-08-02
      path: /example
      analytics-id: UA-78530187-1
      description: >-
        This is meant to touch all features and act as an
        integration test of the vizlab platform.
analytics-id contains the Google Analytics (GA) ID to be used. The other fields here are self-explanatory and are used to generate an index of all published visualizations to go in the footer of the finished product.
Each viz should have its own GA ID. Follow these instructions to create a new ID.
    fetch:
      - id: iris_data
        location: data/iris.csv
        mimetype: text/csv
        scripts:
      - id: car_data
        location: data/car_info
        mimetype: text/csv
        reader: folder
        scripts:
      - id: cuyahoga
        location: cache/fetch/cuyahoga.csv
        fetchargs:
          sourceloc: data/pretend_remote/cuyahoga.csv
        fetcher: cuyahoga
        scripts: scripts/fetch/cuyahoga.R
        mimetype: text/csv
Each individual piece in each section of the viz.yaml should have its own unique ID, so that it can be referenced by the succeeding sections. Following sections will look in the location field to find the corresponding file. mimetype specifies the file type and is a required field unless a custom reader is specified.
Two possibilities for fetchers are used here. iris_data and car_data are local files/folders already located in the data folder, and so do not need to be fetched. They use the default fetcher, file. Additional built-in fetchers, for ScienceBase and URLs, are defined in vizlab and can also be used. cuyahoga uses a custom fetcher, located in scripts/fetch/cuyahoga.R. As configured above, it reads from the sourceloc given in fetchargs (data/pretend_remote/cuyahoga.csv) and saves the result to cache/fetch/cuyahoga.csv.
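A custom fetcher of this kind might look roughly like the sketch below. It is written against a stand-in generic, fetch_sketch, so that it runs without vizlab; the real package supplies its own fetch generic, and the argument list a vizlab fetcher receives is an assumption here. The demo data and the temporary directory are made up for illustration.

```r
# Stand-in generic (assumption: vizlab provides its own fetch(); this local
# version only mimics the idea of dispatching on the fetcher name)
fetch_sketch <- function(viz) UseMethod("fetch_sketch")

# Hypothetical custom fetcher: copy the "remote" file to the cache location
fetch_sketch.cuyahoga <- function(viz) {
  dir.create(dirname(viz$location), recursive = TRUE, showWarnings = FALSE)
  file.copy(viz$fetchargs$sourceloc, viz$location, overwrite = TRUE)
  invisible(viz$location)
}

# Demo in a temporary directory with made-up data
root <- tempfile("viz_demo_")
dir.create(file.path(root, "data", "pretend_remote"), recursive = TRUE)
src <- file.path(root, "data", "pretend_remote", "cuyahoga.csv")
write.csv(
  data.frame(site = c("above", "below"), count = c(12, 30)),
  src,
  row.names = FALSE
)

# A list shaped like the cuyahoga entry in the fetch section
item <- list(
  id = "cuyahoga",
  location = file.path(root, "cache", "fetch", "cuyahoga.csv"),
  fetcher = "cuyahoga",
  fetchargs = list(sourceloc = src)
)

# Dispatch on the fetcher name by adding it as a class
class(item) <- c(item$fetcher, "viz")
fetched_path <- fetch_sketch(item)
```

Keeping the source location in fetchargs rather than hard-coding it in the function means the same fetcher can be pointed at a different file by editing only the yaml.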
The reader specified will be used to read in the data for other steps. See methods('readData') for a list of the readers included in the vizlab package. If no reader is specified, the appropriate reader is inferred from mimetype. Alternatively, a specific or custom reader can be specified. For example, the car_data step indicates a specific folder reader. If the reader is a custom reader written for this viz, the code should be stored in the scripts/process directory.
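The rule "use the named reader if there is one, otherwise infer it from mimetype" can be sketched in base R as follows. This is not how vizlab's readData is implemented; the lookup table, the function read_item_sketch, and the custom_readers argument are all hypothetical names used only to show the selection logic.

```r
# Hypothetical lookup from mimetype to a base-R reader function
readers_by_mimetype <- list(
  "text/csv" = function(path) read.csv(path, stringsAsFactors = FALSE),
  "text/tab-separated-values" = function(path) {
    read.delim(path, stringsAsFactors = FALSE)
  }
)

# Pick an explicitly named reader if given, else infer one from the mimetype
read_item_sketch <- function(item, custom_readers = list()) {
  if (!is.null(item$reader)) {
    reader <- custom_readers[[item$reader]]
    if (is.null(reader)) stop("no reader named '", item$reader, "'")
  } else {
    if (is.null(item$mimetype)) stop("item needs a mimetype or a reader")
    reader <- readers_by_mimetype[[item$mimetype]]
    if (is.null(reader)) {
      stop("no default reader for mimetype '", item$mimetype, "'")
    }
  }
  reader(item$location)
}

# Demo 1: reader inferred from the mimetype (made-up data)
tsv_path <- tempfile(fileext = ".tsv")
write.table(
  data.frame(x = 1:3, y = c("a", "b", "c")),
  tsv_path, sep = "\t", row.names = FALSE, quote = FALSE
)
tsv_data <- read_item_sketch(
  list(location = tsv_path, mimetype = "text/tab-separated-values")
)

# Demo 2: an explicitly named reader that lists the files in a folder
folder_path <- tempfile("car_info_")
dir.create(folder_path)
file.create(file.path(folder_path, c("a.csv", "b.csv")))
folder_files <- read_item_sketch(
  list(location = folder_path, mimetype = "text/csv", reader = "folder"),
  custom_readers = list(folder = function(path) sort(list.files(path)))
)
```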
    process:
      - id: cuyahoga_short
        location: cache/process/cuyahoga_short.tsv
        mimetype: text/tab-separated-values
        scripts: scripts/process/cuyahoga.R
        depends:
          - cuyahoga
        processor: cuyahoga
depends references the IDs in the fetch section. All processors must be supplied by the user. Just as the custom fetchers are stored in scripts/fetch, processors are stored in scripts/process. The output of each processor is stored at the path specified by the location field.
A processing step is not mandatory: the iris_data plot uses only the raw data, so it has no section here.
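As a rough illustration of what a user-supplied processor does, the base-R sketch below takes an already-loaded dependency, reduces it, and writes the result to the item's location. The function name, its arguments (viz, deps, n), and the choice to keep only the first rows are assumptions made for this example; the actual processor in the example repository may do something different.

```r
# Hypothetical processor: keep the first n rows of the "cuyahoga" dependency
# and write them as tab-separated values to the item's location
process_cuyahoga_sketch <- function(viz, deps, n = 5) {
  short <- head(deps[["cuyahoga"]], n)
  dir.create(dirname(viz$location), recursive = TRUE, showWarnings = FALSE)
  write.table(short, viz$location, sep = "\t", row.names = FALSE, quote = FALSE)
  invisible(viz$location)
}

# Demo with made-up data standing in for the fetched cuyahoga item
raw <- data.frame(day = 1:20, flow = seq(10, 200, by = 10))
out_dir <- tempfile("process_demo_")
item <- list(
  id = "cuyahoga_short",
  location = file.path(out_dir, "cache", "process", "cuyahoga_short.tsv"),
  mimetype = "text/tab-separated-values"
)
process_cuyahoga_sketch(item, deps = list(cuyahoga = raw))
processed <- read.delim(item$location)
```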
    visualize:
      - id: relative_abundance_fig
        title: Relative Abundance
        alttext: Relative abundance of mayflies, above vs below
        location: figures/relative_abundance_fig.svg
        depends:
          cuyahoga: cuyahoga_short
          mayfly: mayfly_nymph
          sizes: plot_info
        visualizer: relative_abundance
        mimetype: image/svg+xml
        export: true
        scripts: scripts/visualize/relative_abundance.R
In the visualize step, location specifies where the figure produced by each step is written. Visualizers are R functions stored in scripts/visualize/. title and alttext become part of the finished viz, as the figure title and the alt text that appears when the figure is hovered over. Note that there can be multiple dependencies: relative_abundance_fig depends on the processing step cuyahoga_short, the raw data mayfly_nymph from the fetch step, and the plot parameters plot_info. If any of these dependencies change, visualize_relative_abundance will be rerun and the figure updated.
    publish:
      - id: mainCSS
        location: layout/css/main.css
        mimetype: text/css
        publisher: resource
      - id: normalizeCSS
        location: layout/css/normalize.css
        mimetype: text/css
        publisher: resource
      - id: index
        name: index
        template: fullPage
        depends:
          relative_abundance: "relative_abundance"
          mainCSS: "mainCSS"
          normalizeCSS: "normalizeCSS"
          footer-style: "footer-style"
          footer: "footer"
          font: "pagefonts"
        publisher: page
        context:
          title: testViz
          sections: [ "relative_abundance" ]
          resources: [ "font", "mainCSS", "normalizeCSS", "footer-style"]
          footer: [ "footer" ]
      - id: relative_abundance
        template: simplefigure
        context:
          id: relative_abundance
          figure: relative_abundance_fig
          caption: "Relative Abundance"
        depends:
          relative_abundance_fig: "relative_abundance_fig"
        publisher: section
      - id: facebook-thumb
        location: images/facebook-thumb.png
        publisher: thumbnail
        for: facebook
        mimetype: image/png
        export: TRUE
      - id: landing-thumb
        location: images/landing-thumb.png
        publisher: thumbnail
        for: landing
        mimetype: image/png
        export: TRUE
      - id: footer
        publisher: footer
        template: footer-template
        depends: footer-style
        blogsInFooter: TRUE
        vizzies:
          - name: Microplastics in the Great Lakes
            org: USGS-VIZLAB
            repo: great-lakes-microplastics
          - name: Climate Change and Freshwater Fish
            org: USGS-VIZLAB
            repo: climate-fish-habitat
        blogs:
          - name: Using the dataRetrieval Stats Service
            url: https://owi.usgs.gov/blog/stats-service-map/
            thumbLoc: https://owi.usgs.gov/blog/images/owi-mobile.png
      - id: footer-style
        location: layout/css/footer.css
        publisher: resource
        mimetype: text/css
      - id: pagefonts
        publisher: googlefont
        family: "Source Sans Pro"
        weight: [300, 400, 700]
The publish step creates the finished viz product. The index section applies to the viz as a whole. It depends on all the other publish sections for each figure and the text. Its name field becomes the name of the finished viz HTML file. Each section with a figure depends on the corresponding visualize section. Publishers and templates are used in each section of publish and correspond to the section's content. Default publisher R functions are stored in publish.R in the vizlab package, and default templates are stored in inst/templates in the same package. Templates are mustache files that define how the text or image content is displayed in the final HTML. For the example viz, the section publisher is used for every individual section, and the page publisher with the fullPage template publishes the full viz. Each section, including index, has a context section that defines the actual material to be used. For figures it includes the caption. Note that text-only sections can be included here as well, using the printall template (text-section above). The figure-style section references CSS included with the example viz package. The footer section builds a footer for the web page, which may contain links to other visualizations or blogs. The footer publisher gets viz information from their GitHub repos, while blog information is specified directly.
The following steps are required to start a viz from scratch. The first two have already been done for the example viz, so are not required here. These steps assume that the actual content creation phases are complete, i.e. there are scripts complete for data retrieval, processing, and figure creation.
1. Create the skeleton: First, create the visualization directory structure using the vizSkeleton function. This is unnecessary for the example viz, since the directories are already set up in the repo. You can run this function in a dummy directory to see what happens.
2. Fill in the viz.yaml: As described above, viz.yaml needs to be filled out in order to guide the creation of the various make files. An empty skeleton viz.yaml is created by the vizSkeleton function. The complete viz.yaml in the example viz repository can also be a useful reference.
3. Authentication (if needed): The dssecrets package or a personal secret vault can be used for authentication. Check with a DS team member for details. For security, it is strongly recommended to use a ScienceBase account that does not utilize personal credentials.
4. Finally, execute all the created make files by running vizmake() from the console:
    > vizmake()
    [ LOAD ]
    ( ) vizlab/remake/timestamps/cuyahoga.txt
    ( ) vizlab/remake/timestamps/never_current.txt
    Starting build at 2017-10-28 10:23:39
    [ READ ] | # loading sources
    < MAKE > Viz
    [ BUILD ] plot_info | plot_info <- parameter("plot_info")
    [ READ ] | # loading packages
    [ BUILD ] data/site_text.yaml | fetch("site_text_data")
    [ BUILD ] vizlab/remake/timestamps/cuyahoga.txt | fetchTimestamp("cuyahoga")
    [ BUILD ] data/car_info | fetch("car_data")
    [ BUILD ] vizlab/remake/timestamps/never_current.txt | fetchTimestamp("never_current")
    [ BUILD ] layout/css/main.css | publish("mainCSS")
    [ BUILD ] layout/css/normalize.css | publish("normalizeCSS")
    [ BUILD ] images/facebook-thumb.png | publish("facebook-thumb")
    # etc
The finished HTML will be created in the target directory, along with its corresponding CSS and the images for each figure.
Every fetch item has a fetcher, implemented either by vizlab or within your own code. There must be methods for both fetch and fetchTimestamp for every fetcher. Consider setting your custom fetchTimestamp.xxx method to alwaysCurrent or neverCurrent if appropriate. These function names describe the currency of the local information relative to the remote information only, because dependencies on local information are taken care of elsewhere.
Timestamps are potentially checked (using fetchTimestamp) every time you build a fetch item with vizmake. You can reduce the frequency of timestamp checks, or eliminate checks altogether, by adding an entry to a file called preferences.yaml in your main project directory. For example:
    timetolive:
      iris_data: Inf days
      cuyahoga: 6 hours
means that the iris_data item will never have its timestamp checked, while cuyahoga will have its timestamp checked on any build occurring more than 6 hours after the last time the timestamp was checked. The default (for unmentioned fetch items) is "0 days". If the timestamp is checked and has changed, the item will be re-fetched.
Edits to the viz.yaml chunk for the fetch item will also trigger re-fetches, regardless of the timestamp's value or whether it has exceeded its timetolive. Similarly, a fetch item that hasn't yet been built will always be built.
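The time-to-live rule described above can be sketched in base R. This is an illustration of the decision logic only, not vizlab's code: the helper names ttl_seconds and timestamp_check_due are hypothetical, the "<number> <unit>" format and the list of accepted units are assumptions, and the preferences are written as an R list rather than read from preferences.yaml.

```r
# Convert strings such as "6 hours" or "Inf days" to seconds
# (assumed format: "<number> <unit>")
ttl_seconds <- function(ttl) {
  unit_secs <- c(secs = 1, mins = 60, hours = 3600, days = 86400, weeks = 604800)
  parts <- strsplit(trimws(ttl), "\\s+")[[1]]
  if (length(parts) != 2 || !parts[2] %in% names(unit_secs)) {
    stop("cannot parse timetolive value '", ttl, "'")
  }
  value <- suppressWarnings(as.numeric(parts[1]))
  if (is.na(value)) stop("cannot parse timetolive value '", ttl, "'")
  value * unit_secs[[parts[2]]]
}

# Decide whether the timestamp of a fetch item should be checked on this build
timestamp_check_due <- function(id, last_checked, now = Sys.time(),
                                preferences = list(), default_ttl = "0 days") {
  ttl <- preferences$timetolive[[id]]
  if (is.null(ttl)) ttl <- default_ttl  # unmentioned items use the default
  elapsed <- as.numeric(difftime(now, last_checked, units = "secs"))
  elapsed >= ttl_seconds(ttl)
}

# The preferences from the example above, as an R list
prefs <- list(timetolive = list(iris_data = "Inf days", cuyahoga = "6 hours"))
now <- as.POSIXct("2017-10-28 10:00:00", tz = "UTC")

due_iris     <- timestamp_check_due("iris_data", now - 7 * 86400, now, prefs) # FALSE
due_cuy_2h   <- timestamp_check_due("cuyahoga", now - 2 * 3600, now, prefs)   # FALSE
due_cuy_7h   <- timestamp_check_due("cuyahoga", now - 7 * 3600, now, prefs)   # TRUE
due_unlisted <- timestamp_check_due("mayfly_nymph", now - 60, now, prefs)     # TRUE
```

Note that this covers only the timestamp check; as stated above, edits to the item's viz.yaml chunk and never-built items trigger a fetch regardless of timetolive.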
R package dependencies are stored in the required-packages section of the viz.yaml, each with a repository (generally CRAN or GRAN) and a version number. Don't include remake on the required-packages list.