knitr::opts_chunk$set( collapse = TRUE, comment = "#>", # Set several aspects of graph chunks fig.retina=4, fig.align='center', fig.height=4, fig.width=6 )
The MR-PFU Database contains primary, final, and useful energy and exergy data for all countries covered by the International Energy Agency (IEA). Aggregations are also available.
But it can be a challenge to get started with the MR-PFU Database
if you are unfamiliar with some types of R programming.
This vignette provides a detailed example of working with MR-PFU Database information
to help users get over that hurdle.
In particular, this vignette shows how to create a graph of country-level energy and exergy efficiencies first presented at the International Exergy Economics Workshop in July 2023 (IEEW2023) by Matthew Kuperus Heun (Calvin University). In the process, important concepts for accessing and working with the data are demonstrated, including
Most R programs use packages that are loaded at the beginning
of a session.
The following packages are used in this vignette.
library(devtools) library(dplyr) library(ggplot2) library(MKHthemes) library(PFUSetup) library(pins) library(readxl) library(tidyr)
All packages are available on the Comprehensive R Archive Network (CRAN), except MKHthemes and PFUSetup. Instructions for loading MKHthemes and PFUSetup are given below.
The first step is to establish a "pinboard"
for the pins package.
Someone from the MR-PFU Database team will have shared the "PipelineReleases"
folder if you have access to IEA data.
The PFUSetup package is helpful to find the correct folder.
Install the PFUSetup package with the following code.
devtools::install_github("EnergyEconomyDecoupling/PFUSetup")
Create the pinboard from the correct Dropbox folder.
Your pinboard_folder might be different from this example,
due to a different location on your computer.
pinboard_folder <- PFUSetup::get_abs_paths()[["pipeline_releases_folder"]] pinboard_folder pinboard <- pins::board_folder(pinboard_folder, versioned = TRUE) pinboard
Next, users can consult the versions and products.xlsx file
located in the "PipelineReleases" folder.
to find the desired database product.
For this example,
we want data for pf, fu, and pu efficiencies,
which can be found in Product E,
whose pin is named "agg_eta_pfu".
The "versions and products.xlsx" file indicates that
the pin version of the "agg_eta_pfu" product for database v1.1 is
"20230619T051304Z-f653c".
Putting it all together,
the following code reads the desired version of the desired data
for v1.1 of the PFU database.
agg_eta_pfu_data <- pins::pin_read(board = pinboard, name = "agg_eta_pfu", version = "20230619T051304Z-f653c")
agg_eta_pfu_data is an R data frame as shown below:
agg_eta_pfu_data
The columns have the following meanings:
In the code below, we prepare the data
by filtering rows and selecting columns
of the agg_eta_pfu_data data frame
to focus on the data of interest for this graph.
Comments in the code narrate the process.
agg_eta_pfu_prepared <- agg_eta_pfu_data |> # Select only those rows where we have both IEA and MW data, # where gross energy production is provided, and # where aggregations were calculated from fully specified data. dplyr::filter(IEAMW == "Both", GrossNet == "Gross", Product.aggregation == "Specified", Industry.aggregation == "Specified") |> # Select only the columns of interest to us dplyr::select(Country, Energy.type, Year, eta_pf, eta_fu, eta_pu) |> # Pivot the data to create a value column tidyr::pivot_longer(cols = c(eta_pf, eta_fu, eta_pu), names_to = "Var.name", values_to = "Value") # Here is the prepared data agg_eta_pfu_prepared
agg_eta_pfu_prepared is a data frame that contains
everything we need to make the graph.
We use the ggplot2
package to make the graph.
agg_eta_pfu_prepared |> # Set the x and y axes ggplot2::ggplot(mapping = ggplot2::aes(x = Year, y = Value, # Group by Country # to make individual lines # for each country group = Country)) + # Create a line graph ggplot2::geom_line() + # Put mini-graphs for energy type in rows and efficiency type in columns ggplot2::facet_grid(rows = ggplot2::vars(Energy.type), cols = ggplot2::vars(Var.name))
The initial version of the graph provides confidence that we can use the data, but it is not attractive.
Many of these changes are made by adjusting the
values in the metadata columns
of the agg_eta_pfu_prepared data frame.
The code below shows those adjustments.
agg_eta_pfu_nice <- agg_eta_pfu_prepared |> dplyr::mutate( # We use the string "Asia_" to indicate # our aggregation for Asia. # Change to "Asia" (without the trailing "_"). Country = dplyr::case_when( Country == "Asia_" ~ "Asia", TRUE ~ Country ), # Change "E" and "X" to "Energy" and "Exergy" Energy.type = dplyr::case_match( Energy.type, "E" ~ "Energy", "X" ~ "Exergy" ), Var.name = dplyr::case_match( Var.name, # Change names of efficiencies # for later subscripts of "pf", "fu", and "pu" "eta_pf" ~ "eta[pf]", "eta_fu" ~ "eta[fu]", "eta_pu" ~ "eta[pu]" ), # Set ordering for efficiency variables # so they look nice in our graph Var.name = factor(Var.name, levels = c("eta[pf]", "eta[fu]", "eta[pu]")) )
To further improve appearance, we can colour the individual country lines by continent. And, we can highlight the world as a thick line atop all countries.
The first step is to join continent information to the efficiency data, thereby making continent data available for the graph. We can load a data frame of countries and continents and join it to the efficiency data frame.
# The continents file is an input # to the calculation pipeline # that created the database. # Users can create their own # continent mapping data frame # or ask for ours. continents_file <- PFUSetup::get_abs_paths()[["aggregation_mapping_path"]] continents_file continents_agg_map <- continents_file |> # Read the continent_aggregation tab of the Excel file readxl::read_excel(sheet = "continent_aggregation") |> # Adjust the column names dplyr::rename( Country = Many, Continent = Few ) |> dplyr::mutate( # Fix "Asia_" --> "Asia" # to match the efficiency data frame. Continent = dplyr::case_when( Continent == "Asia_" ~ "Asia", TRUE ~ Continent ) ) continents_agg_map # Join to the efficiency data agg_eta_pfu_with_continents <- agg_eta_pfu_nice |> dplyr::left_join(continents_agg_map, by = "Country") # Notice the Continent column agg_eta_pfu_with_continents
Establish the continent colors.
continent_colours <- c(Africa = "orange", Asia = "gray", Bunkers = "darkgray", Europe = "darkgreen", MidEast = "cadetblue3", NoAmr = "red", Oceania = "blue", SoCeAmr = "violet", World = "black")
To create a specific World-average line, we need to separate the World data from all countries. Also, continent averages should be removed from the data frame that contains country lines.
# Create a data frame of only World information # for years >- 1971 # (the first year when all countries are in IEA data). # This data frame will be used to create the # World-average line agg_eta_pfu_world <- agg_eta_pfu_with_continents |> dplyr::filter(Country == "World", Year >= 1971) # Create a data frame of countries only, # eliminating continent and world averages agg_eta_pfu_countries <- agg_eta_pfu_with_continents |> dplyr::filter(!(Country %in% c("Africa", "Asia", "Bunkers", "Europe", "MidEast", "NoAmr", "Oceania", "SoCeAmr", "World", "WRLD")))
We will use the MKHthemes package to make nice-looking graphs with a white background and tick marks that point inward, not outward. MKHthemes can be loaded with the following code.
devtools::install_github("MatthewHeun/MKHthemes")
Now make the graph. Details different from the previous version are identified by comments.
ggplot2::ggplot(mapping = ggplot2::aes(x = Year, y = Value, group = Country, # Set country colours by continent colour = Continent)) + # Create two different line layers. # The first has countries only and a narrow linewidth. ggplot2::geom_line(data = agg_eta_pfu_countries, linewidth = 0.05) + # The second (upper) layer has World only, # a larger linewidth and black colour. ggplot2::geom_line(data = agg_eta_pfu_world, linewidth = 1, colour = "black") + # Set only a few years on the x-axis so the labels # don't collide ggplot2::scale_x_continuous(breaks = c(1980, 2000, 2020)) + # Use the continent_colours for the lines ggplot2::scale_colour_manual(values = continent_colours) + # Eliminate the x-axis label ("Year"); # everyone knows those are years. # Eliminate the colour label, "Continent"; # everyone knows these are continent names. ggplot2::labs(x = NULL, y = expression(eta), colour = NULL) + ggplot2::facet_grid(rows = ggplot2::vars(Energy.type), cols = ggplot2::vars(Var.name), labeller = ggplot2::label_parsed) + # Set pleasant aesthetics such as white background, # gray border lines, and # inward-pointing tick marks. MKHthemes::xy_theme() + ggplot2::guides( colour = ggplot2::guide_legend( # Adjust the size of the line segments in the legend # independent from linewidths in the graph itself override.aes = list(linewidth = 1) ) )
The new version of the graph is aesthetically pleasing and conveys more information than the initial attempt.
The PFU Database provides data to make graphs such as primary-to-final, final-to-useful, and primary-to-useful efficiencies. This vignette shows details of how to read and clean data, how to make such graphs, and how to modify the graphs to be aesthetically pleasing.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.