In DataScienceScotland/vedar: Import and Process TIMES Results Generated By Veda

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE,
  message = FALSE,
  warning = FALSE
)

library(vedar)
library(dplyr)
library(purrr)
library(igraph)

Introduction

Veda is a graphical user interface for running TIMES models. Veda compiles model information specified in spreadsheets and passes the resulting data to GAMS and CPLEX for solving. The solution data is stored in custom Veda text files.

The Veda graphical user interface provides a means of interrogating the Veda solution output data. The vedar library offers an alternative that enables Veda outputs to be imported and analysed in R.

Getting veda data into R

This vignette uses data created by running the DemoS models in Veda2.0.

Veda creates several file types during a run. The following data files are currently handled by vedar:

.vd: The data that appears in the solution
.vds: The set information data
.vde: The description data for commodities, processes and user constraints
.vdt: The technologies and user constraints that exist in the input data, including the direction of commodity flows. This file can be used to create the structure of the reference energy system that is available to the model.

filename_base <- "DemoS_001"

vd_filename <- paste(filename_base, ".vd", sep = "")
vds_filename <- paste(filename_base, ".vds", sep = "")
vde_filename <- paste(filename_base, ".vde", sep = "")
vdt_filename <- paste(filename_base,  ".vdt", sep = "")

The vedar package has four functions used to load Veda model run output data into R.

The functions import_vd(vd_filename), import_vds(vds_filename), import_vde(vde_filename) and import_vdt(vdt_filename) bring the Veda run output data, the set information, the object descriptions and the network structural data, respectively, into the workspace. In the code below, system.file() is used in the function arguments to locate the data bundled with the vignette; it is not required when working in a regular project.

demos_vd <- import_vd(system.file("extdata", 
                                  vd_filename,
                                  package = "vedar"))

demos_vde <- import_vde(system.file("extdata", 
                                  vde_filename,
                                  package = "vedar"))

demos_vds <- import_vds(system.file("extdata", 
                                  vds_filename,
                                  package = "vedar"))

demos_vdt <- import_vdt(system.file("extdata", 
                                  vdt_filename,
                                  package = "vedar"))
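In a regular project the files sit on disk rather than inside the package. The following is a minimal sketch of building the four paths in base R, assuming a hypothetical project folder named veda_output; the folder name and the helper variables are illustrative only and are not part of vedar.

```r
# Hypothetical project folder holding the Veda run output (an assumption)
data_dir <- "veda_output"
filename_base <- "DemoS_001"

# One path per file type handled by vedar, named by extension
extensions <- c("vd", "vds", "vde", "vdt")
paths <- setNames(
  file.path(data_dir, paste0(filename_base, ".", extensions)),
  extensions
)

# Report any files that are missing before attempting an import
missing_files <- paths[!file.exists(paths)]
if (length(missing_files) > 0) {
  message("Missing: ", paste(missing_files, collapse = ", "))
}

# With vedar loaded, a path can then be passed directly, e.g.
# demos_vd <- import_vd(paths[["vd"]])
```

Checking for missing files first gives a clearer message than an import error when a run did not write every file type.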

Alternatively, the first three files can be imported and joined using prep_data(filename_base). The vignette = TRUE argument is used in this vignette, but is not needed in a project.

demos_001 <- prep_data(filename_base = filename_base, vignette = TRUE)

The .vdt data can be standardised to the same format as the output of prep_data() using prep_vdt_data(). Note that all later functions rely on the output format of these two functions.

demos_001_vdt <- prep_vdt_data(
  filename_base = filename_base,
  refer_to_package_data = TRUE)

Assigning sectors

Processes and commodities are likely to be assigned to particular grouping variables (e.g. sectors). A list of sectors, or other grouping information, can be associated with the data using define_sector_from_list(). As each row of the data has commodity information, process information, or both, the sector is assigned by separate function calls on the commodity and process data.

Note that the joining variable can have different names in the Veda output data and in the sector information data. The former is passed as a character string (join_variable_name) and the latter as a column name (sector_dat_join_variable_col). The name of the grouping information is passed as a column name (sector_info_column).

The functionality of this will be improved in future iterations.
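To illustrate what joining on differently named key columns involves, here is a minimal base R sketch using merge(). It is not the implementation of define_sector_from_list(); the toy tables and the key column name code are made up for illustration.

```r
# Toy stand-in for the Veda output data (made-up values)
veda_dat <- data.frame(
  commodity = c("coa", "oil", "ele"),
  pv = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Toy sector lookup whose key column has a different name ("code")
sector_dat <- data.frame(
  code = c("coa", "oil"),
  sector = c("coal", "oil"),
  stringsAsFactors = FALSE
)

# Left join: keep every row of veda_dat, matching commodity to code
joined <- merge(veda_dat, sector_dat,
                by.x = "commodity", by.y = "code",
                all.x = TRUE)

# Rename so the origin of the sector information stays visible
names(joined)[names(joined) == "sector"] <- "commodity_sector"
joined
```

Rows without an entry in the lookup (here, ele) receive NA for the sector, which is why the vignette code later falls back from the process sector to the commodity sector.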

sectors_commodity_file <- "sectors_commodity.csv"

sectors_process_file <- "sectors_process.csv"
sectors_commodity <- system.file("extdata", 
                                  sectors_commodity_file,
                                  package = "vedar") %>%
  read.csv() %>%
  #convert to lower case strings
  prep_sector_dat()
sectors_process <- system.file("extdata", 
                                  sectors_process_file,
                                  package = "vedar") %>%
  read.csv() %>%
  #convert to lower case strings
  prep_sector_dat()

demos_001 <- demos_001 %>%
  #assign the sector to the commodity
  define_sector_from_list(join_variable_name = "commodity", 
                          sector_dat = sectors_commodity, 
                          sector_info_column = sector, 
                          sector_dat_join_variable_col = commodity) %>%
  #assign the sector to the process
  define_sector_from_list(join_variable_name = "process", 
                          sector_dat = sectors_process, 
                          sector_info_column = sector, 
                          sector_dat_join_variable_col = process) %>%
  # here, assign a sector based on the process sector if available,
  # else, use the commodity information
  mutate(sector = if_else(is.na(process_sector), 
                          commodity_sector, 
                          process_sector), 
         #create a variable to check that commodity sector and process
         # sector information match
         sector_match = is.na(process_sector) |
           is.na(commodity_sector) |
           process_sector == commodity_sector)
print(paste("Check if sector information matches between process and commodity: ", 
            sum(demos_001$sector_match) == nrow(demos_001), sep = ""))
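The fallback and matching rules used above can be checked on a small table. This is a sketch in base R with made-up sector values, covering the four combinations of present and missing sector information.

```r
# Made-up sector assignments covering the four possible cases
toy <- data.frame(
  process_sector   = c("coal", NA,    "coal", NA),
  commodity_sector = c("coal", "oil", "oil",  NA),
  stringsAsFactors = FALSE
)

# Prefer the process sector; fall back to the commodity sector
toy$sector <- ifelse(is.na(toy$process_sector),
                     toy$commodity_sector,
                     toy$process_sector)

# A row only fails the check when both sectors are present and differ
toy$sector_match <- is.na(toy$process_sector) |
  is.na(toy$commodity_sector) |
  toy$process_sector == toy$commodity_sector

toy
```

Only the third row, where both sectors are known and disagree, is flagged as a mismatch; rows with missing information pass the check by construction.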

Comparing the results to veda output tables

We can compare the results produced by vedar with the tables generated by the Veda graphical user interface.

# load results tables that were generated by Veda DemoS_001 example
data("demos_001_results")

flatten_dfr(demos_001_results["All costs.csv"]) %>%
  select(Attribute, 
         Commodity, 
         Process, 
         Period, 
         Region,
         Pv) %>%
  arrange(Attribute, Process, Period) %>%
  knitr::kable()


demos_001 %>% 
  #select rows in which attribute contains the string "cost_" 
  # and sector == "coal"
  filter(grepl("cost_", attribute), 
         sector == "coal") %>%
  select(attribute, 
         commodity, 
         process, 
         period, 
         region,
         pv) %>%
  arrange(attribute, process, period)  %>%
  knitr::kable()

Representing the Reference Energy System as a graph object

Graphs/networks are representations of connected systems. The RES can be represented as a graph with processes as nodes, and commodities as edges.

make_graph_from_veda_df() converts the RES to an igraph object. The function can be applied to either vd or vdt data using the input_data_type argument (the default is "vd").

For vd data covering a single year, the edge weights are determined by the var_fin and var_fout variables of the given process/commodity. Because TIMES does not report how a given var_fout is split over downstream commodities, an assumption has to be made. Here, the weight is assigned in proportion to the ratio of the var_fin variables of the downstream commodities linked to the originating process.

In all other cases, the edge weights are set to 1.
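As a worked example of the proportional split, the following sketch shows one reading of the rule described above, with made-up numbers. It assumes a single originating process whose output is shared between two hypothetical downstream consumers (proc_b and proc_c); the package source remains the authority on the exact calculation.

```r
# Made-up flow of one commodity out of the originating process
var_fout <- 10

# Made-up inputs of the same commodity to two downstream consumers
var_fin <- c(proc_b = 6, proc_c = 2)

# Split the output in proportion to each consumer's share of var_fin
edge_weight <- var_fout * var_fin / sum(var_fin)
edge_weight
# proc_b receives 7.5 and proc_c receives 2.5
```

The weights always sum to var_fout, so the allocation conserves the flow leaving the originating process.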

Below, we create igraph objects for the demos_001 model from both the vd and the vdt data.

data(demos_001_sector)
g <- demos_001_sector %>%
      filter(period == 2006) %>%
      make_graph_from_veda_df(node_labels = process,
                              edge_labels = commodity
                              )
#load data if not previously loaded in vignette
data(demos_001_vdt)
g_vdt <- demos_001_vdt %>% 
  filter(region == "reg1") %>% 
  make_graph_from_veda_df(input_data_type = "vdt")

Converting the RES to a graph object enables the use of the network analysis functionality of the igraph package (https://igraph.org/r/html/latest/). For example, we can return all the edge weights:

E(g)$weight
E(g_vdt)$weight

We can compute all simple paths from a given node (or between two nodes if the to = argument is specified):

all_mincoa1_paths <- all_simple_paths(g, from = "mincoa1")
all_mincoa1_paths

The vedar function check_in_path() provides a means of checking whether a string expression is included in the set of paths:

check_in_path("(exp)", all_mincoa1_paths)

We can check whether an exp process is included in the paths that are linked to the coa commodity as follows:

#find all coa edges
coa_edges <- which(E(g)$commodity == "coa")

#find the start vertices of the coa commodity edges
coa_start_vertices <- ends(g, coa_edges)[,1]

#find all the paths that are linked to the coa_start_vertices

all_coa_paths <- all_simple_paths(g, from = unique(coa_start_vertices))

#check if the string "exp" appears in the set of paths that project from coa_start_vertices

check_in_path("(exp)", all_coa_paths)

#check if the string "dist" appears in the set of paths that project from coa_start_vertices

check_in_path("(dist)", all_coa_paths)
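The same steps can be tried on a toy graph that does not depend on the package data. This sketch uses only igraph functions; the process and commodity names are made up, and the grepl() test is a rough stand-in for the kind of check described above, not the implementation of check_in_path().

```r
library(igraph)

# Toy RES: processes are vertices, commodities are edge attributes
edges <- data.frame(
  from = c("mincoa1", "mincoa1", "impcoa1", "dist_coa"),
  to = c("dist_coa", "expcoa1", "dist_coa", "demand_end"),
  commodity = c("coa", "coa", "coa", "tpscoa"),
  stringsAsFactors = FALSE
)
toy_g <- graph_from_data_frame(edges, directed = TRUE)

# Start vertices of all edges carrying the coa commodity
coa_edges <- which(E(toy_g)$commodity == "coa")
coa_starts <- unique(ends(toy_g, coa_edges)[, 1])

# All simple paths leaving one of those start vertices
paths <- all_simple_paths(toy_g, from = "mincoa1")

# Convert each path to process names and search them for a pattern
path_names <- lapply(paths, as_ids)
has_exp <- any(vapply(path_names,
                      function(p) any(grepl("exp", p)),
                      logical(1)))
has_exp
```

Converting the paths to names with as_ids() keeps the pattern search independent of the internal vertex numbering.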

Viewing the Reference Energy System

The RES is a visual representation of processes and commodity flows. vedar uses a Sankey diagram representation of the RES, with processes as nodes and commodity flows as edges. The function make_res() creates the RES for a selected period, region and sector.

Here, we use the dataset from the default run of DemoS_007, which is a multiregion model. No sectors have been defined for this data, so we first append a sector variable for demonstration purposes.

We first use the option of labeling nodes by the process_description. We label the edges by the commodity_description column, and set the font_size of the node labels.

The arguments sankey_width, sankey_height and font_size are optional. They should be adjusted by trial and error to get the desired size.

The function can only be used on data for a single region, so the data should be filtered first.

Plotting the reg1 RES for 2020:

data("demos_007")

demos_007 <- demos_007 %>%
  filter(region == "reg1", 
         period == 2020) %>% 
  mutate(sector = if_else(grepl("(oil)", process), "oil", "other"))

res_all <- make_res(dat = demos_007 , 
                node_labels = process_description, 
                edge_labels = commodity_description, 
                sankey_width = 1500,
                sankey_height = 800,
                font_size = 25)

res_all

The flow labels are shown as tooltips when the mouse hovers over the edges. For visual representation, all demand commodities are shown flowing into a "demand commodity"_end_process node. The process nodes can be moved vertically by dragging.

To create a RES for data filtered on any of the columns in the data, the data can be filtered using dplyr functions. Here, we filter for sector.

res_filter <- make_res(dat = demos_007 %>% 
                         filter(sector == "oil"),
                node_labels = process_description, 
                edge_labels = commodity_description, 
                sankey_width = 1500,
                sankey_height = 800,
                font_size = 25)

res_filter

The data can also be filtered on sets. Since sets are stored as lists, this requires the purrr::map_*() functions (https://purrr.tidyverse.org/reference/map.html) in the filter call. Here, we keep the rows in which the string "nrg" appears in the commodity set.

res_filter_set <- make_res(dat = demos_007 %>% 
                         filter(map_lgl(commodity_set, 
                                        ~("nrg" %in% .x))),
                node_labels = process_description, 
                edge_labels = commodity_description, 
                sankey_width = 1500,
                sankey_height = 800,
                font_size = 25)

res_filter_set
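The set filter used above can be tried on a toy table to see how map_lgl() treats a list column. The commodity codes and set memberships below are made up for illustration.

```r
library(dplyr)
library(purrr)

# Toy data with a list column: each row holds a vector of set names
toy <- tibble(
  commodity = c("coa", "co2", "dem"),
  commodity_set = list(c("nrg", "mat"), "env", c("dem", "nrg"))
)

# map_lgl() returns one TRUE/FALSE per row, which filter() can use
toy_nrg <- toy %>%
  filter(map_lgl(commodity_set, ~ "nrg" %in% .x))

toy_nrg$commodity
```

A plain comparison such as commodity_set == "nrg" would not work here, because each cell holds a vector rather than a single value.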

To make the diagram less cluttered, the node label can use the process code.

res_prc_code <- make_res(dat = demos_007 %>% 
                         filter(map_lgl(commodity_set, 
                                        ~("nrg" %in% .x))),
                node_labels = process, 
                edge_labels = commodity_description, 
                sankey_width = 1500,
                sankey_height = 800,
                font_size = 25)


res_prc_code

The Sankey uses unitary magnitudes if the data includes more than a single period, or if use_weights = FALSE. Each case is shown in one of the two chunks below.

#reload the data to include all periods
data("demos_007")

res_all_unitary_all <- make_res(dat = demos_007 %>% 
                              filter(region == "reg1"), 
                node_labels = process_description, 
                edge_labels = commodity_description, 
                sankey_width = 1000,
                sankey_height = 1000,
                font_size = 10)

res_all_unitary_all

res_all_unitary_2020 <- make_res(dat = demos_007 %>% 
                              filter(region == "reg1", 
                                     period == 2020), 
                node_labels = process_description, 
                edge_labels = commodity_description, 
                sankey_width = 1000,
                sankey_height = 1000,
                font_size = 10,
                use_weights = FALSE)

res_all_unitary_2020

To create the RES for all technologies available to the model (independent of period), the vdt data can be passed to make_res() using the input_data_type argument. This creates unitary edge weights.

data(demos_007_vdt)
make_res(demos_007_vdt %>% 
           filter(region == "reg1"), 
         input_data_type = "vdt", 
         sankey_height = 800, 
         sankey_width = 1000)

Combining graph queries to create a RES

The make_res_from_graph() function creates a Sankey representation of the RES from an igraph object. Edge thickness represents the graph edge weights.

We first do this for the graph object g created from demos_001. The edge_labels argument specifies the edge attribute used to label the edges. The node labels are taken from the graph object. The sankey_height and sankey_width arguments are specified in pixels.

res_graph_001 <- make_res_from_graph(g, 
                                     edge_labels =
                                       commodity_description, 
                                     sankey_width = 1500,
                                     sankey_height = 800,
                                     font_size = 25)

res_graph_001

Using a graph representation to create a Sankey allows us to use subgraphs as input. For example, the subgraph of g that contains all paths from mincoa1 can be created as follows:

# get the vertices in the paths of interest
vert <- unique(names(unlist(all_mincoa1_paths)))

#create the subgraph
sub_g <- induced_subgraph(g, vert)

# make res from sub_g
res_sub_graph_001 <- make_res_from_graph(sub_g, 
                                         sankey_width = 1500,
                                         sankey_height = 800,
                                         font_size = 25)
res_sub_graph_001
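The subgraph step can be checked on a toy graph to confirm which vertices survive. This is a sketch using igraph only, with made-up process names; it shows that a process not reachable from the chosen start node is dropped.

```r
library(igraph)

# Toy RES in which impcoa1 is not reachable from mincoa1
edges <- data.frame(
  from = c("mincoa1", "impcoa1", "dist_coa"),
  to = c("dist_coa", "dist_coa", "demand_end"),
  stringsAsFactors = FALSE
)
toy_g <- graph_from_data_frame(edges, directed = TRUE)

# Collect the names of every vertex on a path starting at mincoa1
paths <- all_simple_paths(toy_g, from = "mincoa1")
keep <- unique(unlist(lapply(paths, as_ids)))

# Keep only those vertices and the edges between them
sub_g <- induced_subgraph(toy_g, keep)
V(sub_g)$name
```

Because induced_subgraph() keeps every edge between the retained vertices, the result is limited to the part of the system downstream of the start node.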