knitr::opts_chunk$set( collapse = TRUE, comment = "#>")
gtfs2emis
is an R package to estimate hot exhaust emissions of public transport vehicles using data in the General Transit Feed Specification (GTFS) format. The package allows users to estimate the emission levels of several types of pollutants for each public transport vehicle/trip/route at high spatial and temporal resolutions simply using a GTFS feed and some information on fleet characteristics. This vignette introduces the main functions of the gtfs2emis
package and shows a step-by-step reproducible example of how to use the package.
One can install the development version gtfs2emis
from Github:
library(devtools) # From CRAN install.packages("gtfs2emis") # Dev. version with latest features # install.packages("remotes") remotes::install_github("ipeaGIT/gtfs2emis")
A few samples of GTFS and fleet files are included in the package:
data_path <- system.file("extdata", package = "gtfs2emis") list.files(data_path)
bra_cur_gtfs.zip
and bra_cur_fleet.txt
irl_dub_gtfs.zip
and irl_dub_fleet.txt
usa_det_gtfs.zip
and usa_det_fleet.txt
To estimate the emission levels from a given public transport system, users need to:
GTFS.zip
filedata.frame
with a few characteristics of the public transport fleet (such as age, vehicle type, and fuel)Before we start, let's load a few packages we'll be using in this vignette:
library(gtfs2emis) library(gtfstools) library(progressr) library(data.table) library(ggplot2) library(units) library(sf)
The gtfs2emis
package has two core functions:
transport_model()
. This function converts GTFS data into a GPS-like data.table
with the space-time positions and speeds of public transport vehicles.emission_model()
. This function estimates hot-exhaust emissions based on four inputs: transport_model()
;
b) a string
indicating which emission factor model should be considered;
c) a string
indicating which pollutants should be estimated; and
d) a data.frame
with info on fleet characteristics passed by the user.
The function returns a list
with the estimated amount of each pollutant emitted by public transport vehicles.To help users analyze the output from emission_model()
, the gtfs2emis
package has few functions:
emis_summary()
to aggregate emission estimates by the time of the day, vehicle type, or road segment.emis_grid()
to spatially aggregate emission estimates using any custom spatial grid or polygons.emis_to_dt()
to convert the output of emission_model()
from list
to data.table
.In this introductory vignette, we show a very simple case study using default parameters of the gtfs2emis
package and where we assume that fleet characteristics are homogeneously distributed across the public transport routes. For advanced users, we have written another vignette that specifies the usage of different emissions factors. We have also written a separate vignette to help users build the data.frame
with information on fleet characteristics.
To demonstrate how the gtfs2emis
package works, we will be using a small sample of data for the city of Dublin, Ireland. In this example, we'll be estimating NOx
and PM10
emissions of bus services on business days.
The first step is to generate the transport model using transport_model{gtfs2emis}
. This function converts GTFS data into a GPS-like data.table
, transforming the output into a sf-linestring
, which is the required input for emissions estimates. The user can input either a string with the file path where the a gtfs.zip
file is stored or an object of class "gtfs" "list"
, generated with gtfstools::read_gtfs()
.
First, let's read the GTFS data and filter only the transport services that run on
# path to GTFS.zip file gtfs_file <- system.file("extdata/irl_dub_gtfs.zip", package = "gtfs2emis") # read GTFS gtfs <- gtfstools::read_gtfs(gtfs_file) # Keep Monday services GTFS gtfs <- gtfstools::filter_by_weekday(gtfs, weekday = c('saturday', 'sunday'), keep = FALSE)
Now let's generate the transport model. This is the most time-consuming part. If you want to, you can set a progress bar by calling the transport_model()
function within progressr::with_progress()
, as shown below.
# generate transport model progressr::with_progress( tp_model <- transport_model(gtfs_data = gtfs, min_speed = 2, max_speed = 80, spatial_resolution = 100, parallel = TRUE) ) head(tp_model)
Here is what the output of the transport model looks like. In essence, it's a trajectory data.table sf linestring
with the space-time position and speed of trip segments for every single vehicle of the public transport system.
ggplot(data = tp_model) + geom_sf(aes(color= as.numeric(speed))) + scale_color_continuous(type = "viridis")+ labs(color = "Speed (km/h)")+ theme_void()
The next step is to prepare a data.frame
with some characteristics of the public transport fleet. This can be either:
- A simple table with the overall composition of the fleet. In this case, the gtfs2emis
will assume that fleet is homogeneously distributed across all routes; OR
- A detailed table that (1) brings info on the characteristics of each vehicle and, (2) tells the probability with which each vehicle type is allocated to each transport route.
In this introductory vignette, we'll be working with a simple table that tells us the proportion of buses in Dublin according to their vehicle type, Euro standard, technology, and fuel. The table looks like this:
fleet_file <- system.file("extdata/irl_dub_fleet.txt", package = "gtfs2emis") fleet_df <- read.csv(fleet_file) head(fleet_df)
Please note different emission factor models may require different information and that the fleet data.frame
needs to be organized accordingly. In our current example for the city of Dublin, the fleet data must include certain columns with the fleet characteristics that are used in the EMEP-EEA emission factor model: vehicle type, Euro standard, technology, and fuel. To check which columns and sets of vehicle characteristics are required by a given emission factor model, check the fleet data vignette.
In the final step, we use the emission_model{gtfs2emis}
function to estimate the hot exhaust emissions of our public transport system. Here, the user needs to pass the results from transport_model()
, some fleet data as described above, and select which emission factor model and pollutants should be considered. The emission_model()
outputs a list
with several vectors
and data.frames
with emission estimates and related information such as vehicle variables (fuel
, age
, tech
, euro
, fleet_composition
), travel variables (slope
, load
, gps
) or pollution (EF
, emi
).
emi_list <- emission_model(tp_model = tp_model, ef_model = "ef_europe_emep", fleet_data = fleet_df, pollutant = c("CO2","PM10"), reference_year = 2020 ) names(emi_list)
Sometimes, working with data stored in a long list can be tricky. For convenience, gtfs2emis
package has a few functions to help users analyze the results of emission estimates. For example, you can use the emis_to_dt()
function to convert the output of emission_model()
from a list
to a data.table
format:
emi_dt <- emis_to_dt(emi_list = emi_list, veh_vars = c("veh_type", "euro", "fuel"), pol_vars = "pollutant" ) head(emi_dt)
Alternatively, the emis_summary()
function helps users summarize emission estimates, aggregating emissions by pollutant, time of the day, vehicle or road segment (spatial).
emi_by_pol <- emis_summary(emi_list = emi_list, by = "pollutant") emi_by_pol
The emis_summary()
function help users summarize emission estimates by either veh_type
, pollutant
, or time
. Using this function, users can easily:
emi_by_veh <- emis_summary(emi_list = emi_list, by = "vehicle") head(emi_by_veh)
It is also possible to summarize emission estimates by other vehicle characteristics AND pollutants.
emi_by_veh <- emis_summary(emi_list = emi_list, by = "vehicle", veh_vars = c("veh_type","euro")) head(emi_by_veh) # plot ggplot(data = emi_by_veh) + geom_col(aes(x = euro, y = as.numeric(emi/1000), fill = pollutant), color=NA, show.legend = FALSE) + labs(y="Total emissions (Kg)", x="Euro standard") + facet_wrap(~pollutant, scales = "free", nrow = 2) + theme_minimal()
emi_by_time <- emis_summary(emi_list = emi_list, by = "time") head(emi_by_time) # plot ggplot(data = emi_by_time) + geom_col(aes(x = factor(timestamp_hour), y = as.numeric(emi/1000), fill = pollutant), color=NA, show.legend = FALSE) + labs(y="Total emissions (Kg)", x="Hour of the day") + facet_wrap(~pollutant, scales = "free", nrow = 2) + theme_minimal()
Finally, users can analyze how public transport emissions are spatially distributed. To do this, the emis_grid()
function helps aggregate emission estimates over any custom spatial vector data (sf POLYGON
).
Let's create a regular hexagonal grid for this example.
# create spatial grid mygrid <- sf::st_make_grid( x = sf::st_make_valid(emi_list$tp_model) , cellsize = 0.25 / 200 , crs= 4326 , what = "polygons" , square = FALSE) ggplot() + geom_sf(data=mygrid) + theme_void()
mygrid_emi <- emis_grid(emi_list, mygrid,time_resolution = "day" ,quiet = FALSE) ggplot() + geom_sf(data = mygrid_emi, aes(fill= as.numeric(CO2_Euro_III)), color=NA) + geom_sf(data = emi_list$tp_model$geometry,color = "black")+ scale_fill_continuous(type = "viridis")+ labs(fill = "CO2 (g)")+ theme_void()
Check out our extra guides:
If you have any suggestions or want to report an error, please visit the package GitHub page.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.