In lhenneman/disperseR: Run HYSPLIT many times in parallel, aggregate to zip code level, plot the results, save the plots.

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

In this vignette you will see how to use the disperseR package. We will walk you through an example step by step. We will be using functions that will automatically download the needed data for you to be able to run a complete analysis for the United States. We also provide other vignettes that show you how we processed the data and how to load datasets one by one (optional).

The step up

We start by importing the packages that we will use in the analysis. Make sure you have them installed. The functions in disperseR are based on the data.table package for faster computation, but this vignette contains many functions from the tidyverse family to illustrate another syntax. First, install a slightly modified version of SplitR.

devtools::install_github( 'lhenneman/SplitR')

library(disperseR) # our package
library(ncdf4)
library(data.table)
library(tidyverse)
library(parallel)
library(sf)
library(viridis)
library(ggplot2)
library(scales)
library(ggsn)
library(gridExtra)
library(ggmap)
library(ggrepel)
library(fst)
library(USAboundaries)

Next, we will use the create_dirs() function that creates a project folder with all the necessary sub folders. We recommend that you begin by setting up your project with this function. Provide the path to where you would like your project to live in the location argument. By default, if you do not specify the path, this function will create a main directory on your desktop. It will also assign paths to string variables in your environment.

This will set up is the following folders and paths to them :

main: the main folder where the project will be located.
input: the input that we need for calculations.
- zcta_500k: ZCTA (A Zip Code Tabulation Area) shape files
- hpbl: monthly global planetary boundary layer files.
- meteo: (reanalysis) meteorology files
output
- hysplit: disperseR output (one file for each emissions event)
- ziplink: files containing ZIP code linkages
- rdata: RData files containing HyADS source-receptor matrices
- exp: exposure per zipcode data
- graph: graphs saved here as pdf when running functions
process: temporary files that are created when the model is running and then deleted

Note that not specifying the location argument will lead to a default Desktop location for the project folder. You can specify another folder by typing: disperseR::create_dirs(location="/Users/username/projectlocation") instead.

You should run this command before each disperseR session because the function defines global environment variables pointing to these directories. Don't worry, your previous data will not be overwritten.

disperseR::create_dirs()

The inputs

The next step is to get the data necessary for the analysis. Let's start by loading data sets that are provided inside the disperseR package.

Get the data with the `get_data()` function.

You can get most of the data required for the analysis by using the following function. This function will download the data necessary and for the data that is already attached with the package it will automatically assign it to variables in your R environment. If you want to load the data step by step check our vignette here. It also contains more information about the data and their sources.

The arguments start.year, start.month,end.year, and end.month are necessary to download the meteorology reanalysis files. They will be downloaded if they are not already in the meteo_dir folder. The reanalysis met files are about 120 MB each.

If you, for example, you want to download files for Jan-March 2005, you just have to use the get_data() function and set data = "all", start.year = "2005", start.month = "01", end.year = "2005", and end.month = "03". See below.

disperseR::get_data(data = "all", 
  start.year = "2005", 
  start.month = "11", 
  end.year = "2006", 
  end.month = "02")

The units data

The units data should be loaded separately so that you are able to select which units to process.

This package contains annual emissions and stack height data from EPA's Air Markets Program Data and the Energy Information Agency for years 2003-2012. Again, if you would like to know how these data were prepared please see the special vignette that we have attached to this package. Access it here

You can visualize the data like this in RStudio:

View(disperseR::units)

Now, we need to select the power plants whose impact we will analyse. In this case, we'll use the two units in 2005 with the greatest SOx emissions, and two with the greatest SOx emissions in 2006. You are free to choose your own units. You can see that unit "3136-1" is on top of the ranking in both years.

unitsrun2005 <- disperseR::units %>% 
  dplyr::filter(year == 2005) %>% # only get data for 2005
  dplyr::top_n(2, SOx)  # sort and take the two rows with the biggest value for SOx

unitsrun2006 <- disperseR::units %>% 
  dplyr::filter(year == 2006) %>%  # only get data for 2006
  dplyr::top_n(2, SOx)  # sort and take the two rows with the biggest value for SOx

head(unitsrun2005)

# append together and transform to data table 
unitsrun<-data.table::data.table(rbind(unitsrun2005, unitsrun2006))

Analysis

Define time periods to be run

HYSPLIT, as applied here, tracks air parcels emitted at certain times and locations. disperseR refers to these as emissions events. Once HYSPLIT has been run for each emissions event, the simulated parcel locations are aggregated by source, time, and location. The functions below are written to enable runs of many emissions events.

To define an object that includes all emission events in a given time period, we can use the helper function define_inputs(). This takes as inputs a starting and ending day, and outputs a table of values whose rows will later correspond to inputs into the main disperseR functions. The following command combines the units defined above with four times a day for January-July in 2005 (we show an example for this short period of time here to keep the computation manageable for a desktop computer). Four daily emissions events are defined by the startday argument, and duration = 120 denotes that the emitted air parcels are tracked for 120 hours (5 days). Note that most application of HyADS to date have used duration = 240 (10 days); we use 5 days here to keep this example snappy.

The result of running this function is a data.table that you can see below.

input_refs <- disperseR::define_inputs(units = unitsrun,
  startday = '2005-11-01',
  endday = '2006-02-28',
  start.hours =  c(0, 6, 12, 18),
  duration = 120)

head(input_refs)

Run HYSPLIT

The following examples show how to run a small subset (chosen to provide workable examples on a laptop) of emissions events, link to spatial domains---U.S. ZIP codes, U.S. counties, or a regular grid---and plot the results.

The run_disperser_parallel() function runs HYSPLIT for each emissions event. Inputs defined above, including the projected ZCTA shape file (zcta), the ZIP-ZCTA crosswalk (crosswalk), the planetary boundary layer raster (pblheight), are necessary in the function call.

If you get an error running HYSPLIT, it is likely because the meteorology files did not download correctly. Check your meteo_dir to ensure all files are present and at least 100MB in size.

Below we subset the data to run the process on a local computer. We only choose the first day and first hour of each month for each of the units that we chose. You can choose the events you are interested in.

input_refs_subset <- input_refs[format(as.Date(input_refs$start_day,
  format = "%Y-%m-%d"),
  format = "%d") == "01" & start_hour == 0]

input.refs should be the data table that is the result of the define_input() function. Remember that here we are running the model just on a subset so we will set input.refs = input_refs_subset

The pbl.height argument is NULL by default but you should provide it for the function to run properly. We use the pblheight data.

The package has the possibility to use two types of species. The default one is species = 'so2', but you can also use particulate sulfate species = 'so4p'.

The parallel::detectCores() function automatically detects the number of cores on your machine. If you would like to change on how many cores the model should be run please specify the number of cores. Here, for example we specified the number of cores to be the number of the cores on the machine. If you would like to run this model sequentially please set mc.cores = 1.

It is possible that running the below code with output a warning "WARNING: map background file not found ../graphics/arlmap". It is safe to ignore it.

hysp_raw <- disperseR::run_disperser_parallel(input.refs = input_refs_subset,
   pbl.height = pblheight,
   species = 'so2',
   proc_dir = proc_dir,
   overwrite = FALSE, ## FALSE BY DEFAULT
   npart = 100,
   keep.hysplit.files = FALSE, ## FALSE BY DEFAULT
   mc.cores = parallel::detectCores())

Link results to spatial domains

Most current implementations of HyADS, instead of linking dispersion patterns from individual emissions events, link the patterns by month. With the link_all_units() function, users can link all air parcels to three spatial domains---U.S. ZIP codes, U.S. counties, or a regular grid---by month for specified units. Here, we define the variables yearmons with combinations of years and months. link_all_units() uses the support functions disperser_link_zips(), disperser_link_counties(), disperser_link_grids() to read in raw HYSPLIT output files produced by the run_disperser_parallel() and links the corresponding parcel locations to the appropriate spatial domain. The user selects the spatial domain through the link.to argument---it can be set to one of "zips", "counties", or "grids".

link_all_units() can be set to paralelize the runs by splitting different months on different cores, but you can make it serial just by setting the mc.cores = 1. The result of link_all_units() is a data.table of ZIP codes, counties, or grids and relative contributions N, and an identical .csv file is saved to the folder with a path specified by the ziplink_dir variable. N is not weighted by the unit's emissions.

Below we present how to use link_all_units() for chosen units. First, we create an argument (called yearmons here) denoting the year and month combinations for which we would like to calculate unweighted exposure. Here we only choose July 2005 through June 2006 to cover all of the example runs specified above, and we create the variable using the get_yearmon() function.

yearmons <- disperseR::get_yearmon(start.year = "2005", 
  start.month = "07", 
  end.year = "2006", 
  end.month = "06")

Let's look again at our unitsrun data set.

unitsrun

We only have three units in our unitsrun data set. "3136-1", "3136-2" and "3149-1". We will run link_all_units() for all of them (it is possible to pass link_all_units() a single unit if desired). The function will return only the values for units and months that could have been linked to one another.

If you want to run the function for a specific start and end date you can specify the start.date and the end.date arguments. For example, start.date="2005-01-02" for 2 January 2005. These arguments are set to NULL by default and the function computes the start and the end dates using the year.mons provided.

pbl.height = pblheight by default but you can change it. The same applies to crosswalk. = crosswalk (crosswalk. is only required for linking to ZIP codes). counties. is required for linking to counties, and is set as counties. = USAboundaries::us_counties( ) by default. If you'd only like to save time and only link to ZIP codes or counties in a subset of the country, provide crosswalk. or counties. that correspond to the domain of interest.

Changing duration.run.hours to values less than that applied in run_disperser_parallel() allows the exploration of air parcel locations in closer time proximity to each emissions event.

res.link sets the grid resolution in meters (default is 12000 or 12km). Since parcel locations are first assigned to a grid before further spatial allocation, this setting has the potential to impact all results. A higher res.link with link.to = "grids" will result in faster run time and smaller output files.

linked_zips <- disperseR::link_all_units(
  units.run = unitsrun,
  link.to = 'zips',
  mc.cores = parallel::detectCores(),
  year.mons = yearmons,
  pbl.height = pblheight,
  crosswalk. = crosswalk,
  duration.run.hours = 20,
  res.link = 12000,
  overwrite = FALSE)

# link all units to counties
linked_counties <- disperseR::link_all_units(
  units.run=unitsrun,
  link.to = 'counties',
  mc.cores = parallel::detectCores(),
  year.mons = yearmons,
  pbl.height = pblheight,
  counties. = USAboundaries::us_counties( ),
  crosswalk. = NULL,
  duration.run.hours = 20,
  overwrite = FALSE)

# link all units to grids
linked_grids <- disperseR::link_all_units(
  units.run=unitsrun,
  link.to = 'grids',
  mc.cores = parallel::detectCores(),
  year.mons = yearmons,
  pbl.height = pblheight,
  crosswalk. = NULL,
  duration.run.hours = 20,
  overwrite = FALSE)


head(linked_zips)
head(linked_counties)
head(linked_grids)

It could be the case that data could not have been linked for a specific month and unit. Below we show how to take a look at the combinations of units and months that were successfully linked. We can easily do it in the following way:

unique(linked_zips$comb)

Visualization of the results.

At this point, it makes sense to take a look at the results to see which ZIP codes are impacted by emissions from the facility whose impacts we modeled above. We can see how the impact looks by looking at numbers and plots. We will first just look at the numbers. For this purpose, we created a function create_impact_table_single() that generates a specific data.table.

And now we can use the create_impact_table_single() function to see the computed impact for a single month and a single unit.

Keep in mind, since we only modeled a subset of the potential model runs defined in input_refs(), the impacts contained in impact_table_single and plotted below include only results from a single emissions event. Further, the plotted values are "HyADS raw exposure", i.e., not weighted by emissions. So, a value of 0.01 represents 10x more exposure than a value of 0.001, but only in relation to unit 3136-1.

We select the spatial link based on the link.to argument. The function returns data.tables that include an sf geometry column for easy plotting.

impact_table_zip_single <- disperseR::create_impact_table_single(
  data.linked=linked_zips,
  link.to = 'zips',
  data.units = unitsrun,
  zcta.dataset = zcta_dataset,
  map.unitID = "3136-1",
  map.month = "200511",
  metric = 'N')
impact_table_county_single <- disperseR::create_impact_table_single(
  data.linked=linked_counties,
  link.to = 'counties',
  data.units = unitsrun,
  counties. = USAboundaries::us_counties( ),
  map.unitID = "3136-1",
  map.month = "200511",
  metric = 'N')
impact_table_grid_single <- disperseR::create_impact_table_single(
  data.linked=linked_grids,
  link.to = 'grids',
  data.units = unitsrun,
  map.unitID = "3136-1",
  map.month = "200511",
  metric = 'N')

head(impact_table_zip_single)

In a similar way we can just plot the results using the plot_impact_single() function. A customized title can be set using the plot.name argument. Additional inputs can be passed to ggplot2's theme function to customize your plot.

zoom = T by default and is authomatically adjusted depending on the plot values. You can set zoom = F to display the map of the whole country. Note that the linked grids extend beyond the US boarders because any grid across the world may be impacted by the given facilities.

link_plot_zips <- disperseR::plot_impact_single(
  data.linked = linked_zips,
  link.to = 'zips',
  map.unitID = "3136-1",
  map.month = "20061",
  data.units = unitsrun,
  zcta.dataset = zcta_dataset,
  metric = 'N',
  graph.dir = graph_dir,
  zoom = T, # TRUE by default
  legend.name = 'HyADS raw exposure',
  # other parameters passed to ggplot2::theme()
  axis.text = element_blank(),
  legend.position = c( .75, .15))
link_plot_grids <- disperseR::plot_impact_single(
  data.linked = linked_grids,
  link.to = 'grids',
  map.unitID = "3136-1",
  map.month = "20061",
  data.units = unitsrun,
  metric = 'N',
  graph.dir = graph_dir,
  zoom = F, # TRUE by default
  legend.name = 'HyADS raw exposure',
  # other parameters passed to ggplot2::theme()
  axis.text = element_blank(),
  legend.position = c( .75, .15))
link_plot_counties <- disperseR::plot_impact_single(
  data.linked = linked_counties,
  link.to = 'counties',
  map.unitID = "3136-1",
  map.month = "20061",
  counties. = USAboundaries::us_counties( ),
  data.units = unitsrun,
  metric = 'N',
  graph.dir = graph_dir,
  zoom = T, # TRUE by default
  legend.name = 'HyADS raw exposure',
  # other parameters passed to ggplot2::theme()
  axis.text = element_blank(),
  legend.position = c( .75, .15))

link_plot_zips
link_plot_grids
link_plot_counties

Combine all results into RData file.

So far, we have run two worker functions important to the implementation of HyADS:

run_disperser_parallel() ran HYSPLIT for unit-date-time emission event combinations and saved an output file for each gathered output files from run_disperser_parallel().
link_all_units() function grouped them by month and unit, and linked the HYSPLIT parcel locations contained in the files to various three different scales.

In practice, these two worker functions may need to be implemented in parallel R sessions on a cluster to improve efficiency. It helps for future analyses to gather the relevant monthly files and save them as a single RData file. This is made possible by combine_monthly_links(), which saves an RData file of annual monthly results to the rda_dir. Names of the resulting objects give the year-month combinations in each object.

combined_ziplinks <- disperseR::combine_monthly_links(
  month_YYYYMMs = yearmons,
  link.to = 'zips',
  filename = 'hyads_vig_unwgted_zips.RData')
combined_countylinks <- disperseR::combine_monthly_links(
  month_YYYYMMs = yearmons,
  link.to = 'counties',
  filename = 'hyads_vig_unwgted_counties.RData')
combined_gridlinks <- disperseR::combine_monthly_links(
  month_YYYYMMs = yearmons,
  link.to = 'grids',
  filename = 'hyads_vig_unwgted_grids.RData')
names(combined_ziplinks)

Calculate and extract useful information from the results

Up until now, this vignette has focused on producing unweighted HyADS results. Once the procedure above has been followed, it is fairly straightforward to weight the unweighted HyADS results by emissions (or other quantity) and aggregate the results by source, receptor, and/or time period.

Weight the results by emissions

The create_impact_table_weighted() function takes as input the .RData file created by combine_monthly_links() and monthly power plant emissions. The user can select the disperseR year (year.D), the emissions year (year.E), the weighting pollutant (although it does not necessarily have to be a pollutant), the source aggregation (you can choose source.agg = "total", source.agg = "facility", or source.agg = "unit"), the time aggregation (time.agg = "month" or time.agg = "year").

Here, we first calculate the annual HyADS emissions weighted exposure for all emissions modeled above and plot the results. The plots require a large amount of working memory, and may require you to clear your image history (if using RStudio).

exp_ann_unit_zip <- disperseR::calculate_exposure(
  year.E = 2005,
  year.D = 2005,
  link.to = 'zips',
  pollutant = 'SO2.tons',
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_zips.RData"),
  exp_dir = exp_dir,
  units.mo = PP.units.monthly1995_2017,
  source.agg = 'unit',
  time.agg = 'month',
  return.monthly.data = T)

exp_ann_unit_grids <- disperseR::calculate_exposure(
  year.E = 2005,
  year.D = 2005,
  link.to = 'grids',
  pollutant = 'SO2.tons',
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_grids.RData"),
  exp_dir = exp_dir,
  units.mo = PP.units.monthly1995_2017,
  source.agg = 'unit',
  time.agg = 'month',
  return.monthly.data = T)

exp_ann_unit_counties <- disperseR::calculate_exposure(
  year.E = 2005,
  year.D = 2005,
  link.to = 'counties',
  pollutant = 'SO2.tons',
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_counties.RData"),
  exp_dir = exp_dir,
  units.mo = PP.units.monthly1995_2017,
  source.agg = 'unit',
  time.agg = 'month',
  return.monthly.data = T)

We can plot the results as following using the plot_impact_weighted() function. Again, a customized title can be set using the plot.name argument, and addional arguments can be passed to ggplot2's theme function to customize.

Same as before, zoom = T by default and is authomatically adjusted depending on the plot values. You can set zoom = F to display the map of the whole country.

Please note that the produced graph will be automatically saved in the folder specified in the graph.dir argument. Here we set graph.dir = graph_dir. graph_dir should have been automatically created by the create_dirs() function.

zip_exp_ann_plot <- disperseR::plot_impact_weighted(
  data.linked = exp_ann_unit_zip,
  data.units = unitsrun,
  link.to = 'zips',
  zcta.dataset = zcta_dataset,
  time.agg = 'year',
  metric = 'hyads',
  legend.name = 'Aggregate HyADS exposure',
  zoom = T, # TRUE by default
  graph.dir = graph_dir,
  map.month = NULL, # NULL by default change if time.agg = 'month'
  # other parameters passed to ggplot2::theme()
  axis.text = element_blank(),
  legend.position = c( .75, .15)) # 0 by default

counties_exp_ann_plot <- disperseR::plot_impact_weighted(
  data.linked = exp_ann_unit_counties,
  data.units = unitsrun,
  link.to = 'counties',
  counties. = USAboundaries::us_counties( ),
  time.agg = 'year',
  metric = 'hyads',
  legend.name = 'Aggregate HyADS exposure',
  zoom = T, # TRUE by default
  graph.dir = graph_dir,
  map.month = NULL, # NULL by default change if time.agg = 'month'
  # other parameters passed to ggplot2::theme()
  axis.text = element_blank(),
  legend.position = c( .75, .15)) # 0 by default

grids_exp_ann_plot <- disperseR::plot_impact_weighted(
  data.linked = exp_ann_unit_grids,
  data.units = unitsrun,
  link.to = 'grids',
  time.agg = 'year',
  metric = 'hyads',
  legend.name = 'Aggregate HyADS exposure',
  zoom = T, # TRUE by default
  graph.dir = graph_dir,
  map.month = NULL, # NULL by default change if time.agg = 'month'
  # other parameters passed to ggplot2::theme()
  axis.text = element_blank(),
  legend.position = c( .75, .15)) # 0 by default

zip_exp_ann_plot
counties_exp_ann_plot
grids_exp_ann_plot

Let's now see how to plot a monthly exposure.

Results are saved to the exp_dir and returned, but return.monthly.data must be set to true to return data when time.agg = 'month' (the default was set to save working memory space).

It is very important to provide a month for which we have available data

exp_mon_unit_zip <- disperseR::calculate_exposure(
  year.E = 2005,
  year.D = 2005,
  link.to = 'zips',
  pollutant = 'SO2.tons',
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_zips.RData"),
  exp_dir = exp_dir,
  units.mo = PP.units.monthly1995_2017,
  source.agg = 'unit',
  time.agg = 'month',
  return.monthly.data = T)

exp_mon_unit_grids <- disperseR::calculate_exposure(
  year.E = 2005,
  year.D = 2005,
  link.to = 'grids',
  pollutant = 'SO2.tons',
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_grids.RData"),
  exp_dir = exp_dir,
  units.mo = PP.units.monthly1995_2017,
  source.agg = 'unit',
  time.agg = 'month',
  return.monthly.data = T)

exp_mon_unit_counties <- disperseR::calculate_exposure(
  year.E = 2005,
  year.D = 2005,
  link.to = 'counties',
  pollutant = 'SO2.tons',
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_counties.RData"),
  exp_dir = exp_dir,
  units.mo = PP.units.monthly1995_2017,
  source.agg = 'unit',
  time.agg = 'month',
  return.monthly.data = T)

zip_exp_mon_plot <- disperseR::plot_impact_weighted(
  data.linked = exp_mon_unit_zip,
  data.units = unitsrun,
  zcta.dataset = zcta_dataset,  
  time.agg = 'month',
  map.month = "200511", 
  metric = 'hyads',
  legend.name = 'Montly HyADS exposure', 
  zoom = T, # TRUE by default
  graph.dir = graph_dir)

zip_exp_mon_plot

Plot unit-specific impacts over time

Next, we calculate the monthly emissions weighted HyADS exposure (again, only from the select few runs we initiated above). The code below plots the monthly exposure contributions of the three units in ZIP codes 17815, 08210, 13849, 17229. The location of the zip codes can be seen on the small map on the right side of the plot. Please note that the axis values are different for different ZIP codes.

zip_exp_unit_mon2005 <- disperseR::calculate_exposure(
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_zips.RData"),
  units.mo = PP.units.monthly1995_2017,
  link.to = 'zips',
  year.E = 2005,
  year.D = 2005,
  pollutant = 'SO2.tons',
  source.agg = 'unit', # note! 
  time.agg = 'month',
  exp_dir = exp_dir,
  return.monthly.data = T)


zip_exp_unit_mon2006 <- disperseR::calculate_exposure(
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_zips.RData"),
  units.mo = PP.units.monthly1995_2017,
  link.to = 'zips',
  year.E = 2006,
  year.D = 2006,
  pollutant = 'SO2.tons',
  source.agg = 'unit', # note! 
  time.agg = 'month',
  exp_dir = exp_dir,
  return.monthly.data = T)

zip_exp_unit_mon <- rbind(zip_exp_unit_mon2005, zip_exp_unit_mon2006)

And now we can plot the results

zipcodes <- c("13039","21798", "03804")

zip_exp_unit <- disperseR::plot_impact_unit(
  data.linked = zip_exp_unit_mon, 
  zip.codes = zipcodes,
  graph.dir = graph_dir)

Rank facilities.

In the examples provided above, we selected coal units a priori to analyze. However, a benefit of HyADS over more complex models is the ability to easily rank-order emissions sources to identify the sources that contribute the most to exposure in a given area.

The following code snippet applies the calc_exposure() function to calculate annual impacts by unit (here, only the two units selected above), then rank-orders their population-weighted exposures on Pennsylvania using the rankfacs_by_popwgt_location() function.

zip_exp_ann_unit <- disperseR::calculate_exposure(
  year.E = 2005,
  year.D = 2005,
  link.to = 'zips',
  pollutant = 'SO2.tons',
  rda_file = file.path(rdata_dir, "hyads_vig_unwgted_zips.RData"),
  exp_dir = exp_dir,
  units.mo = PP.units.monthly1995_2017,
  source.agg = 'unit',
  time.agg = 'year')
# add column year for proper merge
zip_exp_ann_unit[, year := 2005]

In this function, the input ZIP-code linkage dataset is defined by zip_exp_ann_unit; the rank.by argument defines the variables in zip_exp_ann_unit to rank by—it is possible to define multiple values (e.g., rank.by = c( 'hyads', 'SOx')) to rank by multiple ranking variables if they are available; finally, state.value, city.value, and zip.value can be defined to limit the area of influence (here, we only rank these units’ influences on Pennsylvania with state.value = 'PA'). For now, this function is only available for ZIP codes.

unitRanks2005 <- disperseR::rankfacs_by_popwgt_location(
  data.linked = zip_exp_ann_unit,
  crosswalk. = crosswalk,
  rank.by = c('hyads'),
  state.value = 'PA', 
  year = 2005)

unitRanks2005

Plot ranked facilities.

plotUnitsRanked() works very similar to other plotting function in disperseR. Choose the dataset of ranked facilities as well as year.

The function below results in two graphs. One is a map showing where the units are and showing their rank and the other one contains bar plots.

plotUnitsRanked <- disperseR::plot_units_ranked(
  data.units = unitsrun, 
  data.ranked = unitRanks2005,
  year = 2005, 
  graph.dir = graph_dir)

plotUnitsRanked

Please note graphical output is authomatically saved to the graph_dir by the plotting functions.

lhenneman/disperseR documentation built on Nov. 14, 2021, 6:24 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

lhenneman/disperseR
Run HYSPLIT many times in parallel, aggregate to zip code level, plot the results, save the plots.

In lhenneman/disperseR: Run HYSPLIT many times in parallel, aggregate to zip code level, plot the results, save the plots.

The step up

The inputs

Get the data with the `get_data()` function.

The units data

Analysis

Define time periods to be run

Run HYSPLIT

Link results to spatial domains

Visualization of the results.

Combine all results into RData file.

Calculate and extract useful information from the results

Weight the results by emissions

Plot unit-specific impacts over time

Rank facilities.

Plot ranked facilities.

R Package Documentation

Browse R Packages

We want your feedback!

lhenneman/disperseR Run HYSPLIT many times in parallel, aggregate to zip code level, plot the results, save the plots.

In lhenneman/disperseR: Run HYSPLIT many times in parallel, aggregate to zip code level, plot the results, save the plots.

The step up

The inputs

Get the data with the get_data() function.

The units data

Analysis

Define time periods to be run

Run HYSPLIT

Link results to spatial domains

Visualization of the results.

Combine all results into RData file.

Calculate and extract useful information from the results

Weight the results by emissions

Plot unit-specific impacts over time

Rank facilities.

Plot ranked facilities.

R Package Documentation

Browse R Packages

We want your feedback!

lhenneman/disperseR
Run HYSPLIT many times in parallel, aggregate to zip code level, plot the results, save the plots.

Get the data with the `get_data()` function.