knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
disperseR is an R package based on the hyspdisp and SplitR packages. Note that many functions in disperseR are only slightly redesigned versions of functions from those two packages.
disperseR runs HYSPLIT many times and calculates the HYSPLIT Average Dispersion (HyADS) exposure metric. The results can then be aggregated to ZIP code level to create national estimates of exposure from various sources. disperseR also includes functions that make it easy to plot the results.
Thanks to the hyspdisp package, plumes from several power plants can, for example, be tracked for many days and their cumulative impacts estimated. disperseR leverages the hyspdisp package and gives the user a friendlier way to interact with it.
disperseR
is a new version of the hyspdisp
package. What has been improved?
Input data manipulation is handled at the package level. The user only has to read the data in using the disperseR::get_data()
function. We show how to do it in the main vignette.
We also created additional vignettes should the user want to see how the attached data was preprocessed. We show every single step of preprocessing starting from the step of data download. This is key for reproducible research.
A clear project structure and automation keep the user from getting lost in a maze of folders. The disperseR::create_dirs() function automatically creates the whole project structure, either in the specified location or on the desktop. It also assigns the path of each folder to a variable in the R environment. These paths are then used by other disperseR functions. Note that disperseR::create_dirs() does not overwrite the project folders if they already exist in the specified location.
Until now, the units data for different years were stored separately, and only four years of data were available with the package. Data for 1995 to 2015 have now been added and aggregated into a single data set called units, which is attached to disperseR.
The ZIP code linkage procedure requires a ZCTA-to-ZIP code crosswalk file. These crosswalk data have also been attached to the package. They not only provide the crosswalk between ZCTAs and ZIP codes but also contain information about population sizes.
Previously, the user could only run the analysis for one year at a time. disperseR can process all the needed years together.
Graph functions now have many automatic features.
Documentation has been much improved. The ?FUNCTION
syntax should work to access help files.
We know it is sometimes difficult to start working with a new package, especially if you are not very familiar with R. We also believe in reproducible research. This is why we have included several vignettes to help you with the process.
Unfortunately, disperseR requires a lot of data to run the models, and we could not include all the data sets with the package. For example, the ZCTA shapefile is more than 140 MB. You can download such files with the disperseR::get_data() function. The following data are attached to the package:
crosswalk: The ZIP code linkage procedure requires a ZCTA-to-ZIP code crosswalk file. ZCTAs are not exact geographic matches to ZIP codes, and multiple groups compile and maintain crosswalk files. We used the crosswalk maintained by UDS Mapper and preprocessed it, adding information about population size. While not necessary for the HYSPLIT model or the processing of its outputs, population-weighted exposure metrics allow for direct comparisons between power plants. If you would like to know more about how this crosswalk was prepared, we have attached a vignette that explains it. You can see it by clicking here.
PP.units.monthly1995_2017: The disperseR package also includes monthly power plant emissions, load, and heat input data. We currently do not have a vignette for these data because of server problems on the data owner's side. This will be updated as soon as possible.
units (data for 1995-2015): This package contains annual emissions and stack height data from EPA's Air Markets Program Data and the Energy Information Administration. Again, if you would like to know how these data were prepared, please see the special vignette that we have attached to this package. You can see it by clicking here.
zipcode coordinate data: The disperseR
package contains a data set with coordinates of ZIP codes. This might be useful for plotting, but it is not necessary as it will be used automatically by our plotting functions where required. Please click here for more information.
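To illustrate why the population column in the crosswalk matters, here is a minimal sketch of a population-weighted average exposure using toy data. The data frames, the column names (ZIP, ZCTA, population, exposure) and all values are made up for illustration; they are not the structure of the attached crosswalk, and disperseR's own calculations may differ.

```r
# Toy crosswalk: hypothetical ZIP codes, ZCTAs and population sizes.
crosswalk_toy <- data.frame(
  ZIP        = c("00001", "00002", "00003"),
  ZCTA       = c("00001", "00001", "00003"),
  population = c(1000, 500, 2500),
  stringsAsFactors = FALSE
)

# Toy exposure values per ZIP code (arbitrary units).
exposure_toy <- data.frame(
  ZIP      = c("00001", "00002", "00003"),
  exposure = c(2, 4, 1),
  stringsAsFactors = FALSE
)

# Link exposure to population through the ZIP code column.
linked <- merge(exposure_toy, crosswalk_toy, by = "ZIP")

# Population-weighted mean exposure across all ZIP codes.
pw_exposure <- weighted.mean(linked$exposure, w = linked$population)
pw_exposure  # 1.625
```

Weighting by population means that a ZIP code with many residents contributes more to the summary than a sparsely populated one, which is what makes the metric comparable between power plants.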
disperseR
has functions that let you plot your results. Here is just one of many examples.
First, not having the Rcpp
package installed on your computer can lead to problems with disperseR
installation (problems with version installation). We recommend you first type the following into your R console.
install.packages("Rcpp")
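If you prefer to check what is missing before installing anything, the following sketch uses only base R. The helper name missing_packages() is hypothetical and not part of disperseR, and the list of packages checked here (Rcpp and devtools) is simply taken from the installation steps in this README.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
# system.file() returns an empty string when a package cannot be found.
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
  pkgs[!found]
}

needed <- c("Rcpp", "devtools")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install them
}
```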
Please note: If you are using a Windows machine and you want R to render the vignettes for you, you will need to download Rtools from here. If you prefer to avoid this step, you can go ahead and proceed with the installation, as we have added links to the already rendered vignettes on GitHub.
Continue by typing the following in your R console. This will download the package from GitHub, install it and build the vignettes. This might take some minutes.
devtools::install_github("lhenneman/disperseR", force = TRUE, build_vignettes = TRUE)
Load disperseR
into your R session.
library(disperseR)
The audiracmichelle/disperser Docker image has RStudio and all the R and Unix dependencies already installed, so you can run disperseR quickly and reliably. The image is based on the rocker project (https://www.rocker-project.org/).
More information on the disperseR Docker image can be found on Docker Hub at https://hub.docker.com/r/audiracmichelle/disperser or in the GitHub repository https://github.com/audiracmichelle/docker_disperser.
You should be able to open the main vignette like this. RStudio will display it for you.
vignette("Vignette_DisperseR")
The rest of the vignettes can be accessed by typing the corresponding commands.
vignette("Vignette_Crosswalk_Preparation")
vignette("Vignette_Load_Data_One_by_One")
vignette("Vignette_Units_Preparation")
vignette("Vignette_Zip_Code_Coordinate_Data_Preparation")
vignette("Vignette_Planetary_Layers_Data_Preparation")
vignette("Vignette_ZCTA_Shapefile_Preparation")
NOTE: If this does not work for you, we have rendered all the vignettes and you can access them from your browser by clicking the corresponding hyperlinks in the "Vignettes attached with the package" section above.
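One way to find out whether the vignettes were built during installation is to list them from R. The sketch below relies on utils::vignette(package = ...), which is standard R; the wrapper list_vignettes() is a hypothetical helper, not a disperseR function. It returns an empty vector when the package is not installed or has no built vignettes.

```r
# Hypothetical helper: names of the built vignettes of an installed package.
list_vignettes <- function(pkg) {
  # An empty string from system.file() means the package is not installed.
  if (!nzchar(system.file(package = pkg))) {
    return(character(0))
  }
  info <- utils::vignette(package = pkg)
  unname(info$results[, "Item"])
}

built <- list_vignettes("disperseR")
if (length(built) == 0) {
  message("No built vignettes found; use the rendered versions on GitHub instead.")
}
```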
The vignettes will instruct you to do so, but you can already start by creating the project folder. Use the disperseR::create_dirs() function for this and point disperseR to the location where you want your project to be created. For example, the following code will create the project in the user's Dropbox. If you do not specify a location and just type disperseR::create_dirs(), it will still work and the project will be created on your desktop.
disperseR::create_dirs(location = "/Users/username/Dropbox")
This will set up the following folders and the paths to them:
main: the main folder where the project will be located.
  input: the input that we need for calculations.
    zcta_500k: ZCTA (ZIP Code Tabulation Area) shape files
    hpbl: monthly global planetary boundary layer files
    meteo: (reanalysis) meteorology files
  output
    hysplit: disperseR output (one file for each emissions event)
    ziplink: files containing ZIP code linkages
    rdata: RData files containing HyADS source-receptor matrices
    exp: exposure per ZIP code data
    graph: graphs saved here as PDF when running functions
  process: temporary files that are created while the model is running and then deleted
Here is a screenshot of what it should look like:
And these are the variables with paths that will appear in your environment.
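To make the layout concrete, here is a base R sketch that builds the same folder tree in a temporary directory and returns a named vector of paths. It is not the disperseR implementation: make_project_dirs() is a hypothetical helper, and apart from meteo_dir and graph_dir, which this README mentions, the variable names are assumptions chosen for illustration.

```r
# Hypothetical helper that mimics the folder layout described above.
make_project_dirs <- function(location) {
  main   <- file.path(location, "main")
  input  <- file.path(main, "input")
  output <- file.path(main, "output")

  paths <- c(
    main_dir    = main,
    input_dir   = input,
    zcta_dir    = file.path(input, "zcta_500k"),
    hpbl_dir    = file.path(input, "hpbl"),
    meteo_dir   = file.path(input, "meteo"),
    output_dir  = output,
    hysp_dir    = file.path(output, "hysplit"),
    ziplink_dir = file.path(output, "ziplink"),
    rdata_dir   = file.path(output, "rdata"),
    exp_dir     = file.path(output, "exp"),
    graph_dir   = file.path(output, "graph"),
    proc_dir    = file.path(main, "process")
  )

  # dir.create() leaves an existing folder and its contents untouched;
  # showWarnings = FALSE only silences the "already exists" warning.
  for (p in paths) {
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  paths
}

project <- make_project_dirs(tempdir())
project[["meteo_dir"]]
```

Calling the helper a second time on the same location is harmless, which mirrors the behaviour described for disperseR::create_dirs() when the project folders already exist.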
You can get most of the data required for the analysis by using the following function. It downloads the necessary data and, for data already attached to the package, automatically assigns them to variables in your R environment. If you want to load the data step by step, check our vignette here. It also contains more information about the data and their sources.
The arguments start.year, start.month, end.year, and end.month are necessary to download the meteorology reanalysis files. The files will be downloaded if they are not already in the meteo_dir folder. The reanalysis met files are about 120 MB each.
If, for example, you want to download files for January-March 2005, you just have to use the get_data() function and set data = "all", start.year = "2005", start.month = "01", end.year = "2005", and end.month = "03". See below.
disperseR::get_data(data = "all", start.year = "2005", start.month = "01", end.year="2005", end.month="03")
If it runs correctly, you should see the following in your R environment.
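Because each reanalysis file is about 120 MB, it can be useful to estimate the download size before calling get_data(). The sketch below enumerates the months in a requested period with base R. It assumes one meteorology file per month, which this README does not state explicitly, and met_months() is a hypothetical helper rather than a disperseR function.

```r
# Hypothetical helper: list the months ("YYYY-MM") covered by a request.
met_months <- function(start.year, start.month, end.year, end.month) {
  start <- as.Date(paste(start.year, start.month, "01", sep = "-"))
  end   <- as.Date(paste(end.year, end.month, "01", sep = "-"))
  stopifnot(start <= end)
  format(seq(start, end, by = "month"), "%Y-%m")
}

wanted <- met_months("2005", "01", "2005", "03")
wanted                              # "2005-01" "2005-02" "2005-03"

# Rough download size, assuming one file of about 120 MB per month.
approx_mb <- length(wanted) * 120
approx_mb                           # 360
```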
The units data should be loaded separately so that you are able to select which units to process.
This package contains annual emissions and stack height data from EPA's Air Markets Program Data and the Energy Information Administration for the years 1995-2015. Again, if you would like to know how these data were prepared, please see the special vignette that we have attached to this package. Access it here.
You can visualize the data like this in RStudio:
View(disperseR::units)
Please note: If you decide to use a specific unit over several years, you must have a row of data for each year. For example, this is our data from the main vignette. Look at row 1 and row 3. They contain data for the same unit but for different years.
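The one-row-per-year requirement can be checked before running the model. The sketch below uses a toy data frame; the column names ID and year, the unit identifiers and the helper has_all_years() are assumptions made for illustration and may not match the columns of disperseR::units.

```r
# Toy stand-in for a units table: rows 1 and 3 describe the same unit
# in two different years.
toy_units <- data.frame(
  ID   = c("A-1", "B-1", "A-1"),
  year = c(2005, 2005, 2006),
  stringsAsFactors = FALSE
)

# Hypothetical helper: does unit `id` have a row for every requested year?
has_all_years <- function(units, id, years) {
  all(years %in% units$year[units$ID == id])
}

has_all_years(toy_units, "A-1", 2005:2006)  # TRUE
has_all_years(toy_units, "B-1", 2005:2006)  # FALSE: no row for 2006
```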
We suggest you have a look at our main vignette here for details about the analysis.
Graphical output is automatically saved to the graph_dir folder by the plotting functions.
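If you want to save a plot of your own alongside the automatically generated ones, base R's pdf() device is enough. The sketch below writes to a temporary folder that stands in for graph_dir; the file name and the plotted values are arbitrary and purely illustrative.

```r
# Stand-in for the project's graph folder; in a real project you would use
# the graph_dir path created for you.
graph_dir <- file.path(tempdir(), "graph")
dir.create(graph_dir, recursive = TRUE, showWarnings = FALSE)

out_file <- file.path(graph_dir, "example_plot.pdf")

# Open a PDF device, draw, and close the device so the file is written.
pdf(out_file, width = 7, height = 5)
plot(1:10, (1:10)^2, type = "b",
     xlab = "x (arbitrary)", ylab = "y (arbitrary)",
     main = "Example plot saved as PDF")
invisible(dev.off())

file.exists(out_file)
```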
NCEP Reanalysis data provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, from their website at https://www.esrl.noaa.gov/psd/
Mesinger, F., G. DiMego, E. Kalnay, K. Mitchell, P.C. Shafran, W. Ebisuzaki, D. Jović, J. Woollen, E. Rogers, E.H. Berbery, M.B. Ek, Y. Fan, R. Grumbine, W. Higgins, H. Li, Y. Lin, G. Manikin, D. Parrish, and W. Shi, 2006: North American Regional Reanalysis. Bull. Amer. Meteor. Soc., 87, 343–360, https://doi.org/10.1175/BAMS-87-3-343