11 Sep 2019
disperseR is an R package based on the hyspdisp package and the SplitR package. It is important to note that many functions in disperseR are just slightly redesigned functions from these two packages. disperseR runs the HYSPLIT model many times and calculates the HYSPLIT Average Dispersion (HyADS) exposure metric. The results can then be aggregated to ZIP code level to create national estimates of exposure from various sources. disperseR also includes functions that make it easy for the user to plot the results.
Thanks to the hyspdisp package, for example, plumes from several power plants can be tracked for many days and their cumulative impacts estimated. disperseR leverages the hyspdisp package and gives the user a friendlier way to interact with it.
disperseR is a new version of the hyspdisp package. What has been improved?
Input data manipulation is handled at the package level. The user only has to read the data in using the disperseR::get_data() function. We show how to do it in the main vignette.
We also created additional vignettes should the user want to see how the attached data was preprocessed. We show every single step of preprocessing starting from the step of data download. This is key for reproducible research.
A very clear project structure and automation keep the user from getting lost in a maze of folders. The disperseR::create_dirs() function automatically creates the whole project structure, either in the specified location or on the desktop. The function also assigns the path to each folder to a variable in the R environment. These paths are then used by other disperseR functions. Note that disperseR::create_dirs() does not overwrite the project folders if they already exist in the specified location.
Until now, the units data for different years were kept separately and only four years of data were available with the package. Data for the years 1995 to 2015 have now been added and aggregated into one data file called units, attached to disperseR.
The ZIP code linkage procedure requires a ZCTA-to-ZIP code crosswalk file. These crosswalk data have also been attached to the package. They not only provide the crosswalk between ZCTA and ZIP but also contain information about population sizes.
Previously, the user could only run the analysis for one year at a time. disperseR allows all the needed years to be processed together.
Graph functions now have many automatic features.
Documentation has been much improved. The ?FUNCTION syntax should work to access help files.
We know it is sometimes difficult to start working with a new package, especially if you are not very familiar with R. We also believe in reproducible research. This is why we have included several vignettes to help you with the process.
Unfortunately, disperseR requires a lot of data to run the models, and we could not include all the data sets with the package. For example, the ZCTA shapefile is more than 140 MB. You can access it very simply with the help of the disperseR::get_data() function. Here, however, are the data that are attached:
crosswalk: The ZIP code linkage procedure requires a ZCTA-to-ZIP code crosswalk file. ZCTAs are not exact geographic matches to ZIP codes, and multiple groups compile and maintain crosswalk files. We used the crosswalk maintained by UDS Mapper and preprocessed it, also including information about population size. While not necessary for the HYSPLIT model or the processing of its outputs, population-weighted exposure metrics allow for direct comparisons between power plants. If you would like to know more details about how this crosswalk was prepared, we have attached a vignette that explains it. You can see it by clicking here.
PP.units.monthly1995_2017: The disperseR package also includes monthly power plant emissions, load, and heat input data. We currently do not have a vignette for these data due to server problems of the data owner. This will be updated as soon as possible.
units (data for 1995-2015): This package contains annual emissions and stack height data from EPA’s Air Markets Program Data and the Energy Information Administration. Again, if you would like to know how these data were prepared, please see the special vignette that we have attached to this package. You can see it by clicking here.
zipcode coordinate data: The disperseR package contains a data set with coordinates of ZIP codes. This might be useful for plotting, but it is not necessary as it will be used automatically by our plotting functions where required. Please click here for more information.
disperseR has functions that let you plot your results. Here is just one of many examples.
First, not having the Rcpp package installed on your computer can lead to problems with disperseR installation (problems with version installation). We recommend you first type the following into your R console.
install.packages("Rcpp")
Please note: If you are using a Windows machine and you want R to render the vignettes for you, you will need to download Rtools from here. If you prefer to avoid this step, you can go ahead and proceed with the installation, as we have added links to access already rendered vignettes on GitHub.
Continue by typing the following in your R console. This will download the package from GitHub, install it and build the vignettes. This might take some minutes.
devtools::install_github("lhenneman/disperseR", force = TRUE, build_vignettes = TRUE)
Load disperseR into your R session.
library(disperseR)
The audiracmichelle/disperser image has RStudio and all the R and Unix dependencies already installed to run disperseR quickly and reliably. The image is based on the rocker project (https://www.rocker-project.org/).
More information on the disperseR Docker image can be found on its DockerHub site https://hub.docker.com/r/audiracmichelle/disperser or in its GitHub repository https://github.com/audiracmichelle/docker_disperser.
You should be able to open the main vignette like this. It will open in RStudio.
vignette("Vignette_DisperseR")
The rest of the vignettes can be accessed by typing the corresponding commands.
vignette("Vignette_Crosswalk_Preparation")
vignette("Vignette_Load_Data_One_by_One")
vignette("Vignette_Units_Preparation")
vignette("Vignette_Zip_Code_Coordinate_Data_Preparation")
vignette("Vignette_Planetary_Layers_Data_Preparation")
vignette("Vignette_ZCTA_Shapefile_Preparation")
NOTE: If this does not work, we have rendered all the vignettes for you, and you can access them from your browser by clicking the corresponding hyperlinks in the "Vignettes attached with the package" section above.
The vignettes will instruct you to do so, but you can already start by creating the project folder. Use the disperseR::create_dirs() function to do so. Point disperseR to the location where you want your project to be created. For example, the following code will create the project in the user’s Dropbox. If you do not specify the location and just type disperseR::create_dirs(), it will still work and the project will be created on your desktop.
disperseR::create_dirs(location = "/Users/username/Dropbox")
This will set up the following folders and the paths to them:
main: the main folder where the project will be located.
  input: the input that we need for calculations.
    zcta_500k: ZCTA (ZIP Code Tabulation Area) shape files
    hpbl: monthly global planetary boundary layer files.
    meteo: (reanalysis) meteorology files
  output
    hysplit: disperseR output (one file for each emissions event)
    ziplink: files containing ZIP code linkages
    rdata: RData files containing HyADS source-receptor matrices
    exp: exposure per ZIP code data
    graph: graphs saved here as pdf when running functions
  process: temporary files that are created when the model is running and then deleted
Here is a screen shot of what it should look like:
And these are the variables with paths that will appear in your environment.
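The "create only if missing" behaviour described for disperseR::create_dirs() can be illustrated with a few lines of base R. This is only a sketch under our own assumptions, not the package's implementation: the helper ensure_dir() and the variable names main_dir and input_dir are hypothetical (meteo_dir is the name used elsewhere in this document), and the folders are created in a temporary location rather than on the desktop.

```r
# Hypothetical helper, NOT the disperseR implementation:
# create a folder only if it is missing, then return its path.
ensure_dir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  path
}

# Build a small part of the project tree in a temporary location
# and keep the paths in variables, as create_dirs() is described to do.
main_dir  <- ensure_dir(file.path(tempdir(), "main"))
input_dir <- ensure_dir(file.path(main_dir, "input"))
meteo_dir <- ensure_dir(file.path(input_dir, "meteo"))

# Put a file in the folder, then call the helper on it again.
writeLines("placeholder", file.path(meteo_dir, "keep.txt"))
meteo_dir <- ensure_dir(meteo_dir)

# The existing folder and its contents are left untouched.
file.exists(file.path(meteo_dir, "keep.txt"))
```

Checking with dir.exists() before creating is what makes the second call safe to repeat, which is the property the package documentation promises for existing project folders.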
You can get most of the data required for the analysis by using the following function. This function will download the data necessary and for the data that is already attached with the package it will automatically assign it to variables in your R environment. If you want to load the data step by step check our vignette here. It also contains more information about the data and their sources.
The arguments start.year, start.month, end.year, and end.month are necessary to download the meteorology reanalysis files. They will be downloaded if they are not already in the meteo_dir folder. The reanalysis met files are about 120 MB each.
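To get a feel for how much data a given range implies, the months covered by the four arguments can be listed in base R. This is a rough sketch that assumes one reanalysis file per month (our assumption, not stated by the package) and uses a range that crosses a year boundary; it does not call disperseR and does not reproduce the actual file names.

```r
# Range given as strings, in the same form the get_data() arguments take.
start.year <- "2005"; start.month <- "11"
end.year   <- "2006"; end.month   <- "02"

# Turn the first day of the start and end months into dates.
first <- as.Date(paste(start.year, start.month, "01", sep = "-"))
last  <- as.Date(paste(end.year, end.month, "01", sep = "-"))

# One entry per month in the range, including both end points.
months <- format(seq(first, last, by = "month"), "%Y-%m")
months

# Rough download size, ASSUMING one file of about 120 MB per month.
approx_mb <- length(months) * 120
approx_mb
```

For this range the sketch lists four months, so roughly 480 MB would be needed if none of the files were already present.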
If, for example, you want to download files for January-March 2005, you just have to use the get_data() function and set data = "all", start.year = "2005", start.month = "01", end.year = "2005", and end.month = "03". See below.
disperseR::get_data(data = "all",
start.year = "2005",
start.month = "01",
end.year="2005",
end.month="03")
If it runs correctly you should see the following in your R environment.
The units data should be loaded separately so that you are able to select which units to process.
This package contains annual emissions and stack height data from EPA’s Air Markets Program Data and the Energy Information Administration for years 2003-2012. Again, if you would like to know how these data were prepared, please see the special vignette that we have attached to this package. Access it here.
You can visualize the data like this in RStudio:
View(disperseR::units)
Please note: If you decide to use a specific unit for many years, you must have a row of data for each year. For example, this is our data from the main vignette. Look at row 1 and row 3. They contain data for the same unit but a different year.
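The one-row-per-year requirement can be checked before running the model. The sketch below uses a toy data frame in place of disperseR::units; the column names ID and year and the helper missing_years() are assumptions made for illustration and may not match the attached data.

```r
# Toy stand-in for the units data; column names ID and year are ASSUMED.
# Rows 1 and 3 describe the same unit in different years.
units_toy <- data.frame(
  ID   = c("A-1", "B-2", "A-1"),
  year = c(2005, 2005, 2006)
)

# Hypothetical helper: which of the requested years have no row for this unit?
missing_years <- function(units, id, years) {
  have <- units$year[units$ID == id]
  setdiff(years, have)
}

# Unit A-1 has a row for both years, so nothing is reported.
missing_years(units_toy, "A-1", 2005:2006)

# Unit B-2 has no row for 2006, so that year is reported.
missing_years(units_toy, "B-2", 2005:2006)
```

An empty result means the unit can be used for every requested year; any year that is returned needs its own row before the unit is processed.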
We suggest you have a look at our main vignette here for details about the analysis.
Graphical output is automatically saved to the graph_dir by the plotting functions.
NCEP Reanalysis data provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, from their Web site at https://www.esrl.noaa.gov/psd/ Mesinger, F., G. DiMego, E. Kalnay, K. Mitchell, P.C. Shafran, W. Ebisuzaki, D. Jović, J. Woollen, E. Rogers, E.H. Berbery, M.B. Ek, Y. Fan, R. Grumbine, W. Higgins, H. Li, Y. Lin, G. Manikin, D. Parrish, and W. Shi, 2006: North American Regional Reanalysis. Bull. Amer. Meteor. Soc., 87, 343–360, https://doi.org/10.1175/BAMS-87-3-343