###opts_knit$set(stop_on_error = 2L)
installIfNeeded <- function(packages, ...) {
    installedPackages <- installed.packages()[, 1]
    toInstall <- setdiff(packages, installedPackages)
    if (length(toInstall) > 0) install.packages(toInstall, ...)
}
installIfNeeded(c("globalPM25",
                  "ggplot2"))
library(ggplot2)
library(globalPM25)

This package is developed for users to access and plot real-time air pollute PM2.5 level near you, at a city, at any location, or in a region around the world. The purpose is to increase overall awareness about the dangerous of air pollution to human health, particularly in some developing countries.

What is PM2.5 and why it is dangerous to you and your family's health

PM2.5 refers to atmospheric particulate matter (PM) that have a diameter less than 2.5 micrometers. PM2.5 can come from various sources. Many experts believe that two main sources are power plants and motor vehicles. The effects of inhaling PM that have been widely studied in humans and animals include asthma, lung cancer, cardiovascular disease, respiratory diseases, premature delivery, birth defects, and premature death. Please refer to the wikipedia page Particulate Matter about PM's health effects and more.

How is PM2.5 measured?

According to US EPA, PM2.5 can be both quantitatively measured by AQI (air quality index) and qualitatively measured by APL (air pollution level).

EPA PM2.5 standard (APL=Air Quality Index Levels of Health Concern, AQI=Numeric Value)

Please refer to US EPA official website airnow.gov for more information. In this package, both AQI and APL are provided.

Data source

The real-time PM2.5 data feed are from WAQ (World Air Quality) Project via JSON API. According to WAQ, "Real-time Air Quality information is available for than 9000 stations in 800 major cities from 70 countries, thanks to the huge effort from the world EPAs (Environmental Protection Agencies)."

To use globalPM25 package to access real-time PM2.5 data feed, users first need to request a personal token from WAQ website. Then the personal token can be set in the package using

settoken()

To request your personal token, please visit this link. For more about JSON API, please visit this link.

Functions and data overview

There are 10 functions/.R files in the package. Below are the names and a short description for each of them.

  1. settoken(token): set your personal token requested from WAQ website

  2. processPMdata(data): process raw data feed. extract information such as PM2.5 AQI, APL, local time, address, and geographical location into a dataframe

  3. getPMbyCityNames(citynames): query PM2.5 level by a city name or a vector of city names. this function will request real-time feed and process the feed by calling processPMdata. Output is organized in a dataframe.

  4. getPMbylatlon(latitude, longitude): similar to getPMbyCityNames but query by latitude and longitude instead of a city name

  5. getPMnearme(): similar to getPMbyCityNames but query by a station closest to you based on computer IP address.

  6. getPMinRegion(): similar to getPMbyCityNames but query by a list of stations in a region. The region can be defined in two ways: a) getPMinRegion(cityname, distance) a city name and a distance range. b) getPMinRegion(geobound = c(lat1, lon1, lat2, lon2)) a rectangular boundary defined by a pair of latitudes and longitudes at opposite corners.

  7. getPMatMajorWorldCities(): similar to getPMbyCityNames but the cities are pre-selected major cities around the world. the city names can be listed using getglobalPM25Options() and set using setworldcities(citynames) in global.R.

  8. getPM_Country(countryname, countrybound): similar to getPMinRegion using google map

  9. getPM_City(cityname, radius): similar to getPMinRegion using google map

  10. plotPMdat(): demo to plot some results

  11. zzz.R: use to suppress no global variable binding note.

  12. global.R: set access to some variables such as token string, WAQ URL string, etc.

There are 2 data sets, which are:

  1. usdat.rda: this contains PM2.5 data collected from all stations at continental USA using getPMinRegion(). the data are collected at 4pm PST, March 19, 2017

  2. worlddat.rda: this contains PM2.5 data collected from a list of large cities worldwide using getPMatMajorWorldCities(). the data are collected at 4pm PST, March 19, 2017

Example

Examples 1 and 2 use local data set. Example 3 would require a personal token and active internet connection to access real-time data feed. Please refer to Data source Section to request a personal token and set your token.

  1. plot the data from worlddat.rda
require(globalPM25)
data("worlddat")
apldesc = getglobalPM25Options()$apl_description
aplcolor = getglobalPM25Options()$apl_color
mapWorld <- ggplot2::borders("world", colour="gray50", fill="gray50") # create a layer of borders
worlddat$APL <- factor(worlddat$APL, levels = apldesc)
g <- ggplot(aes(x = lon, y = lat, colour = APL, size = pm25), data = na.omit(worlddat)) +   mapWorld
g <- g + geom_point() + scale_color_manual(values = aplcolor)
g <- g + ggtitle("PM2.5 data collected from stations at a list of selected cities worldwide; \n the data are sampled at 4pm PST, March 19, 2017")
g
g <- ggplot(aes(x = reorder(city, pm25), y = pm25, fill=APL), data = na.omit(worlddat)) +
    geom_bar(stat = "identity") + scale_fill_manual(values = aplcolor) +
    coord_flip()
g <- g + ggtitle("PM2.5 data collected from stations at a list of selected cities worldwide; \n the data are sampled at 4pm PST, March 19, 2017")
g
  1. plot the data from usdat.rda
data(usdat)
usdat$APL <- factor(usdat$APL, levels = apldesc)
mapUS <- ggplot2::borders("usa", colour="gray50", fill="gray50") # create a layer of borders
g <- ggplot(aes(x = lon, y = lat, colour = APL, size = pm25), data = usdat) +   mapUS
g <- g + geom_point() + scale_color_manual(values = aplcolor)
g <- g + ggtitle("PM2.5 data collected from stations at continental USA; \n the data are sampled at 4pm PST, March 19, 2017
")
g

The followings require a valid token and active internet connection to access server for real-time feed. 3.

getPMnearme() #get real-time PM2.5 level from a station cloest to you (base on your computer ip address)
getPMbyCityNames(c("newyork", "chicago", "san jose", "dallas", "los angeles"))
getPMinRegion("san jose", 100)
getPMatMajorWorldCities()

Conclusion and Future work

First of all, I would like to thank WAQ for making the data available. This package allows users to access real-time air pollute PM2.5 level in various ways: near the user, at any city, at any location, and/or in a region. Basic sorting and plot are implemented. The package was built using R studio with devtools and shared at github.

I plan to continue to maintain the package. Some future work can be: a) change to S6 or reference class, b) add more real-time air pollute parameters such O3, NO2, SO2, PM10 levels. that are available from WAQ server, c) add functions to save the data so that users can retrieve historical data.



jinhuali/globalPM25 documentation built on May 19, 2019, 7:16 a.m.