knitr::opts_chunk$set(echo = TRUE)
Let's start exploring AOD data with the MazamaSatelliteUtils package. First we'll give ourselves a case study to focus on, the recent Milepost 97 Fire. This incident was reported on July 24, 2019 about 1 mile southeast of Canyonville, Oregon and was still current as of August 14, 2019.
To begin our analysis, let's load the MazamaSatelliteUtils package and define
a SatelliteDataDir
: A directory where all of our NetCDF files will be stored
to and read from:
library(MazamaSatelliteUtils) setSatelliteDataDir("~/Data/Satellite")
We can now start loading in satellite data using the goesaodc_downloadAOD()
function. For this we have to provide a datetime to specify which files we
want. If the datetime is defined to the hour, then all of the NetCDF files
generated in that hour will be downloaded. If just the date is given with no
time specified, then all files generated during that entire day will be fetched.
In either case, any downloaded files will be stored in the SatelliteDataDir
we
just defined.
Since downloading many NetCDF files can take a bit of time, for now let's only gather data for August 1, 2019 at 17:00:00 UTC (10AM PDT).
(Note: This example uses data from GOES-East because no data is available prior to August 28, 2019 from GOES-West.)
datetime <- lubridate::ymd_h("2019-08-01 17", tz = "UTC") downloadedFiles <- goesaodc_downloadAOD("G16", datetime)
The goesaodc_downloadAOD()
function also returns a vector of paths to each
downloaded file. However, if a file already exists in the SatelliteDataDir
directory then it will not be downloaded again and will not show up in the
returned vector. We can see which files are now availible to read by calling
goesaodc_listFiles()
on the datetime we specified:
dateFiles <- goesaodc_listFiles("G16", datetime) print(dateFiles)
In order to read the contents of a NetCDF file we will need a handle to access
it by. This can be done by calling goesaodc_openFile()
on the dataset
filename, so let's open the first one in our downloaded time range
dateFiles[1]
:
nc <- goesaodc_openFile(filename = dateFiles[1])
Now that we have a handle to a netCDF file we can freely access readings from
any of the variables in the dataset (listed by names(nc$var)
) using the
ncvar_get()
function from the ncdf4
package.
readings_aod <- as.numeric(ncdf4::ncvar_get(nc, "AOD")) print(length(readings_aod)) print(range(readings_aod, na.rm = TRUE))
For our purposes of mapping AOD values, we can consolidate all the data we need (longitude, latitude, and AOD) and make it even easier to access by forming it into a tibble. The "DQF" variable will be included as well, which represents the data quality flag for each reading. This flag can take one of the four following values:
tb <- goesaodc_createTibble(nc) knitr::kable(head(tb))
We can now finally start visualizing this data on geographic maps by creating and plotting spatial points from the AOD readings. Each spatial point represents a single reading from the ABI's scan, creating a dense grid of measurements taken about 4km from each other. Let's plot all of the points for our datetime now on a world map:
sp <- goesaodc_createSpatialPoints(nc, dqfLevel = 2) maps::map("world") goesaodc_plotSpatialPoints(sp, add = TRUE, cex = 0.1)
As you can see, only the United States and Mexico have readings defined. The GOES-16 satellite can't scan beyond the disk visible below its geosynchrous orbit, so let's zoom in to just the U.S.:
maps::map(database = "state") goesaodc_plotSpatialPoints(sp, add = TRUE, cex = 0.3)
Now that we have a clearer image of the points we can start to understand the
AOD values themselves. The goesaodc_plotSpatialPoints()
function colors
points according to their AOD value by default, but you can switch to DQF
coloring by setting the var
parameter to "DQF". Additionally, when creating
your spatial points you can also define the dqfLevel
which subsets all points
that are at and below the given level. Let's look at the points at level 0
(highest quality):
sp_dqf0 <- goesaodc_createSpatialPoints(nc, dqfLevel = 0) maps::map("state") goesaodc_plotSpatialPoints(sp_dqf0, cex = 0.3, add = TRUE)
As you can see, quite a bit of data has been removed since a good portion was
marked as only medium or low quality. At this point let's zoom in all the way
to Oregon to check for signs of the Milepost 97 Fire (including all quality
levels). We'll need to use the MazamaSpatialUtils
package to get the spatial
data for the state and cull out-of-bounds points:
library(MazamaSpatialUtils) setSpatialDataDir("~/Data/Spatial") loadSpatialData("USCensusStates") oregon <- subset(USCensusStates, stateCode == "OR") bbox_oregon <- sp::bbox(oregon) sp_oregon <- goesaodc_createSpatialPoints(nc, bbox = bbox_oregon, dqfLevel = 2) lon_mp97 <- -123.268 lat_mp97 <- 42.913 plot(oregon) goesaodc_plotSpatialPoints(sp_oregon, cex = 0.3, add = TRUE) points(x = lon_mp97, y = lat_mp97, pch = 0, cex = 4.0, lwd = 1.5, col = "darkgreen")
Now we can start to make out the the scan pattern made by the ABI, which forms a slightly tilted grid of points.
A raster image divides geographic space into uniform pixels or "cells". A grid is placed over a map and the value of each cell is calculated by aggregating the those of all the spatial points that fall in that region. For example, if 20 points are contained within one grid space, then all their AOD values could be averaged together to determine the resulting AOD of the raster cell. The uniform grid size is defined by the resolution of the raster (in lon/lat degrees) such that higher resolutions reduce cell area and increase the detail and accuracy of the final image.
Rasters provide the benefit of filling in undefined spaces around spatial points while also making it easier to track locations that the ABI might not have made spatial points for.
rstr <- goesaodc_createRaster(nc, bbox = bbox_oregon, res = 0.1, dqfLevel = 2) pal_aod <- colorRampPalette(c("lightgoldenrod1", "red3")) pal_count <- colorRampPalette(c("powderblue", "dodgerblue4")) raster::plot(rstr$AOD, col = pal_aod(50)) plot(oregon, add = TRUE) points(x = lon_mp97, y = lat_mp97, pch = 0, cex = 4.0, lwd = 1.5, col = "darkgreen")
The resolution of the raster is defined by coordinate degrees, so that by
setting res
to 0.1 every cell will be 0.1 longitudial degrees "wide" and 0.1
latitudial degrees "tall". However, cell dimensions in meters are not consistent
due to how longitude and latitude lines are curved around the Earth. We can
approximate the average cell size though by finding the average area of each and
then taking the square root of the slightly rectangular regions. The
raster::area()
function estimates the area of each cell in km²:
avgCellArea <- mean(raster::values(raster::area(rstr)))
Just for a sanity check, we can make a rough guess of the size of Oregon
state using the average cell area of r avgCellArea
km² and the number of
cells in the bounding box (r dim(rstr)[1] * dim(rstr)[2]
). This yields
r avgCellArea * dim(rstr)[1] * dim(rstr)[2]
km², which isn't to far off from
the offical state area of 255,030 km².
The square root of the mean cell area is r sqrt(avgCellArea)
. So for a raster
of Oregon with a resolution of 0.1, each cell is about r sqrt(avgCellArea)
km
on each side.
By zooming in even closer to MilePost 97 we can actually see how these cells are defined by the spatial points they contain. Here, cells are colored by the number of points that fall within them:
bbox_mp97 <- c(-124, -123, 42.5, 43.5) pointCoords_mb97 <- dplyr::filter(tb, lon > bbox_mp97[1], lon < bbox_mp97[2], lat > bbox_mp97[3], lat < bbox_mp97[4]) rstrCount <- goesaodc_createRaster(nc, bbox = bbox_oregon, res = 0.1, fun = "count", dqfLevel = 2) raster::plot(rstrCount$AOD, main = "Cell Point Count", col = pal_count(50), xlim = bbox_mp97[1:2], ylim = bbox_mp97[3:4]) plot(oregon, add = TRUE) points(pointCoords_mb97$lon, pointCoords_mb97$lat, pch = 16, cex = 0.3) points(lon_mp97, lat_mp97, pch = 3, cex = 1, col = "red")
Given that the average distance between spatial points is about 4km, we can confirm our cell size estimation since these appear to be 2 points wide (8km) and about 3 tall (12km).
So far we have only looked at a single instant from our original datetime. There are still 11 other datasets from 16:00:00 that we can analyze like we did this one, but it's also possible to start combining data into either aggregated rasters or time series plots. In both cases we first create a stack of rasters by, again, defining an hour or date of interest. Let's stay with our current hour:
rstrStack <- goesaodc_createRasterStack( satID = "G16", datetime = datetime, bbox = bbox_oregon, dqfLevel = 3, res = 0.08)
This raster stack now contains individual layers for each downloaded dataset. We
can visualize them all at once using the rasterVis
package:
rasterVis::levelplot(rstrStack)
Additionally, one can aggregate all of the rasters together to make one summarizing image, such as finding the mean AOD value for every pixel between scans.
rstrStackAvg <- raster::mean(rstrStack, na.rm = TRUE) raster::plot(rstrStackAvg, col = pal_aod(50)) plot(oregon, add = TRUE) points(x = lon_mp97, y = lat_mp97, pch = 0, cex = 4.0, lwd = 1.5, col = "darkgreen")
Finally, the raster_createLocationTimeseries()
function allows us to track
AOD values of a specific location in each layer of a stack, resulting in a
tibble of AOD values and their timestamps. There are a couple of ways of
extracting this location data, however. The default method is to track the one
cell that the given coordinates fall into, chosen by setting the method
parameter to "simple". Additionally, one can set method
to "bilinear", and
track the AOD interpolated between the 4 closest cells.
datetimeLocal <- datetime attributes(datetimeLocal)$tzone = "America/Los_Angeles" ts_simple <- raster_createLocationTimeseries(rasterStack = rstrStack, longitude = lon_mp97, latitude = lat_mp97, bbox = bbox_oregon, method = "simple") par(mfrow=c(1, 2)) raster::plot(rstrStackAvg, col = pal_aod(50), xlim = c(-125, -122), ylim = c(42, 44), main = "Milepost 97 AOD", xlab = "Longitude", ylab = "Latitude") plot(oregon, add = TRUE) points(x = lon_mp97, y = lat_mp97, pch = 3, lwd = 2) plot(x = ts_simple$datetime, y = ts_simple$aod, pch = 16, cex = 1, main = paste(datetimeLocal, "PDT"), xlab = "Time (local)", ylab = "AOD")
Aside from the two method
options, we can instead define a buffer
, which is
a circular area around our location that will aggregate the values of cells that
fall within it. Provide the buffer
radius in meters and a fun
aggregation
method such as mean, min, or max. fun
is set to mean by default.
ts_buffer <- raster_createLocationTimeseries(rasterStack = rstrStack, longitude = lon_mp97, latitude = lat_mp97, bbox = bbox_oregon, buffer = 20000, fun = mean) par(mfrow=c(1, 2)) raster::plot(rstrStackAvg, col = pal_aod(50), xlim = c(-125, -122), ylim = c(42, 44), main = "Milepost 97 AOD", xlab = "Longitude", ylab = "Latitude") plot(oregon, add = TRUE) points(x = lon_mp97, y = lat_mp97, pch = 1, lwd = 0.5, cex = 5) points(x = lon_mp97, y = lat_mp97, pch = 3, lwd = 2) plot(x = ts_buffer$datetime, y = ts_buffer$aod, pch = 16, cex = 1, main = paste(datetimeLocal, "PDT"), xlab = "Time (local)", ylab = "AOD")
A time series plot with only 12 points isn't especially useful, though it can help show the trend of the AOD values over a portion of a day. We are currently building a function that can create a raster stack for every daylight hour for a given date, which has provided some very interesting insight so far!
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.