In MazamaScience/MazamaSatelliteUtils: Utilities for Working with Model and Satellite Data as Rasters

knitr::opts_chunk$set(echo = TRUE)

MazamaSatelliteUtils Setup

Let's start exploring AOD data with the MazamaSatelliteUtils package. First we'll give ourselves a case study to focus on, the recent Milepost 97 Fire. This incident was reported on July 24, 2019 about 1 mile southeast of Canyonville, Oregon and was still current as of August 14, 2019.

To begin our analysis, let's load the MazamaSatelliteUtils package and define a SatelliteDataDir: A directory where all of our NetCDF files will be stored to and read from:

library(MazamaSatelliteUtils)
setSatelliteDataDir("~/Data/Satellite")

Downloading Data and Creating NetCDF Handles

We can now start loading in satellite data using the goesaodc_downloadAOD() function. For this we have to provide a datetime to specify which files we want. If the datetime is defined to the hour, then all of the NetCDF files generated in that hour will be downloaded. If just the date is given with no time specified, then all files generated during that entire day will be fetched. In either case, any downloaded files will be stored in the SatelliteDataDir we just defined.

Since downloading many NetCDF files can take a bit of time, for now let's only gather data for August 1, 2019 at 17:00:00 UTC (10AM PDT).

(Note: This example uses data from GOES-East because no data is available prior to August 28, 2019 from GOES-West.)

datetime <- lubridate::ymd_h("2019-08-01 17", tz = "UTC")
downloadedFiles <- goesaodc_downloadAOD("G16", datetime)

The goesaodc_downloadAOD() function also returns a vector of paths to each downloaded file. However, if a file already exists in the SatelliteDataDir directory then it will not be downloaded again and will not show up in the returned vector. We can see which files are now availible to read by calling goesaodc_listFiles() on the datetime we specified:

dateFiles <- goesaodc_listFiles("G16", datetime)
print(dateFiles)

In order to read the contents of a NetCDF file we will need a handle to access it by. This can be done by calling goesaodc_openFile() on the dataset filename, so let's open the first one in our downloaded time range dateFiles[1]:

nc <- goesaodc_openFile(filename = dateFiles[1])

Reading Through NetCDF Handles

Now that we have a handle to a netCDF file we can freely access readings from any of the variables in the dataset (listed by names(nc$var)) using the ncvar_get() function from the ncdf4 package.

readings_aod <- as.numeric(ncdf4::ncvar_get(nc, "AOD"))
print(length(readings_aod))
print(range(readings_aod, na.rm = TRUE))

AOD and DQF Data as a Tibble

For our purposes of mapping AOD values, we can consolidate all the data we need (longitude, latitude, and AOD) and make it even easier to access by forming it into a tibble. The "DQF" variable will be included as well, which represents the data quality flag for each reading. This flag can take one of the four following values:

0 - High quality
1 - Medium quality
2 - Low quality
3 - No retrieval flag source p.295:

tb <- goesaodc_createTibble(nc)
knitr::kable(head(tb))

Working With Data as Spatial Points

We can now finally start visualizing this data on geographic maps by creating and plotting spatial points from the AOD readings. Each spatial point represents a single reading from the ABI's scan, creating a dense grid of measurements taken about 4km from each other. Let's plot all of the points for our datetime now on a world map:

sp <- goesaodc_createSpatialPoints(nc, dqfLevel = 2)

maps::map("world")
goesaodc_plotSpatialPoints(sp, add = TRUE, cex = 0.1)

As you can see, only the United States and Mexico have readings defined. The GOES-16 satellite can't scan beyond the disk visible below its geosynchrous orbit, so let's zoom in to just the U.S.:

maps::map(database = "state")
goesaodc_plotSpatialPoints(sp, add = TRUE, cex = 0.3)

Now that we have a clearer image of the points we can start to understand the AOD values themselves. The goesaodc_plotSpatialPoints() function colors points according to their AOD value by default, but you can switch to DQF coloring by setting the var parameter to "DQF". Additionally, when creating your spatial points you can also define the dqfLevel which subsets all points that are at and below the given level. Let's look at the points at level 0 (highest quality):

sp_dqf0 <- goesaodc_createSpatialPoints(nc, dqfLevel = 0)

maps::map("state")
goesaodc_plotSpatialPoints(sp_dqf0, cex = 0.3, add = TRUE)

As you can see, quite a bit of data has been removed since a good portion was marked as only medium or low quality. At this point let's zoom in all the way to Oregon to check for signs of the Milepost 97 Fire (including all quality levels). We'll need to use the MazamaSpatialUtils package to get the spatial data for the state and cull out-of-bounds points:

library(MazamaSpatialUtils)
setSpatialDataDir("~/Data/Spatial")
loadSpatialData("USCensusStates")

oregon <- subset(USCensusStates, stateCode == "OR")
bbox_oregon <- sp::bbox(oregon)
sp_oregon <- goesaodc_createSpatialPoints(nc, bbox = bbox_oregon, dqfLevel = 2)

lon_mp97 <- -123.268
lat_mp97 <- 42.913

plot(oregon)
goesaodc_plotSpatialPoints(sp_oregon, cex = 0.3, add = TRUE)
points(x = lon_mp97, y = lat_mp97, pch = 0, cex = 4.0, lwd = 1.5, 
       col = "darkgreen")

Now we can start to make out the the scan pattern made by the ABI, which forms a slightly tilted grid of points.

Rasterizing Spatial Points

A raster image divides geographic space into uniform pixels or "cells". A grid is placed over a map and the value of each cell is calculated by aggregating the those of all the spatial points that fall in that region. For example, if 20 points are contained within one grid space, then all their AOD values could be averaged together to determine the resulting AOD of the raster cell. The uniform grid size is defined by the resolution of the raster (in lon/lat degrees) such that higher resolutions reduce cell area and increase the detail and accuracy of the final image.

Rasters provide the benefit of filling in undefined spaces around spatial points while also making it easier to track locations that the ABI might not have made spatial points for.

rstr <- goesaodc_createRaster(nc, bbox = bbox_oregon, res = 0.1, dqfLevel = 2)
pal_aod <- colorRampPalette(c("lightgoldenrod1", "red3"))
pal_count <- colorRampPalette(c("powderblue", "dodgerblue4"))

raster::plot(rstr$AOD, col = pal_aod(50))
plot(oregon, add = TRUE)
points(x = lon_mp97, y = lat_mp97, pch = 0, cex = 4.0, lwd = 1.5, 
       col = "darkgreen")

The resolution of the raster is defined by coordinate degrees, so that by setting res to 0.1 every cell will be 0.1 longitudial degrees "wide" and 0.1 latitudial degrees "tall". However, cell dimensions in meters are not consistent due to how longitude and latitude lines are curved around the Earth. We can approximate the average cell size though by finding the average area of each and then taking the square root of the slightly rectangular regions. The raster::area() function estimates the area of each cell in km²:

avgCellArea <- mean(raster::values(raster::area(rstr)))

Just for a sanity check, we can make a rough guess of the size of Oregon state using the average cell area of r avgCellArea km² and the number of cells in the bounding box (r dim(rstr)[1] * dim(rstr)[2]). This yields r avgCellArea * dim(rstr)[1] * dim(rstr)[2] km², which isn't to far off from the offical state area of 255,030 km².

The square root of the mean cell area is r sqrt(avgCellArea). So for a raster of Oregon with a resolution of 0.1, each cell is about r sqrt(avgCellArea) km on each side.

By zooming in even closer to MilePost 97 we can actually see how these cells are defined by the spatial points they contain. Here, cells are colored by the number of points that fall within them:

bbox_mp97 <- c(-124, -123, 42.5, 43.5)
pointCoords_mb97 <- dplyr::filter(tb, 
                                  lon > bbox_mp97[1], lon < bbox_mp97[2], 
                                  lat > bbox_mp97[3], lat < bbox_mp97[4])

rstrCount <- goesaodc_createRaster(nc, bbox = bbox_oregon, res = 0.1, 
                                   fun = "count", dqfLevel = 2)
raster::plot(rstrCount$AOD, main = "Cell Point Count", col = pal_count(50),
             xlim = bbox_mp97[1:2], ylim = bbox_mp97[3:4])
plot(oregon, add = TRUE)
points(pointCoords_mb97$lon, pointCoords_mb97$lat, pch = 16, cex = 0.3)
points(lon_mp97, lat_mp97, pch = 3, cex = 1, col = "red")

Given that the average distance between spatial points is about 4km, we can confirm our cell size estimation since these appear to be 2 points wide (8km) and about 3 tall (12km).

AOD Over Time

Raster Stacks

So far we have only looked at a single instant from our original datetime. There are still 11 other datasets from 16:00:00 that we can analyze like we did this one, but it's also possible to start combining data into either aggregated rasters or time series plots. In both cases we first create a stack of rasters by, again, defining an hour or date of interest. Let's stay with our current hour:

rstrStack <- goesaodc_createRasterStack(
  satID = "G16", 
  datetime = datetime,
  bbox = bbox_oregon, 
  dqfLevel = 3,
  res = 0.08)

This raster stack now contains individual layers for each downloaded dataset. We can visualize them all at once using the rasterVis package:

rasterVis::levelplot(rstrStack)

Additionally, one can aggregate all of the rasters together to make one summarizing image, such as finding the mean AOD value for every pixel between scans.

rstrStackAvg <- raster::mean(rstrStack, na.rm = TRUE)
raster::plot(rstrStackAvg, col = pal_aod(50))
plot(oregon, add = TRUE)
points(x = lon_mp97, y = lat_mp97, pch = 0, cex = 4.0, lwd = 1.5, 
       col = "darkgreen")

Hourly Time Series Plots

Finally, the raster_createLocationTimeseries() function allows us to track AOD values of a specific location in each layer of a stack, resulting in a tibble of AOD values and their timestamps. There are a couple of ways of extracting this location data, however. The default method is to track the one cell that the given coordinates fall into, chosen by setting the method parameter to "simple". Additionally, one can set method to "bilinear", and track the AOD interpolated between the 4 closest cells.

datetimeLocal <- datetime
attributes(datetimeLocal)$tzone = "America/Los_Angeles"

ts_simple <- raster_createLocationTimeseries(rasterStack = rstrStack,
                                             longitude = lon_mp97, 
                                             latitude = lat_mp97, 
                                             bbox = bbox_oregon,
                                             method = "simple")

par(mfrow=c(1, 2))
raster::plot(rstrStackAvg, col = pal_aod(50), 
             xlim = c(-125, -122), ylim = c(42, 44),
             main = "Milepost 97 AOD", 
             xlab = "Longitude", ylab = "Latitude")
plot(oregon, add = TRUE)
points(x = lon_mp97, y = lat_mp97, pch = 3, lwd = 2)

plot(x = ts_simple$datetime, y = ts_simple$aod, pch = 16, cex = 1,
     main = paste(datetimeLocal, "PDT"), xlab = "Time (local)", ylab = "AOD")

Aside from the two method options, we can instead define a buffer, which is a circular area around our location that will aggregate the values of cells that fall within it. Provide the bufferradius in meters and a fun aggregation method such as mean, min, or max. fun is set to mean by default.

ts_buffer <- raster_createLocationTimeseries(rasterStack = rstrStack,
                                             longitude = lon_mp97,
                                             latitude = lat_mp97, 
                                             bbox = bbox_oregon,
                                             buffer = 20000,
                                             fun = mean)

par(mfrow=c(1, 2))
raster::plot(rstrStackAvg, col = pal_aod(50), 
             xlim = c(-125, -122), ylim = c(42, 44),
             main = "Milepost 97 AOD", 
             xlab = "Longitude", ylab = "Latitude")
plot(oregon, add = TRUE)
points(x = lon_mp97, y = lat_mp97, pch = 1, lwd = 0.5, cex = 5)
points(x = lon_mp97, y = lat_mp97, pch = 3, lwd = 2)

plot(x = ts_buffer$datetime, y = ts_buffer$aod, pch = 16, cex = 1,
     main = paste(datetimeLocal, "PDT"), xlab = "Time (local)", ylab = "AOD")

Daily Time Series Plots

A time series plot with only 12 points isn't especially useful, though it can help show the trend of the AOD values over a portion of a day. We are currently building a function that can create a raster stack for every daylight hour for a given date, which has provided some very interesting insight so far!

MazamaScience/MazamaSatelliteUtils documentation built on Dec. 17, 2021, 3:20 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

MazamaScience/MazamaSatelliteUtils
Utilities for Working with Model and Satellite Data as Rasters

In MazamaScience/MazamaSatelliteUtils: Utilities for Working with Model and Satellite Data as Rasters

MazamaSatelliteUtils Setup

Downloading Data and Creating NetCDF Handles

Reading Through NetCDF Handles

AOD and DQF Data as a Tibble

Working With Data as Spatial Points

Rasterizing Spatial Points

AOD Over Time

Raster Stacks

Hourly Time Series Plots

Daily Time Series Plots

R Package Documentation

Browse R Packages

We want your feedback!

MazamaScience/MazamaSatelliteUtils Utilities for Working with Model and Satellite Data as Rasters

In MazamaScience/MazamaSatelliteUtils: Utilities for Working with Model and Satellite Data as Rasters

MazamaSatelliteUtils Setup

Downloading Data and Creating NetCDF Handles

Reading Through NetCDF Handles

AOD and DQF Data as a Tibble

Working With Data as Spatial Points

Rasterizing Spatial Points

AOD Over Time

Raster Stacks

Hourly Time Series Plots

Daily Time Series Plots

R Package Documentation

Browse R Packages

We want your feedback!

MazamaScience/MazamaSatelliteUtils
Utilities for Working with Model and Satellite Data as Rasters