```r
knitr::opts_chunk$set(echo = TRUE)

# install the ohwobpg package if needed
installed <- rownames(installed.packages())
if (!("ohwobpg" %in% installed)) {
  remotes::install_github("BigelowLab/ohwobpg", quiet = TRUE)
}
```
The Ocean Biology Processing Group (OBPG) serves satellite data via OPeNDAP. This package provides simple tools in the R language for downloading subsets of global data files, and proposes a simple method for storing and managing the datasets.
This package demonstrates working with Level 3 data (simple global grids), specifically from the AQUA_MODIS instrument. The package may be adaptable to other products and instruments, but we haven't tried anything other than AQUA_MODIS Level 3 mapped images.
OBPG managers are migrating from an old-style file naming convention to a new style. Currently, only recently reprocessed data (SST) are served in the new style, so this code attempts to handle either convention seamlessly while navigating the system. Eventually, all products will be served in the new-style naming format, and we have kept that in mind when proposing a local storage and management system.
OBPG data naturally organize under a simple hierarchy: `<root>/region/yyyy/mmdd/files`. We find that allowing the end user to specify the `<root>/region` portion while automatically enforcing the remainder, `yyyy/mmdd/files`, works well. For example, suppose you are going to download daily SST and CHLOR_A from AQUA_MODIS covering the Gulf of Maine in 2018. We suggest that you create the root path as shown below - a simple directory in your home directory (but whatever works for you works for us).
```r
library(ohwobpg)
path <- "~/gom"
dir.create(path, recursive = TRUE)
```
Any data you subsequently download using this package will automatically create the required subdirectories. Below is an example from our own lab where `<root>/region` is `/mnt/ecocast/coredata/obpg2/nwa/AQUA_MODIS/L3m` and the automatically generated subdirectories, `yyyy/mmdd`, are `2018/0101`.
```
/mnt/ecocast/coredata/obpg2/nwa/AQUA_MODIS/L3m/2018/0101
├── AQUA_MODIS.20180101.L3m.16DR.CHL.chlor_a.4km.tif
├── AQUA_MODIS.20180101.L3m.32DR.CHL.chlor_a.4km.tif
├── AQUA_MODIS.20180101.L3m.8DR.CHL.chlor_a.4km.tif
├── AQUA_MODIS.20180101.L3m.8DR.PAR.par.4km.tif
├── AQUA_MODIS.20180101.L3m.8DR.PIC.pic.4km.tif
├── AQUA_MODIS.20180101.L3m.8DR.POC.poc.4km.tif
├── AQUA_MODIS.20180101.L3m.8DR.SST.sst.4km.tif
├── AQUA_MODIS.20180101.L3m.8DR.SST.sst_slope.4km.tif
├── AQUA_MODIS.20180101.L3m.DAY.CHL.chlor_a.4km.tif
├── AQUA_MODIS.20180101.L3m.DAY.CHL.chlor_a_cum.4km.tif
├── AQUA_MODIS.20180101.L3m.DAY.CHL.chlor_a_fill.4km.tif
├── AQUA_MODIS.20180101.L3m.DAY.PAR.par.4km.tif
├── AQUA_MODIS.20180101.L3m.DAY.PIC.pic.4km.tif
├── AQUA_MODIS.20180101.L3m.DAY.POC.poc.4km.tif
└── AQUA_MODIS.20180101.L3m.DAY.SST.sst.4km.tif
```
Let's download 2018 monthly CHLOR_A data at 9km resolution just for the Gulf of Maine region. First we build a series of URLs for the data using `obpg_build_url()`. The function has a number of arguments, but we'll focus on just what we need and accept the default values for the others. Complete documentation is available by typing `?obpg_build_url` at the console.
```r
library(ohwobpg)

# define our bounding box [west, east, south, north]
BB <- c(-72, -63, 39, 46)

# we need a sequence of dates
dates <- seq(from = as.Date("2018-01-01"),
             to = as.Date("2018-12-01"),
             by = "month")

# then we build the URLs
urls <- obpg_build_url(dates = dates,
                       param = "chlor_a",
                       suite = "CHL",
                       period = "MO",
                       res = "9km")
head(urls)
```
Now we'll open just the first NetCDF resource. From it we'll build a simple list of items we need to successfully navigate the remainder of the URLs. Then we can close the NetCDF resource.
```r
nc1 <- obpg_open(urls[1])
nav <- obpg_nc_nav(nc1,
                   bb = BB,
                   res = obpg_res(what = "9km"),
                   varname = "chlor_a")
obpg_close(nc1)
```
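If you are curious about what the navigation list holds, base R's `str()` gives a compact view of any list (the exact elements are whatever `obpg_nc_nav()` assembles and may vary by version):

```r
# a compact, one-level view of the navigation list
str(nav, max.level = 1)
```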
Now we simply iterate through the URLs, downloading each subset and storing it in our path.
```r
for (this_url in urls) {
  cat("fetching", basename(this_url), "\n")
  new_data <- obpg_fetch(this_url, nav, outpath = path)
}
```
Note We have downloaded a larger dataset of sst, par and chlor_a for the Gulf of Maine, which we will work with later. The script we used for downloading can be found here.
It is easy to create a database by first creating a list of files and then parsing it into the database format. We actually have that list of files in hand already in our URLs, but for the sake of example, let's do a listing by file search instead. Note that we use the pipe operator `%>%`, provided by the magrittr package and re-exported by dplyr, to pipe the output of one function into the input of the next. There are boatloads of tutorials on using dplyr available to you.
```r
library(dplyr, warn.conflicts = FALSE)

db <- list.files(path, pattern = glob2rx("*.tif"),
                 full.names = TRUE, recursive = TRUE) %>%
  as_database() %>%
  write_database(path)
db
```
The database is a very simple table (data frame) built from various elements of a filename. All of the parts could be computed as needed, which would make the file smaller to store on disk, but the ease of parsing and saving is worth the extra bit of disk space required. The OBPG filenames have all of the necessary information to uniquely identify each file - details can be found in the documentation, `?as_database`. For now, let's just print it out and look at it.
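If you prefer a column-by-column view over the default tibble printing, `dplyr::glimpse()` works well here (the exact columns are whatever `as_database()` parsed from the filenames):

```r
# one row per file; columns are parsed from the OBPG filenames
dplyr::glimpse(db)
```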
Note The nrt column refers to "near real time" data. The OBPG first publishes its data flagged as "nrt". Some time later (weeks? months?), after quality review and adjustments, the data are republished without the "nrt" flag. For this tutorial we'll ignore it, but one could use that flag to identify local files suitable for updating when OBPG republishes.
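For instance, here is a minimal sketch of how one might list the provisional records for a later refresh, assuming nrt is stored as a logical column in the database:

```r
# a sketch, assuming `nrt` is a logical column in the database
provisional <- db %>%
  dplyr::filter(nrt)  # records still flagged "near real time"
provisional
```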
Use the `read_database(path)` and `write_database(db, path)` functions for input and output.
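A quick sketch of the round trip, using the `db` and `path` objects from above:

```r
# save the database to disk ...
db <- write_database(db, path)
# ... and read it back in a later session
db <- read_database(path)
```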
The ohwobpg package ships with a slightly larger and more complex dataset, which saves each participant the need to download from the OBPG servers. The larger dataset includes:
- monthly sst, chlor_a and par data from 2018 in the Gulf of Maine
- daily sst data from August 2018 in the Gulf of Maine
The path must now be redefined, and then we can read in its associated database.
```r
path <- system.file("gom", package = "ohwobpg")
db <- read_database(path)
```
We can do a quick summary by counting the records by period and parameter.
```r
db %>%                      # start with the database
  dplyr::count(per, param)  # count instances first by period then by parameter
```
Note almost every function in R comes from a package, and it can be hard to remember where each one comes from. To help jog one's memory it can be helpful to prepend the package name to the function - for instance, instead of writing `count(...)` note that we wrote `dplyr::count(...)`. In this case there is no difference between the two, other than that it is easy to recall to which package `count()` belongs.
The database can be easily filtered to choose just the images needed; for this task we continue leveraging the tools in the dplyr package. Let's grab par monthly data between May and September of 2018. First we filter the database to a smaller subset, then convert it to a set of filenames, and finally load it into a raster stack.
```r
library(raster)

par_db <- db %>%
  dplyr::filter(param == "par" &
                  per == "MO" &
                  dplyr::between(date, as.Date("2018-05-15"), as.Date("2018-09-26")))
par_db
```
Using the filtered database we then read the subset of records into a raster stack of images. By default each layer's name is taken from the filename it came from, but that can make for really long names. We know that each layer is one month, so we will assign each a new name: "Jun", "Jul", "Aug", "Sep". You can learn more about formatting dates with `?strftime`. There are many raster tutorials available, and a handy cheatsheet.
```r
par <- par_db %>%                # start with the subset database
  as_filename(path = path) %>%   # build filenames and append to the path
  raster::stack()                # read them into a stack of images
names(par) <- format(par_db$date, "%b")
par
```
There are lots of ways to draw a raster. We show three simple ones in a brief tutorial.
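As the simplest possible sketch (base raster plotting, not necessarily one of the tutorial's three methods), `plot()` on a RasterStack draws one small panel per layer:

```r
# one panel per layer, titled with the layer names ("Jun", "Jul", ...)
raster::plot(par)
```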
Note If you clone the package to your local computer you can easily view these tutorials from within an RStudio session.
Extracting from a stack at a point, a patch of points, or a polygon is very straightforward. See this tutorial for an example.
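For a taste, here is a minimal sketch using `raster::extract()` at a single point; the longitude/latitude below is just an arbitrary spot inside our Gulf of Maine bounding box:

```r
# an arbitrary point in the Gulf of Maine as a (lon, lat) matrix
pt <- cbind(-68.5, 43.5)
# extract() returns one value per layer - a monthly time series at the point
raster::extract(par, pt)
```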
Creating a derived stack is easy with raster math - see this page for an example.
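As a small sketch of the idea, arithmetic operators work cell-wise on every layer, and `raster::calc()` can collapse a stack into a single derived layer:

```r
# cell-wise transform applied to every layer
log_par <- log10(par)
# collapse the stack to a single mean layer, ignoring missing cells
mean_par <- raster::calc(par, fun = mean, na.rm = TRUE)
```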