knitr::opts_chunk$set(echo = TRUE)

NOTE: Because the size of model output files, R code in this vignette is not run when the package is built. This document contains output from commands run on September 2, 2020.

WRF Output Size

The WRF model system is able to forecast a wide variety of meteorological variables across large portions of the Earth's surface. This can result in hefty output files that take up lots of memory. For instance, an AirFire WRF model run for the PNW-4km domain tracks 147 variables for over 100,000 sample points across the Pacific Northwest. Therefore, hourly snapshot files from these runs are nearly 350MB:

rawBytes <- file.size("~/Data/WRF/PNW-4km_2020090112_07.nc")
utils:::format.object_size(rawBytes, "auto", "SI")
[1] "349.6 MB"

Since AirFire usually forecasts about 84 hours ahead, a single model run can take up almost 30GB. The size of this output imposes a heavy IO and memory burden when working with WRF data, so it is important that we try reducing the size of these files as much as possible. One way we can do this is to subset the hourly forecast files to retain only the information we plan to use. We can also reduce memory usage by summarizing and compressing data. So in the case of WRF files, we have several options:

Let's look at the advantages of subsetting with an example.

Subsetting Example

Let's choose a WRF file -- such as hour 7 from the 2020-09-01 12pm run -- and implement the improvements we outlined. To start with, we will retain 21 variables of interest to the AirFire team, crop out regions like Canada and the Pacific Ocean, and rasterize the points onto a 0.06 degree grid:

library(AirFireWRF)
setWRFDataDir("~/Data/WRF")

modelName <- "PNW-4km"
modelRun <- "2020090112"
modelRunHour <- 7

subset <- wrf_load(
  modelName = modelName,
  modelRun = modelRun,
  modelRunHour = modelRunHour,
  vars = c("XLONG", "XLAT", "XLONG_U", "XLAT_U", "XLONG_V", "XLAT_V", "U", "V",
           "U10", "V10", "ZNU", "ZNW", "LU_INDEX", "Q2", "T", "T2", "TH2", 
           "HGT", "RAINNC", "CFRACT", "PBLH"),
  res = 0.06,
  xlim = c(-125, -105),
  ylim = c(39, 49)
)

It takes a few seconds to load, but let's see how much space we saved:

subsetBytes <- object.size(subset)
utils:::format.object_size(subsetBytes, "auto", "SI")
[1] "8 MB"

8 MB is a significant improvement from the original file's 349.6 MB! We can do even better when we save the raster object as a .rda file using xz compression:

fileName <- paste0(modelName, "_", modelRun, "_", stringr::str_pad(modelRunHour, 2, pad = "0"), ".rda")
filePath <- file.path(getWRFDataDir(), fileName)

save(subset, file = filePath, compress = "xz")

fileBytes <- file.size(filePath)
utils:::format.object_size(fileBytes, "auto", "SI")
[1] "1.9 MB"

So from 349.6 MB down to 1.9 MB? Not bad! This size is much more manageable for quickly loading and working with WRF model output, and it can easily be achieved using existing wrf_load() parameters and the base R save() function.



MazamaScience/AirFireWRF documentation built on Nov. 11, 2020, 3:32 a.m.