This short document is intended to introduce some workflows for running and analyzing multiple RavenR models in support of model ensembles. If you have not yet completed the Introduction to RavenR vignette or used RavenR before, it is recommended that you complete that vignette before this one.
This workflow will use some additional libraries that are not required for using RavenR as a standalone package. Please install the packages listed below if they are not already installed, and load the libraries into your session.
library(RavenR) library(plyr) library(dplyr) library(tidyr) library(ggplot2) library(lubridate) library(xts) library(dygraphs) library(htmlwidgets) knitr::opts_chunk$set(fig.width=10,fig.height=5) #default figure height
We begin by downloading the model files and spatial data for the Liard, which is located in the Northwest Territories, and is discussed further in the paper by Brown and Craig, 2020. The block below downloads the zip file to the C drive in a TEMP folder, but this may be adjusted to the folder location of your preference.
# download the Liard Raven model from http://raven.uwaterloo.ca/Downloads.html outdir <- "C:/TEMP/" download.file(url="http://raven.uwaterloo.ca/files/LiardRiverModel.zip", destfile=paste0(outdir,"LiardRiverModel.zip")) unzip(paste0(outdir,"LiardRiverModel.zip"), exdir=paste0(outdir,"LiardRiverModel")) # set working directory in new unzipped folder setwd(paste0(outdir,"LiardRiverModel"))
Now, we setup our Raven executable for use within R by declaring our path to the model files and Raven executable, and generate a Raven execution command. Following the general example in the Basic HTML Raven Workflow, we define a simple function that will run a Raven model, with inputs specific to the Liard model as the function defaults. In this case, we provide a path to the latest Raven executable in the C drive, which can be used to run models in other locations. We also provide an opportunity for additional commands to be provided to Raven with the addtlcmds
parameter.
Note that on operating different systems, this syntax and file paths may need to be slightly adjusted.
runRaven <- function(fileprefix="Liard", ravenfile="C:/Raven/exe/Raven.exe", runtag="", indir=getwd(), outdir=paste0(getwd(),"/output/"), addtlcmds="", showoutput=FALSE) { RavenCMD <-paste(ravenfile," ",indir,"/",fileprefix," -o ",outdir,sep=""); invisible(system(RavenCMD, show.output.on.console = showoutput)); }
In order to facilitate a fast run of the Liard model, open the Liard.rvi file and change the duration to 1095 instead of 7305 - we will run each Liard model for just three years instead of 20.
Once you have done that, try out the runRaven
command, and see that it functions as expected. Note that the defined function does not show the Raven output by default (with showoutput=FALSE
), but if you wish to see the Raven output in R, simply change this parameter to TRUE
.
runRaven(showoutput = TRUE)
Create a base model run output folder called 'output_base' by passing this as a parameter value to outdir.
runRaven(outdir=paste0(getwd(),"/output_base/"))
We will create a workflow to generate a simple parametric model ensemble of the Liard model, which samples the value of the parameter OW_PET_CORR.
Question: Check the Raven manual for a description of the parameter OW_PET_CORR. How would you expect the parameter to impact the models?
To do this, start by undertaking the following steps (either manually or using the R script below).
The Liard.rvp.tpl file will serve as a template file, in which the placeholder value in the template file (par_x01) can be overwritten with a provided parameter value to be used in simulation. Note that in this particular case the file only has one value of 0.8, thus we can use the string substituion directly, but for other parameters or values we may need to define a more clever string replacecment (or perform this operation manually).
gsub(pattern="0.8,", replacement="par_x01,", x=readLines("Liard.rvp")) %>% writeLines(., "Liard.rvp.tpl")
Now, we will run our Raven model N times with uniformly sampled random values for the parameter OW_PET_CORR
, and store the outputs in separate folders. In the workflow below, the gsub
function is used to substitute the par_x01
string within the template file with a provided parameter value; the updated file is then provided as a regular rvp file for Raven.
To keep our folder tidy, we begin by creating a parent folder to store the output folders of each run.
# setup the parametric model run if (!(dir.exists("output_parametric"))) { dir.create("output_parametric") } N <- 20 # number of runs to generate min_OW_PET_CORR <- 0 # minimum parameter value max_OW_PET_CORR <- 3 # maximum parameter value # determine random values to use set.seed(20201114) values_OW_PET_CORR <- runif(n=N, min=min_OW_PET_CORR, max=max_OW_PET_CORR )
# run Raven N times to generate model ensembles for (i in 1:N) { # create a new Liard.rvp file with random parameter value gsub(pattern="par_x01", replacement=sprintf("%.4f", values_OW_PET_CORR[i]), x=readLines("Liard.rvp.tpl")) %>% writeLines(., "Liard.rvp") # build output path outpath <- paste0(getwd(),sprintf("/output_parametric/output_%02d",i)) # run Raven model and store output runRaven(outdir=outpath) }
With the model runs created, we now read in the hydrographs, and plot them as a distribution of model results. To begin, we read in the hydrographs from each of the N simulations. The Liard model is a distributed model, but for our purposes here, the hydrographs at the model outlet at subbasin 63 are collected.
nrow_timesteps <- 1096 # three years of simulation sub_to_extract <- "SUB_63" # subbasin outlet # setup data structure for hydraographs hyd_df <- data.frame(matrix(NA,nrow=nrow_timesteps,ncol=(N+1))) colnames(hyd_df) <- c("Date",sprintf("hyd_%02d",seq(1,N))) for (i in 1:N) { #read in the hydrograph of sub_63 for all of the ensemble runs, store in the hyd_df dataframe myhyd <- rvn_hyd_read(ff=sprintf("./output_parametric/output_%02d/Hydrographs.csv",i)) hyd_df[,(i+1)] <- as.numeric(myhyd$hyd[,sub_to_extract]) } # collect date column hyd_df[,1] <- rvn_fortify_xts(myhyd$hyd)$Date # convert the dataframe to an xts (extended time series) data format hyd_df_xts <- xts(hyd_df[,2:(N+1)], order.by=hyd_df$Date)
We can now plot the hydrographs of all N simulations.
# create a 'long' rather than wide tibble hyd_df_longer <- hyd_df %>% pivot_longer(cols=colnames(hyd_df)[-1], names_to="hyd_no", values_to="flow") # plot hydrographs with ggplot2 ggplot(hyd_df_longer, aes(x=Date, y=flow))+ geom_line(alpha=0.5, color='blue')
This plot can also be provided as a dygraph for easier viewing of results in an extended simulation period of three years.
dygraph(hyd_df_xts) %>% htmltools::tagList()
We can also plot the hydrographs using the information from the OW_PET_CORR values to colour the lines, rather than just the rather non-informative index of 1 to N. To achieve this, we add a column to our hyd_df_longer
data frame, and map the values of the ensemble (from 1-20) to the values from our sampled parameter values for OW_PET_CORR. These values can then be used directly in producing our hydrograph plot based on the OW_PET_CORR values rather than ensemble number.
hyd_df_longer$OW_PET_CORR <- plyr::mapvalues(hyd_df_longer$hyd_no, from=sprintf("hyd_%02d",seq(1,N)), to=values_OW_PET_CORR) hyd_df_longer$OW_PET_CORR <- as.numeric(hyd_df_longer$OW_PET_CORR)
Question: Create a new hydrograph plot with the ggplot2
library, where the colour is specified by the values of the OW_PET_CORR values rather than the ensemble number.
# plot hydrographs with ggplot2 ggplot(hyd_df_longer, aes(x=Date, y=flow, color=OW_PET_CORR))+ geom_line()
Question: Based on the hydrograph plot, how does the value of the OW_PET_CORR parameter affect the resulting hydrograph? Are there periods when the differences between the ensemble models are more or less pronounced?
One question that we may be interested in answering through ensembles is how the parameter values impact particular metrics. For example, we can determine how the OW_PET_CORR parameter affects metrics such as average annual peak flow values and average annual flow volume while all other parameters and inputs to the model are held fixed.
Question: Which of those two metrics experience a larger percent change relative to the base run? From a hydrological perspective, does it make sense that changing the values of a PET correction factor would affect one of these metrics more than the other? Compute the range of percent change for each metric in order to support the discussion.
avg_peak_flows <- rvn_apply_wyearly(hyd_df_xts, FUN=RavenR::cmax) %>% apply(., MARGIN=2, FUN=mean) %>% as.numeric() plot(values_OW_PET_CORR, avg_peak_flows, xlab="OW_PET_CORR", ylab="Average Annual Peak Flow (cms)", main="Average Annual Peak Flow by OW_PET_CORR values", pch=20)
base_peak_flow <- rvn_hyd_read("output_base/Hydrographs.csv")$hyd$SUB_63 %>% rvn_apply_wyearly(., FUN=RavenR::cmax) %>% apply(., MARGIN=2, FUN=mean) %>% as.numeric() # max, base, and min values of annual average peak flow c(max(avg_peak_flows), base_peak_flow, min(avg_peak_flows) ) # percent change in peak flow c( (max(avg_peak_flows) - base_peak_flow)/base_peak_flow*100, (min(avg_peak_flows) - base_peak_flow)/base_peak_flow*100 )
avg_volume <- rvn_apply_wyearly(hyd_df_xts, FUN=function(x) apply(x, 2, sum)) %>% apply(., MARGIN=2, FUN=mean) %>% as.numeric() plot(values_OW_PET_CORR, avg_volume, xlab="OW_PET_CORR", ylab="Average Annual Flow Volume (m3/s*d)", main="Average Annual Flow Volume by OW_PET_CORR values", pch=20)
base_volume <- rvn_hyd_read("output_base/Hydrographs.csv")$hyd$SUB_63 %>% rvn_apply_wyearly(., FUN=function(x) apply(x, 2, sum)) %>% apply(., MARGIN=2, FUN=mean) %>% as.numeric() # max, base, and min values of annual average peak flow c(max(avg_volume), base_volume, min(avg_volume) ) # percent change in peak flow c( (max(avg_volume) - base_volume)/base_volume*100, (min(avg_volume) - base_volume)/base_volume*100 )
\newpage
In this workflow, we see how changing a single parameter value can produce a distribution of results. In this exercise, create a new workflow that modifies the RAINSNOW_TEMP parameter and the DEP_SEEP_K parameter, in addition to the OW_PET_CORR parameter. You will need to determine an appropriate sampling distribution and range for each of these parameters.
In this workflow, undertake the following steps:
\newpage
This tutorial is meant to introduce some useful workflows in running model ensembles as a complement to the RavenR
package. If you have any comments, suggestions or bug reports, please leave a note on the issues page of the Github project (RavenR Github page), email the authors of the package, or feel free to let us know on the Raven forum.
Additional Raven materials can be found on the Raven downloads page, and additional RavenR vignettes can be found on Github in the vignettes folder.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.