knitr::opts_chunk$set(echo = TRUE)
options(width=80)

Introduction

This tutorial shows how to estimate stream metabolism using streamMetabolizer and some example data supplied by Bob Hall for French Creek in Laramie, WY.

There are four steps to fitting a metabolism model in streamMetabolizer.

  1. Prepare and inspect the input data.
  2. Choose a model configuration appropriate to your data.
  3. Fit the model.
  4. Inspect the output.

In this tutorial we will demonstrate these steps for a single model structure and a single set of specifications, but you should also consider other structures and should always tailor the specifications to your dataset and site knowledge. For more details, see:

Preliminaries

If you haven't already installed the package, see the Installation tutorial.

Next load the R libraries. Only streamMetabolizer is required to run models, but we'll also be using dplyr to inspect the results.

library(streamMetabolizer)
library(dplyr)

1. Preparing the input data

Load a small example dataset from the package (data are from French Creek in Laramie, WY, courtesy of Bob Hall). We'll use the streamMetabolizer standard in defining our day to run from 4 am (day_start=4) to 4 am (day_end=28).

dat <- data_metab(num_days='3', res='15', day_start=4, day_end=28)

See the ?metab help document and the Data Preparation tutorial for more on how to properly format and inspect a dataset.

2. Configuring the model

There are two steps to configuring a metabolism model in streamMetabolizer.

a. Identify the name of the model structure you want using mm_name(). b. Set the specifications for the model using defaults fromspecs() as a starting point.

2a. Choose a model structure

For this example, we will specify a Bayesian model with both observation error and process error. We won't pool K600 here because we don't have many days of data, but pooling is one feature that makes Bayesian models better than MLE models in general. Another great feature of Bayesian models is that they produce more accurate and nuanced confidence intervals.

bayes_name <- mm_name(type='bayes', pool_K600='none', err_obs_iid=TRUE, err_proc_iid=TRUE)
bayes_name

2b. Set the specifications

We now pass the model name to specs() to get a list of default specifications for this model.

bayes_specs <- specs(bayes_name)
bayes_specs

At this point we can alter some of the specifications if desired.

# one way to alter specifications: call specs() again
bayes_specs <- specs(bayes_name, burnin_steps=100, saved_steps=200, n_cores=1, GPP_daily_mu=3, GPP_daily_sigma=2)
# another way: use revise()
bayes_specs <- revise(bayes_specs, burnin_steps=100, saved_steps=200, n_cores=1, GPP_daily_mu=3, GPP_daily_sigma=2)

Here I've used a very small number of burnin_steps and saved_steps because I don't want to wait long for the vignette to run. When you run your own models, you should bump those numbers up substantially (to several thousand or so, depending on exactly which model you're using.)

Other Bayesian model specifications will also need your close attention when you're running your own models. See Bayesian Models for details.

3. Fitting the model

Once a model has been configured, you can fit the model to data with metab(). Bayesian models take a while to run, so be patient. Or switch to an MLE model if you can afford to sacrifice some accuracy for speed. (This small example usually takes about 30 seconds on my computer.)

mm <- metab(bayes_specs, data=dat)

4. Inspecting the model

Once you've fit a model, you can inspect the output with functions including predict_metab() and plot_metab_preds(), predict_DO() and plot_DO_preds(), get_params(), and get_fit().

Start by simply printing the model to the console.

mm

Here are the daily metabolism predictions from the model:

predict_metab(mm)
plot_metab_preds(mm)

You can inspect more of the fitted daily parameters, including K600, with get_params():

get_params(mm)

Here are the first few dissolved oxygen predictions from the model (DO.mod). They are returned along with the input data for convenience.

predict_DO(mm) %>% head()

And here are the dissolved oxygen predictions in a figure:

plot_DO_preds(mm)

For Bayesian models only, you can dig even deeper using get_mcmc, which returns a stanfit object that can be inspected using the rstan package. (These traceplots are pretty bad because we used so few MCMC iterations. You should strive for better in your final models.)

mcmc <- get_mcmc(mm)
rstan::traceplot(mcmc, pars='K600_daily', nrow=3)

The get_fit() function returns a list of data.frames, one per temporal resolution, containing all fitted values and details about their distributions and convergence. Here are just the overall metrics of model convergence (Rhats, or potential scale reduction statistics; see Gelman and Rubin 1992 or Brooks and Gelman 1998):

get_fit(mm)$overall %>%
  select(ends_with('Rhat'))

And here is a list of all column names available through get_fit():

get_fit(mm) %>%
  lapply(names)

You're on your way!



USGS-R/streamMetabolizer documentation built on Aug. 15, 2023, 7:50 a.m.