WtRegDO

Marcus W. Beck, mbeck@tbep.org

R-CMD-check build

This is the public repository of supplementary material to accompany the manuscript "Improving estimates of ecosystem metabolism by reducing effects of tidal advection on dissolved oxygen time series", published in Limnology and Oceanography Methods. The package includes a sample dataset and functions to implement weighted regression on dissolved oxygen time series to reduce the effects of tidal advection. Functions are also available to estimate net ecosystem metabolism using the open-water method.

The package can be installed from R-Universe as follows:

# Enable universe(s) by fawda123
options(repos = c(
  fawda123 = 'https://fawda123.r-universe.dev',
  CRAN = 'https://cloud.r-project.org'))

# Install the package
install.packages('WtRegDO')

Citation

Please cite this package using the manuscript.

Beck MW, Hagy III JD, Murrell MC. 2015. Improving estimates of ecosystem metabolism by reducing effects of tidal advection on dissolved oxygen time series. Limnology and Oceanography Methods. 13(12):731-745. DOI: 10.1002/lom3.10062

Functions

A sample dataset, SAPDC, is included with the package that demonstrates the required format of the data. All functions require data with the same format, with no missing values in the tidal depth column. See the help files for details.

# load library and sample data
library(WtRegDO)
head(SAPDC)

Before applying weighted regression, the data should be checked with the evalcor function to identify locations in the time series when tidal and solar changes are not correlated. In general, the wtreg function for weighted regression will be most effective when correlations between the two are zero, whereas wtreg will remove both the biological and physical components of the dissolved oxygen time series when the sun and tide are correlated. The correlation between tide change and sun angle is estimated using a moving window for the time series. Tide changes are estimated as angular rates for the tidal height vector and sun angles are estimated from the time of day and geographic location. Correlations are low for the sample dataset, suggesting the results from weighted regression are reasonable for the entire time series.

data(SAPDC)

# metadata for the location
tz <- 'America/Jamaica'
lat <- 31.39
long <- -81.28

# setup parallel backend
library(doParallel)
ncores <- detectCores()  
registerDoParallel(cores = ncores - 1)

# run the function
evalcor(SAPDC, tz, lat, long, progress = TRUE)

The wtreg function can be used to detide the dissolved oxygen time series. The example below demonstrates detiding, following by a comparison of ecosystem metabolism using the observed and detided data.

# run weighted regression in parallel
# requires parallel backend
library(doParallel)
ncores <- detectCores()  
registerDoParallel(cores = ncores - 1)

# metadata for the location
tz <- 'America/Jamaica'
lat <- 31.39
long <- -81.28

# weighted regression, optimal window widths for SAPDC from the paper
wtreg_res <- wtreg(SAPDC, parallel = TRUE, wins = list(3, 1, 0.6), progress = TRUE, 
  tz = tz, lat = lat, long = long)

# estimate ecosystem metabolism using observed DO time series
metab_obs <- ecometab(wtreg_res, DO_var = 'DO_obs', tz = tz, 
  lat = lat, long = long)

# estimate ecosystem metabolism using detided DO time series
metab_dtd <- ecometab(wtreg_res, DO_var = 'DO_nrm', tz = tz, 
  lat = lat, long = long)

The meteval function provides summary statistics of metabolism results to evaluate the effectiveness of weighted regression. These estimates are mean production, standard deviation of production, percent of production estimates that were anomalous, mean respiration, standard deviation of respiration, percent of respiration estimates that were anomalous, correlation of dissolved oxygen with tidal height changes, correlation of production with tidal height changes, and the correlation of respiration with tidal height changes. The correlation estimates are based on an average of separate correlations by each month in the time series. Dissolved oxygen is correlated directly with tidal height at each time step. The metabolic estimates are correlated with the tidal height ranges during the day for production and during the night for respiration.

In general, useful results for weighted regression are those that remove the correlation of dissolved oxygen, production, and respiration with tidal changes. Similarly, the mean estimates of metabolism should not change if a long time series is evaluated, whereas the standard deviation and percent anomalous estimates should decrease.

data(metab_obs)
data(metab_dtd)
# evaluate before weighted regression
meteval(metab_obs)

# evaluate after weighted regression
meteval(metab_dtd)

Plot metabolism results from observed dissolved oxygen time series (see ?plot.metab for options). Note the periodicity with fortnightly tidal variation and instances with negative production/positive respiration.

plot(metab_obs, by = 'days')

Plot metabolism results from detided dissolved oxygen time series.

data(metab_dtd)
plot(metab_dtd, by = 'days')

Optimization

A critical component of weighted regression is choosing the window widths. The chosen values affect the relative degree of smoothing in both predicted and detided dissolved oxygen time series. There are no strict rules for choosing window widths, but several rules of thumb can be applied that assess the relative ability of the method in removing the tidal signal in metabolism estimates. The meteval function measures many of these rules of thumb and optimization functions have been included in WtRegDO that attempt to identify window widths to achieve detided time series that satisfy these rules.

The winopt function attempts to find optimal window widths for a given time series. There are several functions used internally within winopt that quantify relative ability of a detided metabolism estimate to achieve the rules of thumb for a desired result. Specifically, improved estimates are assumed to have lower anomalies (less negative production and positive respiration values), lower standard deviation, and similar mean values for gross production and respiration between the observed and detided estimates. The detided and observed metabolism estimates are compared during the optimization routine using the objfun function that returns a single numeric value that is to be optimized as a measure of how well a detided metabolism estimate achieves the rules of thumb.

The objfun function evaluates the detided metabolism estimates based on a sum of percent differences for the six paired measures for percent anomalous production, percent anomalous respiration, mean production, mean respiration, standard deviation of production, and standard deviation of respiration for the estimates from the observed and detided metabolism. The comparisons of the means are taken as the inverse (1 / mean) such that optimization should attempt to keep the values as similar as possible. The final sum is multiplied by negative one such that the value is to be optimized by minimization, i.e., a lower value indicates improved detiding across all measures. The function can also quantify a comparison based on different measures supplied by the user. By default, all six measures are used. However, selecting specific measures, such as only optimizing by reducing anomalous values, may be preferred. Changing the argument for vls changes which comparisons are used for the summary value. In practice, choosing less rules of thumb to optimize is more likely to lead to an obtainable result for minimizing the objective function.

Using the winopt function requires similar inputs as the wtreg function.

# run optimiztaion in parallel
# requires parallel backend
library(foreach)
library(doParallel)

ncores <- detectCores()
cl <- makeCluster(ncores)
registerDoParallel(cl)

data(SAPDC)

tz <- 'America/Jamaica'
lat <- 31.39
long <- -81.28

# find optimal window widths for reducing anomalous metabolism estimates
winopt(SAPDC, tz = tz, lat = lat, long = long, wins = list(6, 6, 0.5), parallel = T, vls = c('AnomPg', 'AnomRt'))

stopCluser(cl)

The optimization function can take several hours to run and, if it works, should return three window widths for day, hour, and tide that achieve the rules of thumb. The above example should return window widths that minimize only the anomalous metabolism estimates. It's worth noting that the "optimization surface" for the objective function is very irregular and the optimization function may not converge to a solution or may be trapped in a local minima. For these reasons, it's often easier and less time consuming to create a regular grid of window widths to search and use the objfun function separately to manually identify an approximate solution. An example for using a grid search can be found here.

License

This package is released in the public domain under the creative commons license CC0.



fawda123/WtRegDO documentation built on March 18, 2024, 9:04 p.m.