knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

wateRtemp - A toolbox for stream water temperature prediction

DOI R-CMD-check

wateRtemp is a machine learning toolbox for mean daily stream water temperature prediction, which was used to produce all results of the publication Machine-learning methods for stream water temperature prediction (Feigl et al., 2021). Please refer to this publication for detailed descriptions of the applied models and preprocessing steps. The code used to create the figures and tables of this publication is available here and might be interesting for further analyzing wateRtemp results.

If you have any questions regarding or want to report issues with the code, please do not hesitate to create an issue under wateRtemp/issues. I will update this library whenever new issues are posted.

Last update: August 30, 2021

Overview

wateRtemp includes 6 machine learning models with bayesian hyperparameter optimization:

The main functions are:

Additionally, a prepared synthetic data set for testing wateRtemp functionalities is included and can be used by running:

data("test_catchment")

Installation

You can install the released version of wateRtemp from GitHub with:

# install.packages("devtools")
devtools::install_github("MoritzFeigl/wateRtemp")

Example

This is a basic example of how to preprocess data and how to apply different machine learning algorithms using wateRtemp.

library(wateRtemp)

For using watRtemp preprocessing and modelling functions, the necessary data should be available as a data frame. Example data frames are included in wateRtemp and can be called by data("test_catchment").

# Provide the catchment data as a data frame
data("test_catchment")

The provided data should be a data frame which includes the columns:

For example, the test catchment has following structure:

str(test_catchment)

After loading the necessary data, we can use the watRtemp preprocessing function to apply feature engineering and data splits. The preprocessed data will be saved in the Catchment folder automatically. This function generates training and test data sets with the given fraction for the training data and automatically computes lagged version of all variables (except wt and time variables) with the given number of lags (nlags)

# Preprocess the data
wt_preprocess(test_catchment, nlags = 4, training_fraction = 0.8)

After preprocessing, the corresponding training and test datasets are stored in the catchment folder and can be loaded for using them in the models.

# Preprocess the data
train_data <- feather::read_feather("test_catchment/train_data.feather")
test_data <- feather::read_feather("test_catchment/test_data.feather")

Now we are ready to apply our machine learning models. For this example we run the most simple model available in wateRtemp: a multiple regression model using the function wt_lm(). All results and hyperparameter optimization scores are stored in the test_catchment folder, which is automatically created in the current working directory.

wt_lm(train_data = train_data, 
      test_data = test_data, 
      catchment = "test_catchment", 
      type = "LM", 
      cv_mode = "repCV", 
      model_name = "standard_LM")

Similar to multiple regression, we can easily apply other type of machine learning models on the preprocessed data. All models in include Bayesian hyperparemeter optimization and automatically storing of results and trained models.

Citation

If you use any of this code in your experiments, please make sure to cite the following publication

@article{Feigl2021,
author = {Feigl, Moritz and Lebiedzinski, Katharina and Herrnegger, Mathew and Schulz, Karsten},
doi = {10.5194/HESS-25-2951-2021},
journal = {Hydrology and Earth System Sciences},
month = {may},
number = {5},
pages = {2951--2977},
publisher = {Copernicus GmbH},
title = {{Machine-learning methods for stream water temperature prediction}},
volume = {25},
year = {2021}
}

License of the code

Apache License 2.0



MoritzFeigl/wateRtemp documentation built on Sept. 6, 2021, 6:58 a.m.