wateRtemp is a machine learning toolbox for mean daily stream water temperature prediction, which was used to produce all results of the publication Machine-learning methods for stream water temperature prediction (Feigl et al., 2021). Please refer to this publication for detailed descriptions of the applied models and preprocessing steps. The code used to create the figures and tables of this publication is available here and might be interesting for further analyzing wateRtemp results.
If you have any questions about the code or want to report a problem, please do not hesitate to create an issue under wateRtemp/issues. I will update this library whenever new issues are posted.
Last update: August 30, 2021
wateRtemp includes 6 machine learning models with Bayesian hyperparameter optimization.
The main functions are:
wt_preprocess() for preprocessing data for the machine learning models,
wt_lm() for multiple regression and step-wise linear regression,
wt_randomforest() for random forests,
wt_xgboost() for XGBoost (Extreme Gradient Boosting),
wt_fnn() for feedforward neural networks,
wt_rnn() for recurrent neural networks: LSTMs and GRUs.
Additionally, a prepared synthetic data set for testing wateRtemp functionalities is included and can be used by running:
data("test_catchment")
You can install the released version of wateRtemp from GitHub with:
# install.packages("devtools")
devtools::install_github("MoritzFeigl/wateRtemp")
This is a basic example of how to preprocess data and how to apply different machine learning algorithms using wateRtemp.
library(wateRtemp)
To use the wateRtemp preprocessing and modelling functions, the necessary data should be available as a data frame. An example data frame is included in wateRtemp and can be loaded with data("test_catchment").
# Provide the catchment data as a data frame
data("test_catchment")
The provided data should be a data frame which includes the columns:
year, month, day: The date given as three columns of integers.
wt: The mean daily stream water temperature data as numeric.
For example, the test catchment has the following structure:
str(test_catchment)
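To make the expected layout concrete, the following minimal sketch builds a small data frame with the required columns and checks their types. The values, the extra covariate air_temp, and the helper check_wt_input() are invented for illustration and are not part of wateRtemp.

```r
# Hypothetical example data; all values are invented for illustration only.
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 5)
my_catchment <- data.frame(
  year     = as.integer(format(dates, "%Y")),
  month    = as.integer(format(dates, "%m")),
  day      = as.integer(format(dates, "%d")),
  air_temp = c(1.2, 0.8, -0.5, 2.1, 3.0),  # hypothetical covariate
  wt       = c(3.1, 3.0, 2.8, 3.2, 3.5)    # mean daily stream water temperature
)

# Hypothetical helper (not a wateRtemp function): checks the columns described above.
check_wt_input <- function(df) {
  required <- c("year", "month", "day", "wt")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # The date columns must be integers and wt must be numeric.
  date_cols_ok <- all(vapply(df[c("year", "month", "day")], is.integer, logical(1)))
  date_cols_ok && is.numeric(df$wt)
}

check_wt_input(my_catchment)
```

A data frame that passes such a check has the column layout described above; any additional columns would be treated as predictor variables.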
After loading the necessary data, we can use the wateRtemp preprocessing function to apply feature engineering and data splits. The preprocessed data will be saved in the catchment folder automatically. This function generates training and test data sets with the given fraction for the training data (training_fraction) and automatically computes lagged versions of all variables (except wt and the time variables) with the given number of lags (nlags).
# Preprocess the data
wt_preprocess(test_catchment, nlags = 4, training_fraction = 0.8)
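As a rough illustration of what lagged variables and a training fraction mean, the sketch below builds lags and a split by hand in base R. It is not the implementation used by wt_preprocess(): the helper add_lags(), the column naming scheme, the toy values, and the assumption of a chronological split are all made up for this example.

```r
# Hypothetical helper (not part of wateRtemp): append lags 1..nlags of one column.
add_lags <- function(df, column, nlags) {
  n <- nrow(df)
  for (k in seq_len(nlags)) {
    # Shift the column down by k rows; the first k rows have no history.
    df[[paste0(column, "_lag", k)]] <- c(rep(NA, k), head(df[[column]], n - k))
  }
  df
}

# Invented toy series, assumed to be sorted by date already.
toy <- data.frame(
  air_temp = c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0),
  wt       = c(2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5)
)

lagged <- add_lags(toy, "air_temp", nlags = 2)
lagged <- lagged[complete.cases(lagged), ]  # drop rows without a full lag history

# Assumption: chronological split, with the first 80 % of rows used for training.
n_train <- floor(0.8 * nrow(lagged))
train <- lagged[seq_len(n_train), ]
test  <- lagged[-seq_len(n_train), ]
```

Each lag column holds the value of the variable k days earlier, so the model can use recent history as additional predictors.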
After preprocessing, the corresponding training and test datasets are stored in the catchment folder and can be loaded for use in the models.
# Load the preprocessed data
train_data <- feather::read_feather("test_catchment/train_data.feather")
test_data <- feather::read_feather("test_catchment/test_data.feather")
Now we are ready to apply our machine learning models. For this example we run the simplest model available in wateRtemp: a multiple regression model using the function wt_lm(). All results and hyperparameter optimization scores are stored in the test_catchment folder, which is automatically created in the current working directory.
wt_lm(train_data = train_data, test_data = test_data, catchment = "test_catchment", type = "LM", cv_mode = "repCV", model_name = "standard_LM")
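For readers unfamiliar with multiple regression, the following base R sketch shows the underlying idea on invented data: fit a linear model on the training rows and compute the root mean squared error on the held-out rows. It is only an illustration and does not reproduce the internals of wt_lm(); the variables air_temp and discharge, the coefficients, and the split are assumptions, and cross-validation is omitted.

```r
set.seed(1)  # reproducible invented data
n <- 100
air_temp  <- runif(n, min = -5, max = 25)
discharge <- runif(n, min = 0.5, max = 5)
toy <- data.frame(
  air_temp  = air_temp,
  discharge = discharge,
  # Invented linear relation plus noise, for illustration only.
  wt = 2 + 0.6 * air_temp - 0.3 * discharge + rnorm(n, sd = 0.5)
)

train <- toy[1:80, ]
test  <- toy[81:100, ]

# Multiple regression with base R; all non-target columns act as predictors.
fit <- lm(wt ~ ., data = train)

# Root mean squared error on the held-out rows.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$wt - pred)^2))
rmse
```

Evaluating on rows that were not used for fitting gives an estimate of how well the model generalizes to unseen days.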
Similar to multiple regression, other types of machine learning models can easily be applied to the preprocessed data. All models include Bayesian hyperparameter optimization and automatically store results and trained models.
wt_lm(type = "stepLM") for step-wise linear regression,
wt_randomforest() for random forests,
wt_xgboost() for XGBoost (Extreme Gradient Boosting),
wt_fnn() for feedforward neural networks,
wt_rnn() for recurrent neural networks: LSTMs and GRUs.
If you use any of this code in your experiments, please make sure to cite the following publication:
@article{Feigl2021,
  author = {Feigl, Moritz and Lebiedzinski, Katharina and Herrnegger, Mathew and Schulz, Karsten},
  doi = {10.5194/HESS-25-2951-2021},
  journal = {Hydrology and Earth System Sciences},
  month = {may},
  number = {5},
  pages = {2951--2977},
  publisher = {Copernicus GmbH},
  title = {{Machine-learning methods for stream water temperature prediction}},
  volume = {25},
  year = {2021}
}