library(rmarkdown) options(continue=" ") options(width=60) library(knitr) library(glmtools) library(GLM3r) knitr::opts_knit$set(root.dir = tempdir())
This vignette will give you a demonstration of the newly added automatic calibration method in glmtools
. Currently there are two optimization algorithms included: the evolutionary Covariance Matrix Adaption - Evolution Strategy (CMA-ES)
and the classical Nelder-Mead
algorithm. At the moment, you also only do a minimization of the RMSE to optimize the parameter space of given values. To avoid changing files in the package folder, we will work here in a temporary folder that will automatically deleted after you have closed R.
library(glmtools) GLM3r_folder = system.file('extdata', package = 'GLM3r') glmtools_folder = system.file('extdata', package = 'glmtools')
dir(GLM3r_folder) dir(glmtools_folder)
example.files <- dir(glmtools_folder) temp_folder <- tempdir() file.copy(list.files(glmtools_folder,full.names = T), temp_folder, overwrite = TRUE)
Note that new files have been added to the temporary directory:
dir(temp_folder)
We can use glmtools
functions to visualize and quantify the goodness of the fit of the uncalibrated setup, for instance by using the plotting and comparing functions when field data is available. But first, we need to declare the names of our output, configuration and field data variables.
out_file <- file.path(temp_folder, 'output/output.nc') field_data <- file.path(temp_folder, 'LakeMendota_field_data_hours.csv') file.copy('glm3_uncalibrated.nml', 'glm3.nml', overwrite = TRUE) nml_data <- file.path(temp_folder, 'glm3.nml')
The contents of glmtools_folder
(i.e., a glm3.nml
file and driver data including .csv
file for the meteorological drivers) are now located in a temporary folder (temp_folder
). The run_glm()
function will now return the results from the uncalibrated setup.
GLM3r::run_glm(sim_folder = temp_folder, verbose = F)
Then we can plot the uncalibrated setup and quantify the root-mean squared error between observed and simulated data.
plot_var_nc(out_file, var_name = 'temp')
plot_var_compare(nc_file = out_file, field_file = field_data,var_name = 'temp', precision = 'hours')
temp_rmse <- compare_to_field(out_file, field_file = field_data, metric = 'water.temperature', as_value = FALSE, precision= 'hours') print(paste('Total time period (uncalibrated):',round(temp_rmse,2),'deg C RMSE'))
Before we run the automatic calibration routine, we need to define certain input variables for the function. Here, we are trying to minimize the RMSE between observed and simulated water temperature data using the CMA-ES algorithm. To ensure that there is a reasonable covariance between variables, we are scaling all values on a space between 0 and 10. Further, to reduce computational times, we are stopping the algorithm either after reaching a RMSE of 1.0 degree Celsius, or after doing 100 iterations. Also, because we are working in a temporary directory, we are not saving any values. This approach will calibrate the wind_factor, the lw_factor, Kw and the sed_temp_mean values of two sediment zones with given lower and upper boundaries, as well as initial values.
We will also print out here the RMSE of every iteration, as well as diagnostic plots of the calibration, and contour maps of all time periods.
var = 'temp' # variable to which we apply the calibration procedure path = getwd() # simulation path/folder # obs = read_field_obs(field_data) # observed field data nml_file = nml_data # path of the nml configuration file glm_file = nml_data # path of the glm3.nml file calib_setup = get_calib_setup() # create a setup of variables for calibration glmcmd = NULL # command to be used, default applies the GLM3r function # Optional variables first.attempt = TRUE # if TRUE, deletes all local csv-files that stores the #outcome of previous calibration runs period = get_calib_periods(nml_file, ratio = 1) # define a period for the calibration, # this supports a split-sample calibration (e.g. calibration and validation period) scaling = TRUE # scaling of the variables in a space of [0,10]; TRUE for CMA-ES method = 'CMA-ES' # optimization method, choose either `CMA-ES` or `Nelder-Mead` metric = 'RMSE' # objective function to be minimized, here the root-mean square error target.fit = 1.0 # refers to a target fit of 1.0 degrees Celsius target.iter = 100 # refers to a maximum run of 100 calibration iterations plotting = FALSE # if TRUE, script will automatically save the contour plots output = out_file # path of the output file field_file = field_data # path of the field data conversion.factor = 1 # conversion factor for water quality (AED2 assumes mmol/m3) # main calibration function calibrate_sim(var, path, field_file, nml_file, glm_file, calib_setup, glmcmd, first.attempt, period, scaling, method, metric, target.fit, target.iter, plotting, output, conversion.factor, verbose = FALSE)
You can see that the RMSE fit of the total period was improved compared to the uncalibrated setup. You can also notice that the calibration fit is in most cases better than the validation fit, which makes sense and helps us to avoid-overfitting. The table at the end gives you the final calibrated values of all variables
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.