Preliminaries

As demonstrated in the other vignettes, we have to get the basic flow and concentration data into a suitable format as a first step.

library(lubridate)
library(dplyr)

discharge_file     <- system.file( "extdata", "ROCKFLOW.DAT", package="autobeale")
concentration_file <- system.file( "extdata", "ROCKNO23.DAT", package="autobeale")

discharge_data     <- read.table( discharge_file, header=TRUE )
concentration_data <- read.table( concentration_file, header=TRUE )

discharge_conc_df  <- NULL
discharge_conc_df  <- dplyr::left_join(discharge_data, concentration_data, by=c("date" = "date") )

discharge_conc_df$discharge <- cfs_to_cms( discharge_conc_df$discharge )
names( discharge_conc_df )  <- c("date","discharge_cms","concentration_mg_L")

discharge_conc_df$date <- lubridate::ymd( discharge_conc_df$date )

Exploring the effect of partitions on the calculated load

Next we wire up a genetic algoritm package to test alternative partitioning strategies. Package mcga uses its evalFunc argument to pass a vector of fractions that define the partitioning scheme to the strata object, which evaluates the Beale load and returns the associated RMSE estimate. The vector of date fractions corresponds to the partition dates that will be used in the Beale calculations.

The genetic algorithm package, in turn, compares the current RMSE value with other previously calculated values (associated with other partitioning strategies), and 'evolves' a new partitioning strategy to test. This process is repeated hundreds or thousands of times in an attempt to find a optimum partitioning strategy.

library(mcga)
library(autobeale)

# create new strata object
mystrata <- strata$new( num_stratums=5, q_conc_df=discharge_conc_df )

# allow genetic algorithm to run the Beale calculation repeatedly, altering partition
# boundaries, and evaluating the RMSE
m <- mcga( popsize=200,
           chsize=4,
           minval=0.0,
           maxval=1.0,
           elitism=10,
           maxiter=40,
           crossprob=1.0,
           mutateprob=0.05,
           evalFunc=mystrata$update_rmse )

cat("Best chromosome:\n")
cat(m$population[1,],"\n")
cat("Cost: ",m$costs[1],"\n")

# create a copy of the strata object
best_strata <- mystrata$clone()

# rerun with the optimized parameter
best_strata$partitions$update( sort( m$population[ 1, ]) )
best_strata$rearrange_stratums( best_strata$partitions$get_date_values() )

best_strata$calc_loads()
results_df <- best_strata$summarize_results()

# round values to produce a more readable table
results_df <- data.frame(lapply(results_df, function(y) if(is.numeric(y)) round(y, 2) else y)) 

The results of the load calculation with optimum stratification using 5 stratums now looks like this:

knitr::kable( results_df, digits=2 )


smwesten-usgs/autobeale-R documentation built on May 30, 2019, 5:04 a.m.