R/e2e_run_mc.R

Defines functions e2e_run_mc

Documented in e2e_run_mc

#
# e2e_run_mc.R
#
#' Run a Monte Carlo simulation with StrathE2E and derive centiles of credible values of model outputs.
#'
#' The Monte Carlo scheme provided with the StrathE2E2 package has two modes of operation. The first is referred to as
#' ‘baseline mode’. This involves generating a list of ecology model parameter sets and, for each set, determining the
#' likelihood of observational data consistent with the environmental and fishery drivers. The model outputs
#' generated with each set are then weighted by the likelihood before deriving quantiles of their distribution.
#' The quantile ranges then represent credible intervals of the model outputs.
#'
#' The second mode of operation is referred to as the ‘scenario mode’. In this case, the parameter sets and their
#' associated likelihoods from a baseline mode simulation, are used to generate model outputs for scenarios of
#' environmental or fishing drivers – e.g. increased activity by selected gears, or warmer sea temperatures.
#' These scenario outputs are then weighted by the baseline mode likelihoods before derivation of quantiles
#' and credible intervals.
#'
#' In baseline mode, the Monte Carlo scheme generates an ensemble of model runs by sampling the ecology model parameters from uniform distibutions
#' centred on an initial (baseline) parameter set. Ideally, this should be the maximum likelihood parameter set arrived at by prior application the various 
#' optimization functions in the package to 'fit' the model to a suite of observational data on the state of ecosystem
#' (target data for fitting are in /Modelname/Variantname/Target/annual_observed_*.csv).
#' Each iteration in the Monte Carlo scheme then generates a unique time series of model outputs at daily intervals,
#' together with the overall likelihood of the obervational target data. From these data, the function then derives credible intervals of each output by
#' weighting the values from each parameter set by the associated likelihood.
#'
#' Running a scenario mode simulation requires a baseline model to have been previously completed, in order to generate a list of parameter sets and associated
#' likelihoods. In scenario mode, model driving data (e.g. temperatures or fishing activities) are varied from the baseline configuration by editing
#' the model object generated by a e2e_read() function call, and the model is then run with the existing baseline model parameter sets, and the results
#' weighted by the baseline mode likelihoods in order to derive the credible intervals.
#'
#' The choice of baseline vs scenario mode of simulations is determined in the function by a logical argument setting 'baseline.mode'.
#' 
#' In baseline mode, the coefficients of variation for jiggling the ecology parameter can be varied in real-time
#' during the run by editing the file "monte_carlo.csv" in the folder
#' /Param/control/ of the model version. However, it is recommended to leave the setting constant for the duration of a run.
#' A CV of 0.10 to 0.15 is recommended. If comparing the credible intervals for two different baseline models then it is important to use the same CV in both cases.
#'
#' In scenario model, the parameter sets are pre-ordained by a prior baseline simulation, so the CV setting is ineffective.
#'
#' On completion of all the iterations of the model, whether in baseline or scenarion mode, for each model output variable in turn, values from the individual runs and
#' their associated model likelihoods are assembled as a list of paired values. The list is then sorted by ascending values of the model variable,
#' and the cumulative likelihoods with increasing value of model variable is calculated. Values for the model variable at standard proportions
#' (0.005, 0.25, 0.5, 0.75, 0.995) of the maximum cumulative likelhood are then extracted by interpolation.
#' These represent the credible intervals of model output given uncertainty in the ecology model parameters.
#'
#' Outputs from the e2e_run_mc() function depend on whether post-processing is selected. If not, then raw data are saved as csv files
#' (provided that the argument csv.output=TRUE), and the function also returns a dataframe of the parameter sets used in the run and their corresponding likelihoods. In scenario mode this is just a replica of the 
#' baseline mode data that have been used to run the simulation. If post-processing is selected, then 
#' both the raw and processed credible interval data are saved to csv files (provided that the argument csv.output=TRUE), and the function returns a list object which includes the parameter
#' dataframe plus all the processed credible interval data structures. csv files are saved in the folder /results/Modelname/Variantname/CredInt/ with an
#' identifier for the simulation (model.ident) created by the e2e_read() function. There are two types of output. First is simply an
#' accumulation of all the standard outputs from StrathE2E on a run-by-run basis. The second is a set of files containing the derived credible intervals.
#'
#' The function displays a real-time graphic to show the progress of the simulation. During the StrathE2E-running phase of the
#' process an x-y graph is updated after each iteration (x-axis=iterations, y-axis=Likelihood of the target data), with a horizontal grey line showing the
#' baseline (maximum liklihood) result, and black symbols showing the likelihood for each iteration based on the parameter values sampled from the baseline.
#' The y-axis limits for the graph can be varied in real-time
#' during the run by editing the file "monte_carlo.csv" in the folder /Param/control/ of the model version.
#'
#' During the post-processing phase, a message is displayed as each output variable is processed.
#'
#' WARNING - the scheme can take a long time to run (~2 days with the default settings), and generate large output files (11 files total 3.2 Mb per iteration). 
#' Check the memory capacity of you machine before starting a long run. It can make sense to 
#' parallelize the process by splitting across different processors and merging the results afterwards. The package contains a function for merging cumulative output files from 
#' multiple runs and post-processing the combined data.
#'
#' In baseline mode, parallel instances of the function can be implemented on different processors, all starting from the same
#' initial parameter set. During the subsequent merging process, the outputs from the first parameter set, of all but the first instance, are stripped away and discarded.
#'
#' Impementing parallel runs in scenario mode requires a different approach. Here, the parameter sets are pre-ordained and it is important to avoid using duplicate sets in the simulations.
#' Hence, the function includes an argument 'begin.sample' to select a pointer in the baseline parameter set to begin sampling in any scenario mode run. For example, in the first instance, begin.sample=1, and the number
#' of iterations (n_iter) might be set to e.g. 200. For the second instance the pointer (begin.sample) would then be set to 201, and so on. If begin.sample > 1, the function runs the first baseline parameter set (sample 1)
#' first before proceeding to the pointer begin.sample and completing the requested number of iterations, so there will be one extra iteration in the output files. If the requested number of iterations 
#' means that sampling would over the end of available number of parameter sets
#' then the number of iterations is reduced accordingly.
#'
#' Although the e2e_run_mc() function generates large raw data files (11 files total 3.2 Mb per iteration), the processed data are much smaller (13 files total ~4 Mb regardless of
#' the nunber of iterations).
#'
#' @param model R-list object defining the model configuration (baseline or scenario), compiled by the e2e_read() function.
#' @param nyears Number of years to run each instance of the model. Needs to be long enough to allow the model to attain a stationary state (default=50).
#' @param baseline.mode Logical. If TRUE then the simulation adopts a baseline mode. If FALSE then scenario mode (default=TRUE).
#' @param use.example.baseparms Logical. Value required only if baseline.mode=FALSE. If TRUE then the baseline parameters set is drawn from example data provided with the package. If FALSE then a user generated baseline parameter set is expected. 
#' @param baseparms.ident  Default = "". Value required if baseline.run=FALSE and use.example.baseparms=FALSE, in which case this is the model.ident string of the a user generated baseline parameter set. Remember to include the phrase within "" quotes.
#' @param begin.sample Default = 1. Value required if baseline.mode=FALSE. Value sets the first row in the file of baseline parameter data from which the sequence of parameters sets to be used in this run will be taken. Set a value > 1 if this run is part of a parallel set of runs.
#' @param n_iter Number of iterations of the model (default=1000). If baseline.mode=FALSE and the number of sets in the supplied baseline parameter file beyond the value of begin.sample is less than n_iter, then n_iter is reset to the remaining number available.
#' @param csv.output Logical. If TRUE then enable writing of csv output files (default=FALSE).
#' @param runtime.plot Logical. If FALSE then disable runtime plotting of the progress of the run - useful for testing (default=TRUE).
#' @param postprocess Logical. If FALSE then disable post-processing of the cumulative outputs, for example in parallel mode where multiple runs are going to be combined later (default=TRUE).
#'
#' @return Depends on argument settings. List object containing processed data if postprocess=TRUE, otherwise a NULL object; csv files of raw and if processed data if csv.output=TRUE; real-time graphical displays during the simulation if runtime.plot=TRUE
#'
#' @seealso \code{\link{e2e_read}}, \code{\link{e2e_merge_sens_mc}}, \code{\link{e2e_process_sens_mc}}, \code{\link{e2e_plot_sens_mc}}
#'
#' @importFrom stats runif
#' @importFrom grDevices dev.off
#'
#' @export
#'
#' @examples
#' # The examples provided here are illustration of how to set up and run Monte Carlo
#' # analyses. Even though they are stripped-down minimalist examples they each still
#' # take 10-15 minutes to run.
#'
#' # --------------------------------------------------------------------------
#'
#' \donttest{
#' # A quick demonstration of baseline mode with postprocessing, but csv output disabled.
#' # Load the 1970-1999 version of the North Sea model supplied with the package
#' # Run the Monte Carlo process and generate some data but disable csv output.
#' #    model <- e2e_read("North_Sea", "1970-1999")
#' #    demo <- e2e_run_mc(model,nyears=2,baseline.mode=TRUE,n_iter=5,csv.output=FALSE)
#' # View the structure of the returned list:
#' #    str(demo,max.level=1)
#' # View the structure of a returned list element containing sub-sets:
#' #    str(demo$CI_annual_avmass,max.level=1)
#' # View the structure of a returned list element containing sub-sets:
#' #    str(demo$CI_annual_fluxes,max.level=1)
#' # View the top few rows on the whole domain data on annual averge mass:
#' #    head(demo$CI_annual_avmass$whole)
#' }
#'
#' # --------------------------------------------------------------------------
#'
#' # Dummy example to illustrate a more meaningful run. Output directed to 
#' # "../Folder/results" relative to the current working directory (REPLACE with your
#' # own results.path before running):
#' #    model <- e2e_read("North_Sea", "1970-1999",results.path="Folder/results")
#' #    mc_results <- e2e_run_mc(model,baseline.mode=TRUE,nyears=50,n_iter=1000,csv,output=TRUE)
#' # WARNING: This run will take about 48h to complete, much better to split up and spread
#' # across multiple processors !
#'
#' # --------------------------------------------------------------------------
#'
#' \donttest{
#' # A quick demonstration run in baseline mode, save the data to a temporary folder:
#'     basemodel <- e2e_read("North_Sea", "1970-1999",model.ident="mcbaseline")
#'     basedemo <- e2e_run_mc(basemodel,nyears=2,baseline.mode=TRUE,
#'                            n_iter=5,csv.output=TRUE)
#' }
#' 
#' # --------------------------------------------------------------------------
#
#' \donttest{
#' # Then a quick demonstration run in scenario mode using the saved baseline parameter data,
#' # from the previous example, and save to csv. It is assumed that the baseline parameter 
#' # data are in the temporary folder in which they were created, and that the same temporary
#' # folder is used for this example.
#' # First create an extreme fishing scenario - quadruple some gear activities, run for 10 years
#'     scenariomodel<-basemodel
#'     scenariomodel$setup$model.ident <- "mcscenario"
#' # Gear 1 (Pelagic trawls) activity rate rescaled to 4*baseline:
#'     scenariomodel$data$fleet.model$gear_mult[1] <- 4
#' # Gear 4 (Beam_Trawl_BT1+BT2) activity rate rescaled to 4*baseline:
#'     scenariomodel$data$fleet.model$gear_mult[4] <- 4 
#'     scendemo <- e2e_run_mc(scenariomodel,nyears=2,baseline.mode=FALSE, 
#'                            use.example.baseparms=FALSE, baseparms.ident="mcbaseline",
#'                            begin.sample=1, n_iter=5,csv.output=TRUE)
#' 
#' # Compare the results of the baseline and scenario:
#'     basemodel <- e2e_read("North_Sea", "1970-1999", model.ident="mcbaseline")
#'     scenariomodel <- e2e_read("North_Sea","1970-1999",model.ident="mcscenario")
#'     e2e_compare_runs_box(selection="ANNUAL", model1=basemodel, ci.data1=TRUE, use.saved1=TRUE,
#'                          model2=scenariomodel, ci.data2=TRUE, use.saved2=TRUE )
#'     dev.new()
#'     e2e_compare_runs_box(selection="MONTHLY", model1=basemodel, ci.data1=TRUE, use.saved1=TRUE,
#'                          model2=scenariomodel, ci.data2=TRUE, use.saved2=TRUE )
#' }
#'
#' # --------------------------------------------------------------------------
#'
#' \donttest{
#' # Quick demonstration of parallelizing the process in baseline mode, with output folder
#' # to a temporary folder. To explore further details of results.path="YourFolder"
#' # relative to the current working directory.
#' # Launch batch 1 (on processor 1):
#'     model1 <- e2e_read("North_Sea", "1970-1999",  model.ident="BATCH1")
#'     results1 <- e2e_run_mc(model1,nyears=2,baseline.mode=TRUE,
#'                            n_iter=5,csv.output=TRUE,postprocess=FALSE)                
#' # Launch batch 2 (on processor 2):
#'     model2 <- e2e_read("North_Sea", "1970-1999", model.ident="BATCH2")
#'     results2 <- e2e_run_mc(model2,nyears=2,baseline.mode=TRUE,
#'                            n_iter=6,csv.output=TRUE,postprocess=FALSE) 
#' # Note that these two runs return only raw data since postprocess=FALSE
#' 
#' # Note that Batch 2 requests 6 iterations, rather that 5 in Batch 1.
#' # The number of iterations do not have to be the same in each batch.
#' # However, the first in each batch has to use the initial parameter set from the model setup,
#' # as this is the parent for all the subsequent samples generated during the run.
#' # When we come to merge the data from separate batches, data from the first iteration are
#' # stripped off and discarded for all but the first batch as we do not want to include duplicate
#' # data in the combined files. Hence we choose 6 iterations here in Batch 2 to make the point,
#' # and we expect the combined data to include 10 iterations.
#' #
#' # Then, afterwards, merge the two raw results files with text-tags BATCH1 and BATCH2,
#' # and post process the combined file:
#'     model3 <- e2e_read("North_Sea", "1970-1999", model.ident="COMBINED")
#'     processed_data <- e2e_merge_sens_mc(model3, selection="MC",
#'                       ident.list<-c("BATCH1","BATCH2"), postprocess=TRUE, csv.output=TRUE)
#' # or...
#'     combined_data <- e2e_merge_sens_mc(model3, selection="MC",
#'                      ident.list<-c("BATCH1","BATCH2"), postprocess=FALSE, csv.output=TRUE)
#'     processed_data<-e2e_process_sens_mc(model3,selection="MC",use.example=FALSE,
#'                                         csv.output=TRUE)
#' 
#' # Plot the parameter likelihood distributions from the combined data
#'     e2e_plot_sens_mc(model3, selection="MC")
#' }
#'
#' # --------------------------------------------------------------------------
#'
#' \donttest{
#' # Example of parallelizing the process in scenario mode, using the baseline parameter
#' # sets 'COMBINED' from above (assuming that this is sitting in the same temporary folder
#' # in which it was created in the example above.
#' # The activity of all fishing gears is reduced to zero to create a no-fishing scenario.
#' # Run each batch for 10 years as a relatively quick demo - a real run would need to
#' # run for at least 40 year
#' # Launch batch 1 (in processor 1):
#'     model1s <- e2e_read("North_Sea", "1970-1999", model.ident="BATCH1_S")
#' # Activity rates of all 12 gears rescaled to 0*baseline:
#'     model1s$data$fleet.model$gear_mult[1:12] <- 0
#'     results1s <- e2e_run_mc(model1s,nyears=2,baseline.mode=FALSE, baseparms.ident="COMBINED",
#'                  begin.sample=1, n_iter=5,csv.output=TRUE,postprocess=FALSE)                
#' # Launch batch 2 (on processore 2):
#'     model2s <- e2e_read("North_Sea", "1970-1999", model.ident="BATCH2_S")
#' # Activity rates of all 12 gears rescaled to 0*baseline:
#'     model2s$data$fleet.model$gear_mult[1:12] <- 0
#'     results2s <- e2e_run_mc(model2s,nyears=2,baseline.mode=FALSE, baseparms.ident="COMBINED",
#'                  begin.sample=6, n_iter=5,csv.output=TRUE,postprocess=FALSE)
#' # Note that Batch 1 samples rows 1:5 of the baseline mode parameter set archive "COMBINED"
#' # (begin.sample=1, n_iter=5), so Batch 2 needs to start sampling at row 6 (begin.sample=6).
#' # The baseline archive contains 10 rows, so Batch 2 has the capacity to undertake up to 5
#' # sample iterations (rows 6:10). If we select more than 5 iterations (e.g. n_iter=8) then
#' # the code will automatically restrict to 5. Note that in fact, to be consistent with the 
#' # format of output files from the baseline mode, each scenario mode run where 
#' # 'begin.sample' > 1 will complete n_iter+1 iterations, by adding an initial run using the 
#' # parameter values from row 1 of the baseline parameter set - which is then stripped off 
#' # during merging. 
#' #
#' # Then, merge the two raw results files with text-tags BATCH1_S and BATCH2_S, and post
#' # process the combined file:
#'     model3s <- e2e_read("North_Sea", "1970-1999", model.ident="COMBINED_S")
#'     processed_datas <- e2e_merge_sens_mc(model3s, selection="MC",
#'                        ident.list<-c("BATCH1_S","BATCH2_S"), postprocess=TRUE, csv.output=TRUE)
#' 
#' # Finally plot comparisons of the baseline and scenario model runs:
#'     e2e_compare_runs_box(selection="ANNUAL", model1=model3, ci.data1=TRUE, use.saved1=TRUE,
#'                          model2=model3s, ci.data2=TRUE, use.saved2=TRUE )
#'     dev.new()
#'     e2e_compare_runs_box(selection="MONTHLY", model1=model3, ci.data1=TRUE, use.saved1=TRUE,
#'                          model2=model3s, ci.data2=TRUE, use.saved2=TRUE )
#' }
#'
#' # --------------------------------------------------------------------------
#'
#
# ---------------------------------------------------------------------
# |                                                                   |
# | Authors: Mike Heath, Ian Thurlbeck                                |
# | Department of Mathematics and Statistics                          |
# | University of Strathclyde, Glasgow                                |
# |                                                                   |
# | Date of this version: May 2020                                    |
# |                                                                   |
# ---------------------------------------------------------------------

e2e_run_mc <- function(model, nyears=50, baseline.mode=TRUE, use.example.baseparms=FALSE, baseparms.ident="", begin.sample=1, n_iter=1000, csv.output=FALSE, runtime.plot=TRUE, postprocess=TRUE) {

   oo <- options()
   on.exit(options(oo))

	if (use.example.baseparms == TRUE) hasExamples()

if(runtime.plot==TRUE){
start_par = par()$mfrow
dev.off()
on.exit(par(mfrow = start_par))
}

	pkg.env$csv.output <- csv.output	# controls writing of CSV files

	setup			<- elt(model, "setup")
	model.path		<- elt(setup, "model.path")
	model.name 		<- elt(model, "setup", "model.name")
	model.variant 		<- elt(model, "setup", "model.variant")
	resultsdir		<- elt(setup, "resultsdir")
	identifier		<- elt(setup, "model.ident")

	data			<- elt(model, "data")
	fitted.parameters	<- elt(data, "fitted.parameters")

	credpath		<- makepath(resultsdir, CREDINT_DIR)

if(baseline.mode==FALSE){

        if(use.example.baseparms==TRUE){
		scenario_parameters <- get.example.results(model.name, model.variant, "CredInt_cumulative_parameters", CREDINT_DIR)
	}

        if(use.example.baseparms==FALSE){
		if(baseparms.ident==""){
			stop("Error: Scenario run with user-generated parameter input selected, but no baseline parameter file ident defined !\n")
		}
		parmfile <- csvname(credpath, "CredInt_cumulative_parameters", baseparms.ident)
		check.exists(parmfile)
		message("Reading stored parameter values from '", parmfile)

		scenario_parameters <- readcsv(parmfile)
	}

	# The parameters to be used in the scenario run are now in the dataframe scenario_parameters
	# First column is iteration number in the baseline run
	# Last column is the likelihood of the target data given teh parameter set in the baseline run

	parmcols<-(2:(ncol(scenario_parameters)-1))
	if(length(parmcols) != length(fitted.parameters)){
	stop("Error: Something is wrong with the baseline parameter file (wrong number of columns) ! \n")
	}

	checkparms<-0
	for(jj in 1:length(fitted.parameters)){
	if(fitted.parameters[jj] != scenario_parameters[1,jj+1]) checkparms<-checkparms+1
	}
	if(checkparms>0){
		message("Warning: ",checkparms," parameters in the first row of the baseline set differ from those in the model setup !")
	}

	parmrows<-nrow(scenario_parameters)

	if(begin.sample >= parmrows){
	stop("Requested start row in the baseline parameter file is greater than the available number of rows ! \n")
	}

	xend.sample<-begin.sample + n_iter - 1

	if(xend.sample>parmrows){
	end.sample<-parmrows
	n_iter<-end.sample - begin.sample + 1
	message("Warning: Fewer remaining baseline parameter sets available than the requested number of iterations. n_iter reset to ",n_iter," plus baseline row 1 if necessary !")
	}

	if(xend.sample<=parmrows){
	end.sample<-xend.sample
	}

	sample.seq<- (begin.sample : end.sample)  # Sets the sequence of rows to be drawn from the baseline parameter file if baseline.mode=FALSE

	if(begin.sample>1){                       # Except that if begin.sample > 1 then always add an extra iteration with parameters from row 1 at the start
	sample.seq<-c(1,sample.seq)
	n_iter<-n_iter+1	
	}

}



	print(date())

	datastore <- c(
		fitted.parameters,
		annual_obj = 1e-60
	)

	for (kkk in 1:n_iter){

		pc_complete <- kkk*100/n_iter
		message("\n",sprintf("%.3f",pc_complete),"% complete; iteration: ",kkk," of ",n_iter,"     ",appendLF=TRUE)

		annealing.parms <- read_SD_credint(model.path)		# read every loop so control is available during a run

		if(kkk==1){
			axmin <- elt(annealing.parms, "axmin")
			axmax <- elt(annealing.parms, "axmax")
			axmin_p<-axmin
			axmax_p<-axmax
			newglims<-1
		}
		if(kkk>1){
			newglims<-0
			axmin <- elt(annealing.parms, "axmin")
				if((axmin==axmin_p)==FALSE){
					newglims<-1
				}
			axmax <- elt(annealing.parms, "axmax")
				if((axmax==axmax_p)==FALSE){
					newglims<-1
				}
			axmin_p<-axmin
			axmax_p<-axmax
		}
		
		if( axmax<=axmin ) {
		message("Warning: Plot axis max <= axis min in the monte_carlo.csv control file - reset to defaults")
		axmax<-1
		axmin<-0
		newglims<-1
		}

		if( elt(annealing.parms, "Prefsd") < 0 ) {
		stop("Error: range setting for preference parameter variations must be positive.")
		}
		if( elt(annealing.parms, "u_sd") < 0 ) {
		stop("Error: range setting for maximum uptake rate parameter variations must be positive.")
		}
		if( elt(annealing.parms, "h_sd") < 0 ) {
		stop("Error: range setting for uptake half saturation parameter variations must be positive.")
		}
		if( elt(annealing.parms, "biogeo_sd") < 0 ) {
		stop("Error: range setting for biogeochemistry parameter variations must be positive.")
		}
		if( elt(annealing.parms, "mort_sd") < 0 ) {
		stop("Error: range setting for mortality parameter variations must be positive.")
		}
		if( elt(annealing.parms, "ressd") < 0 ) {
		stop("Error: range setting for other parameter variations must be positive.")
		}

		if (kkk==1) annealing.parms[] <- 0.0

		#....................................................................
		if(baseline.mode==TRUE){
		 perturbed <- perturb_parameters_all(datastore, annealing.parms)
		# perturbed parameters built into model.parameters for run except for the first run  :
		if(kkk>1) model$data$fitted.parameters <- perturbed[1:(length(perturbed)-1)]
		}
		#....................................................................
		if(baseline.mode==FALSE){
		parameters_to_use <- scenario_parameters[sample.seq[kkk],parmcols]
		# parameters to be used built into model.parameters for run  :
		model$data$fitted.parameters <- parameters_to_use
		}
		#....................................................................


		results <- e2e_run(model, nyears=nyears, csv.output=FALSE)

		#....................................................................
		if(baseline.mode==TRUE){
		annual_obj <- elt(results, "final.year.outputs", "annual_obj")
		if(kkk==1) Results_to_store<-c(iteration=kkk,datastore[1:(length(datastore)-1)],likelihood=annual_obj)
		if(kkk>1) Results_to_store<-c(iteration=kkk,perturbed[1:(length(perturbed)-1)],likelihood=annual_obj)
		}
		#....................................................................
		if(baseline.mode==FALSE){
		annual_obj <- scenario_parameters[sample.seq[kkk],ncol(scenario_parameters)]
		Results_to_store<-scenario_parameters[sample.seq[kkk],]
		}
		#....................................................................


		#Merge derived outputs with iteration number and liklihood

		opt_results <- elt(results, "final.year.outputs", "opt_results")

		opt_results_to_store<-c(iteration=kkk,annual_obj,opt_results[,3])
		names(opt_results_to_store)<-c("iteration","likelihood",as.vector(opt_results$Description))

		monthlyfinal <- elt(results, "final.year.outputs", "monthly.averages")
		monthlyfinal$month<-seq(1,12)
		monthlyfinal$iteration<-kkk
		monthlyfinal$likelihood<-annual_obj

		#-------------------------------------------------------------------------------------------------------

		#Extract the final year of model output and volumetric data to be saved for later analysis
		lastyear <- extract_final_year_of_run_plus_volumetric_data(model, results)

		lastyear$iteration<-kkk
		lastyear$likelihood<-annual_obj

		#--------------------------------------------------------------------------------

		mass_results <- elt(results, "final.year.outputs", "mass_results_inshore")
		annual_flux_results <- elt(results, "final.year.outputs", "annual_flux_results_inshore")
		inshoreaamass<-mass_results[,1]
		inshoreaamass<-c(kkk,annual_obj,inshoreaamass)
		names(inshoreaamass)<-c("iteration","likelihood",mass_results$Description)
		inshore_annual_flux<-annual_flux_results[,1]
		inshore_annual_flux<-c(kkk,annual_obj,inshore_annual_flux)
		names(inshore_annual_flux)<-c("iteration","likelihood",annual_flux_results$Description)

		mass_results <- elt(results, "final.year.outputs", "mass_results_offshore")
		annual_flux_results <- elt(results, "final.year.outputs", "annual_flux_results_offshore")
		offshoreaamass<-mass_results[,1]
		offshoreaamass<-c(kkk,annual_obj,offshoreaamass)
		names(offshoreaamass)<-c("iteration","likelihood",mass_results$Description)
		offshore_annual_flux<-annual_flux_results[,1]
		offshore_annual_flux<-c(kkk,annual_obj,offshore_annual_flux)
		names(offshore_annual_flux)<-c("iteration","likelihood",annual_flux_results$Description)

		mass_results <- elt(results, "final.year.outputs", "mass_results_wholedomain")
		annual_flux_results <- elt(results, "final.year.outputs", "annual_flux_results_wholedomain")
		wholeaamass<-mass_results[,1]
		wholeaamass<-c(kkk,annual_obj,wholeaamass)
		names(wholeaamass)<-c("iteration","likelihood",mass_results$Description)
		whole_annual_flux<-annual_flux_results[,1]
		whole_annual_flux<-c(kkk,annual_obj,whole_annual_flux)
		names(whole_annual_flux)<-c("iteration","likelihood",annual_flux_results$Description)

		networkdata_results <- elt(results, "final.year.outputs", "NetworkIndexResults")
		networkdata<-networkdata_results[,1]
		networkdata<-c(kkk,annual_obj,networkdata)
		names(networkdata)<-c("iteration","likelihood",rownames(networkdata_results))
 

		#Now add the new sets of results to the previous ones if kkk>1 or set up the results store if kkk=1

		if(kkk==1) {

			lastyearstore<-lastyear

			parameterstore<-data.frame(Results_to_store[1])
			for(jjjj in 2:length(Results_to_store)) {
				parameterstore[,jjjj]<-Results_to_store[jjjj]
			}
			names(parameterstore)<-names(Results_to_store)
			rownames(parameterstore)<-NULL

			optresultsstore<-data.frame(opt_results_to_store[1])
			for(jjjj in 2:length(opt_results_to_store)) {
				optresultsstore[,jjjj]<-opt_results_to_store[jjjj]
			}
			names(optresultsstore)<-names(opt_results_to_store)
			rownames(optresultsstore)<-NULL

			monthlystore<-monthlyfinal

			inshoreaamassstore<-data.frame(inshoreaamass[1])
			for(jjjj in 2:length(inshoreaamass)) {
				inshoreaamassstore[,jjjj]<-inshoreaamass[jjjj]
			}
			names(inshoreaamassstore)<-names(inshoreaamass)
			rownames(inshoreaamassstore)<-NULL

			offshoreaamassstore<-data.frame(offshoreaamass[1])
			for(jjjj in 2:length(offshoreaamass)) {
				offshoreaamassstore[,jjjj]<-offshoreaamass[jjjj]
			}
			names(offshoreaamassstore)<-names(offshoreaamass)
			rownames(offshoreaamassstore)<-NULL

			wholeaamassstore<-data.frame(wholeaamass[1])
			for(jjjj in 2:length(wholeaamass)) {
				wholeaamassstore[,jjjj]<-wholeaamass[jjjj]
			}
			names(wholeaamassstore)<-names(wholeaamass)
			rownames(wholeaamassstore)<-NULL

			inshoreannualfluxstore<-data.frame(inshore_annual_flux[1])
			for(jjjj in 2:length(inshore_annual_flux)) {
				inshoreannualfluxstore[,jjjj]<-inshore_annual_flux[jjjj]
			}
			names(inshoreannualfluxstore)<-names(inshore_annual_flux)
			rownames(inshoreannualfluxstore)<-NULL

			offshoreannualfluxstore<-data.frame(offshore_annual_flux[1])
			for(jjjj in 2:length(offshore_annual_flux)) {
				offshoreannualfluxstore[,jjjj]<-offshore_annual_flux[jjjj]
			}
			names(offshoreannualfluxstore)<-names(offshore_annual_flux)
			rownames(offshoreannualfluxstore)<-NULL

			wholeannualfluxstore<-data.frame(whole_annual_flux[1])
			for(jjjj in 2:length(whole_annual_flux)) {
				wholeannualfluxstore[,jjjj]<-whole_annual_flux[jjjj]
			}
			names(wholeannualfluxstore)<-names(whole_annual_flux)
			rownames(wholeannualfluxstore)<-NULL

			networkstore<-data.frame(networkdata[1])
			for(jjjj in 2:length(networkdata)) {
				networkstore[,jjjj]<-networkdata[jjjj]
			}
			names(networkstore)<-names(networkdata)
			rownames(networkstore)<-NULL
		}

		if(kkk>1) {
			lastyearstore<-rbind(lastyearstore,lastyear)
			parameterstore<-rbind(parameterstore,Results_to_store)
			optresultsstore<-rbind(optresultsstore,opt_results_to_store)
			monthlystore<-rbind(monthlystore,monthlyfinal)
			inshoreaamassstore<-rbind(inshoreaamassstore,inshoreaamass)
			offshoreaamassstore<-rbind(offshoreaamassstore,offshoreaamass)
			wholeaamassstore<-rbind(wholeaamassstore,wholeaamass)
			inshoreannualfluxstore<-rbind(inshoreannualfluxstore,inshore_annual_flux)
			offshoreannualfluxstore<-rbind(offshoreannualfluxstore,offshore_annual_flux)
			wholeannualfluxstore<-rbind(wholeannualfluxstore,whole_annual_flux)
			networkstore<-rbind(networkstore,networkdata)
		}



#----------------------------------------------------------------------------------------------------------

		#Plot or update the baseline and time series of sample likelihoods so far....
		if(runtime.plot==TRUE){
		if(kkk>1){

			if(kkk==2 | newglims==1){
			par(mfrow=c(1,1))
			plot(c(1,n_iter),rep(optresultsstore$likelihood[1],2),ylim=c(axmin,axmax),xlim=c(1,n_iter),xlab="Iterations",ylab="Target data likelihood",type="l",lwd=3,col="grey")
			points(optresultsstore$iteration,optresultsstore$likelihood,type="p",pch=16,col="black")
                	legend("topleft",c("baseline","samples"),bg="transparent",
				col=c("grey","black"),
				lty=c(1,0),lwd=c(3,0),
				pch=c(NA,20),
				pt.cex=c(0,1))
			}

			if(newglims==0){
			points(optresultsstore$iteration[kkk],optresultsstore$likelihood[kkk],type="p",pch=16,col="black")
			}

		}
		}

#----------------------------------------------------------------------------------------------------------

	}

	message("")
	
	# Now that we have finished with running the model (StrathE2E function- where csv.output is forced to FALSE), we
	# need to re-set it to conform with the settings of the overall MonteCarlo function
	pkg.env$csv.output <- csv.output	# controls writing of CSV files

	#Output the final accumulated output to disc
	# Of all this, the parameter store is potentially useful to be returned as an object by the function. For all the rest, they are saved to files if csv.output=TRUE
	writecsv(parameterstore,		csvname(credpath, "CredInt_cumulative_parameters",identifier),		row.names=FALSE)
	writecsv(lastyearstore,			csvname(credpath, "CredInt_cumulative_lastyear", identifier),		row.names=FALSE)
	writecsv(optresultsstore,		csvname(credpath, "CredInt_cumulative_targetresults",identifier),	row.names=FALSE)
	writecsv(monthlystore,			csvname(credpath, "CredInt_cumulative_monthly",identifier),		row.names=FALSE)
	writecsv(inshoreaamassstore,		csvname(credpath, "CredInt_cumulative_inshoreaamass",identifier),	row.names=FALSE)
	writecsv(offshoreaamassstore,		csvname(credpath, "CredInt_cumulative_offshoreaamass",identifier),	row.names=FALSE)
	writecsv(wholeaamassstore,		csvname(credpath, "CredInt_cumulative_wholeaamass",identifier),		row.names=FALSE)
	writecsv(inshoreannualfluxstore,	csvname(credpath, "CredInt_cumulative_inshoreannualflux",identifier),	row.names=FALSE)
	writecsv(offshoreannualfluxstore,	csvname(credpath, "CredInt_cumulative_offshoreannualflux",identifier),	row.names=FALSE)
	writecsv(wholeannualfluxstore,		csvname(credpath, "CredInt_cumulative_wholeannualflux",identifier),	row.names=FALSE)
	writecsv(networkstore,			csvname(credpath, "CredInt_cumulative_network",identifier),		row.names=FALSE)

	#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


	if(postprocess==TRUE){

	message("Processing raw output to generate credible interval data.... ")

	#Now process the stored outputs to create the credible intervals for all of the model outputs. All are returned as part of a list object and saved to files if csv.output=TRUE
	# csv.output is set individually for each new function call
	# plotting of the cmulative likelihood curves is hard-turned-off within each of these functions. Plotting is useful for diagnostics but for production runs
	# its just a distraction slows down the process a little bit.

	message("Daily mass outputs ...")
	CI_daily <- CredInt_make_daily_results(model, lastyearstore, csv.output=csv.output)

	message("Daily flux outputs ...")
	CI_dflux <- CredInt_make_daily_flux_results(model, lastyearstore, csv.output=csv.output)

	message("Migration outputs ...")
	CI_migr  <- CredInt_make_daily_migration_results(model, lastyearstore, csv.output=csv.output)

	message("Monthly outputs ...")
	CI_month <- CredInt_make_monthly_results(model, monthlystore, csv.output=csv.output)

	message("Annual average mass outputs ...")
	CI_aamass<- CredInt_make_aamass_results(model, inshoreaamassstore, offshoreaamassstore, wholeaamassstore, csv.output=csv.output)

	message("Annual flux outputs ...")
	CI_aflux <- CredInt_make_annual_flux_results(model, inshoreannualfluxstore, offshoreannualfluxstore, wholeannualfluxstore, csv.output=csv.output)

	message("Network index outputs ...")
	CI_netw  <- CredInt_make_network_results(model, networkstore, csv.output=csv.output)

	message("Target data outputs ...")
	CI_targ  <- CredInt_make_target_results(model, optresultsstore, csv.output=csv.output)

	message("Parameter value outputs ...")
	CI_parms <- CredInt_make_parameter_results(model, parameterstore, csv.output=csv.output)

	#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

	output_data <- list(parameter_record = parameterstore,
			    CI_daily_results = CI_daily,
			    CI_monthly_results = CI_month,
			    CI_annual_avmass = CI_aamass,
			    CI_daily_fluxes = CI_dflux,
			    CI_annual_fluxes = CI_aflux,
			    CI_migration_fluxes = CI_migr,
			    CI_network_indices = CI_netw,
			    CI_target_data  = CI_targ,
			    CI_parameter_values = CI_parms)

	message("Processing completed")

	}  # end of if postprocess=TRUE statement

	if(postprocess==FALSE){
	output_data <- NULL
	}

	return(output_data)
}

Try the StrathE2E2 package in your browser

Any scripts or data that you put into this service are public.

StrathE2E2 documentation built on Jan. 23, 2021, 1:07 a.m.