knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

After all the input data has been processed, it needs to be harmonized and combined into a single file that is used as input for one of the downscaling algorithms. The sections below briefly describe the main functions in mapspamc to do this. Almost all functions require only one input, an object with the mapspamc parameters. Note that 'under the hood' these functions trigger many other processes, which automatically load the data that was created in previous steps, perform consistency checks, reformat data from spatial to data table format and, where needed, run algorithms to harmonize the various inputs. This means that some of the functions might take some time to run, in particular if the resolution is set to 30 arc seconds, which considerably increases the size of the model. All the functions print a message to the screen when they have finished so the user knows what is happening.

All the intermediate output is saved in the processed_data/intermediate_output folder. If the user has set solve_level = 1, the functions split the data into administrative level 1 chunks, which are saved in subfolders named after the level 1 administrative unit codes. If solve_level = 0, only one subfolder, named after the country's iso3c code, is created.
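
As a rough illustration of the folder layout described above, the sketch below builds the subfolder paths one would expect for each value of solve_level. The helper expected_subfolders(), the country code "MWI" and the level 1 codes are hypothetical and only serve the example; they are not part of mapspamc.

```r
# Hypothetical helper (not a mapspamc function): list the subfolders expected
# under processed_data/intermediate_output for a given solve_level.
expected_subfolders <- function(solve_level, iso3c, adm1_codes) {
  base <- file.path("processed_data", "intermediate_output")
  if (solve_level == 0) {
    # One subfolder named after the country's iso3c code
    file.path(base, iso3c)
  } else {
    # One subfolder per administrative level 1 unit code
    file.path(base, adm1_codes)
  }
}

# Made-up codes for illustration only
adm1_codes <- c("MW1", "MW2", "MW3")
folders_level0 <- expected_subfolders(0, "MWI", adm1_codes)
folders_level1 <- expected_subfolders(1, "MWI", adm1_codes)

print(folders_level0)
print(folders_level1)
```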

Prepare physical area

prepare_physical_area() combines the three agricultural statistics input files (harvested area, production system shares and cropping intensity) to calculate the physical cropping area for all administrative units.

prepare_physical_area(param)
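
To make the relationship between the three inputs concrete, the sketch below computes physical area for a single administrative unit in base R. It assumes that physical area equals harvested area multiplied by the production system share and divided by the cropping intensity. This is our reading of the description above, not a copy of the package code, and the column names and numbers are made up.

```r
# Made-up statistics for one administrative unit.
# harvested_area is the crop total (ha), repeated for each production system.
stats <- data.frame(
  crop = c("maiz", "maiz", "rice"),
  system = c("S", "I", "I"),
  harvested_area = c(1000, 1000, 400),
  system_share = c(0.8, 0.2, 1.0),
  cropping_intensity = c(1.0, 2.0, 2.0),
  stringsAsFactors = FALSE
)

# Assumed relationship: a crop harvested twice a year on the same plot
# (cropping intensity 2) occupies only half its harvested area physically.
stats$physical_area <- stats$harvested_area * stats$system_share /
  stats$cropping_intensity

print(stats[, c("crop", "system", "physical_area")])
total_physical_area <- sum(stats$physical_area)
```

In this example the irrigated maize occupies 100 ha of land although 200 ha are harvested, because two harvests per year are assumed.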

Prepare cropland

The function prepare_cropland() combines the three synergy cropland components (medium and maximum cropland and cropland ranking maps) into a data table and stores this in a file.

prepare_cropland(param)
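
The reformatting from maps to a data table can be pictured with a toy example in base R. The sketch below flattens three small grids into one table with a row per grid cell and drops cells outside the country. The column names (gridID, cl_med, cl_max, cl_rank) and the use of NA for cells outside the country are assumptions for illustration; the actual function works on spatial raster files and may use different names.

```r
# Toy 2 x 3 grids (cropland area in ha); NA marks cells outside the country.
cl_med_map  <- matrix(c(8, 0, NA, 20, 4, 0), nrow = 2, byrow = TRUE)
cl_max_map  <- matrix(c(10, 0, NA, 25, 5, 0), nrow = 2, byrow = TRUE)
cl_rank_map <- matrix(c(1, 3, NA, 1, 2, 3), nrow = 2, byrow = TRUE)

# Flatten each grid row by row so that all columns share the same cell order.
cropland <- data.frame(
  gridID  = seq_along(cl_max_map),
  cl_med  = as.vector(t(cl_med_map)),
  cl_max  = as.vector(t(cl_max_map)),
  cl_rank = as.vector(t(cl_rank_map))
)

# Keep only grid cells that fall inside the country.
cropland <- cropland[!is.na(cropland$cl_max), ]
print(cropland)
```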

Prepare irrigated area

prepare_irrigated_area() is similar to the function that prepares the cropland as it combines the synergy irrigated area maps (maximum irrigated area and ranking maps) into one file.

prepare_irrigated_area(param)

Harmonize inputs

Before the algorithms in mapspamc can be solved, it is essential to harmonize the physical area information, which is derived from national and subnational statistics, with the synergy cropland and irrigated area maps, which are based on remote sensing information and other spatially explicit data sources. As these data come from different sources, they are not always fully consistent. This would not be a problem if the cropland extent were larger than the physical crop area, meaning there would be enough space to allocate the statistics on the cropland map. Similarly, if the total irrigated area in the irrigated area map were larger than the physical area of the irrigated production systems, the data would fit on the map. Unfortunately, this is often not the case and, without adjustments, the downscaling algorithms would be impossible to solve. In practice, we use 'slack variables' to ensure the model always solves (see Appendix). However, large slacks in the solution signal serious inconsistencies, so we check for inconsistencies and adjust the data already in the model preparation stage.

harmonize_inputs() uses a number of steps to harmonize the various data sources:

In the end, the user has to decide whether the slacks are acceptable. In our opinion, small slacks (measured as a share of total or administrative unit physical crop area) are an acceptable way to deal with inconsistencies. However, if slacks become very large, we recommend scrutinizing the statistics and, where possible, making adjustments. Large slacks often result from data entry errors or overly rigid cropping intensity values. We provide some advice on how to deal with slack in the Appendix.

harmonize_inputs(param)
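
The kind of inconsistency discussed in this section can be checked by hand with a simple comparison. The sketch below is not the harmonization algorithm used by harmonize_inputs(); it only compares, per administrative unit, the physical crop area from the statistics with the available cropland and expresses the shortfall as a share of physical area. The unit codes, the numbers and the 10 percent threshold are made up for illustration.

```r
# Made-up totals per administrative unit (ha).
adm <- data.frame(
  adm_code = c("A1", "A2", "A3"),
  physical_area = c(500, 300, 200),
  max_cropland = c(650, 240, 200),
  stringsAsFactors = FALSE
)

# Area from the statistics that does not fit on the cropland map.
adm$shortfall <- pmax(adm$physical_area - adm$max_cropland, 0)
adm$shortfall_share <- adm$shortfall / adm$physical_area

# Flag units where the shortfall exceeds an arbitrary 10 percent threshold.
flagged <- adm$adm_code[adm$shortfall_share > 0.1]
print(adm)
print(flagged)
```

Units that are flagged in this way are natural candidates for a closer look at the underlying statistics before running the model.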

Prepare priors and scores

prepare_priors_and_scores() creates the priors and the scores for each grid cell. For convenience, the function always creates data tables with both priors and scores, even though only one of them is needed for a given run: min_entropy requires the priors, while max_score requires the scores. In this way, the user can easily test different algorithms without repeating the data pre-processing steps.

Note that the function might take some time to run as it implements three consecutive processes. First, the biophysical suitability and potential yield maps for all production system and crop combinations are loaded and only grid cells that overlap with the cropland extent from the previous step are selected, after which all data is merged into one table and saved. This process also checks that the maps do not contain only zero values and, where needed, replaces the map with that of a substitute crop. This is important because it occasionally happens that the biophysical suitability and potential yield maps indicate zero suitability for a specific crop although the statistics suggest the crop is produced in the country. If we did not correct for this, most scores and priors for this crop would be zero, resulting in an 'uninformed' allocation of the crop, meaning it can be placed anywhere as long as the constraints are satisfied and the objective function (minimization of cross-entropy or maximization of fitness score) is optimized. If all the substitute crops have zero values, a warning is issued. We prepared a list of substitute crops that is stored in the mappings/replace_gaez.sv file. You can modify the list to add other substitute crops if you think these are more appropriate. The only requirement is that the selected crop must be in the list of SPAM crops that is stored in mappings/crop.csv. The second and third processes create data files with the priors and scores using the biophysical suitability and potential yield, among others, as input data.

prepare_priors_and_scores(param)
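
The substitution logic described above can be sketched in a few lines of base R. The function replace_zero_maps() below is a hypothetical helper, not part of mapspamc, and the table layout (one column of suitability values per crop) and the substitution list are assumptions for illustration. For each crop whose values are all zero, it takes the first listed substitute that has non-zero values and issues a warning if there is none.

```r
# Hypothetical helper (not a mapspamc function).
# suit: data frame with one column of suitability values per crop.
# substitutes: named list with candidate substitute crops, in order of preference.
replace_zero_maps <- function(suit, substitutes) {
  original <- suit  # always read substitutes from the unmodified data
  for (crop in names(suit)) {
    if (all(original[[crop]] == 0)) {
      candidates <- substitutes[[crop]]
      has_values <- vapply(
        candidates,
        function(s) any(original[[s]] != 0),
        logical(1)
      )
      usable <- candidates[has_values]
      if (length(usable) > 0) {
        suit[[crop]] <- original[[usable[1]]]
      } else {
        warning("No substitute with non-zero values found for ", crop)
      }
    }
  }
  suit
}

# Made-up suitability values for three grid cells.
suit <- data.frame(
  maiz = c(0.6, 0.8, 0.3),
  sorg = c(0.5, 0.2, 0.7),
  mill = c(0, 0, 0)
)
substitutes <- list(mill = c("sorg", "maiz"))

fixed <- replace_zero_maps(suit, substitutes)
print(fixed)
```

Here the all-zero column for mill is replaced by the values of sorg, the first substitute in the list, while the other columns are left unchanged.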

Combine inputs

Finally, all the inputs, including the harmonized cropland extent, irrigated area extent and statistics, and the priors/scores, are combined in one GAMS gdx file, which is used as input to solve the downscaling algorithm in GAMS. The file contains a number of sets and parameter tables that define the model. Sets describe the dimensions of the model, while parameters contain the data along these dimensions. As part of the process to combine all the inputs, and if relevant, artificial administrative units are created that represent, per crop, the combination of all administrative units for which subnational statistics are missing. These units are added to the list of administrative units from the subnational statistics. The names of these units, stored in the adm_area parameter table, start with the name of the lower level administrative unit that nests the units with missing data, followed by ART and the level for which data is missing, and end with the crop for which data is not available.

combine_inputs(param)
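
The naming rule for artificial administrative units can be illustrated as follows. The helper make_art_name() is hypothetical, and the underscore separator is an assumption, since the text above only specifies the order of the three components. The unit name and crop codes are made up.

```r
# Hypothetical helper (not a mapspamc function): build the name of an
# artificial administrative unit from its three components.
# The "_" separator is an assumption for illustration.
make_art_name <- function(nesting_unit, missing_level, crop) {
  paste(nesting_unit, paste0("ART", missing_level), crop, sep = "_")
}

# Level 1 statistics are missing for two crops within the unit "MWI".
art_names <- make_art_name("MWI", 1, c("maiz", "rice"))
print(art_names)
```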


michielvandijk/mapspamc documentation built on April 17, 2025, 7:31 p.m.