\linenumbers \doublespacing

library(knitr)
opts_chunk$set(echo = TRUE,include=TRUE,eval=FALSE)
oldSource <- knit_hooks$get("source")
knit_hooks$set(source = function(x, options) {
  x <- oldSource(x, options)
  x <- ifelse(!is.null(options$ref), paste0("\\label{", options$ref,"}", x), x)
  x <- ifelse(!is.null(options$code.cap), paste0("\\caption{", options$code.cap,"}", x), x)
  x <- paste0(raw_latex('\\begin{codechunk}'),
         x,
         raw_latex("\\end{codechunk}"))
  return(x)

})

Introduction

The Decision Support System for Agrotechnology Transfer Cropping Systems Model [DSSAT-CSM; @jones2003] is a widely used crop modeling system with an estimated 2,500 users across 100 countries worldwide [@Koo2016]. The standard user training demonstrates the use of the DSSAT Shell, a Windows-based graphical user interface (GUI) with various utilities for preparing input files, running simulations, and summarizing output. This interface greatly improves the accessibility of the DSSAT-CSM for beginning users. However, most advanced users of DSSAT-CSM have developed ad hoc scripts in various languages/software environments (e.g. R, Python, SAS) to automate various stages in their analysis (J.W. White, personal communication, October 17, 2019). While an ad hoc approach may be sufficient for many applications, a coherent framework for developing modeling workflows would improve the transparency, reproducibility, and productivity of crop modeling research. With such a framework, researchers would save time in the immediate term in the generation and processing of files and in the long term by making their scripts easier to understand by others and themselves retrospectively.

There have been several attempts to provide frameworks for building crop modeling workflows. For example, the apsimr package [@apsimr] was developed as an interface to the Agricultural Production Systems sIMulator [APSIM; @Holzworth2014], a crop modeling system similar to DSSAT-CSM. The package includes functions to create, edit, and run APSIM simulations and analyze outputs from R. Similarly, pyDSSAT is a package that was developed to facilitate the use of DSSAT-CSM within a Python workflow [@pyDSSAT]. The package provides command line interface and GUI tools for manipulating simulation batch files and crop management input files, running DSSAT-CSM simulations and analyzing model outputs. Likewise, jDSSAT [@jDSSAT] is a JavaScript Module that was developed to eventually replace the existing DSSAT Shell. The long-term goal of the effort is to provide support for all input and output types for DSSAT-CSM, but its current implementation has capabilities similar to those of pyDSSAT. Within the R ecosystem, the Dasst package [@Dasst] provides tools to simplify the post-processing of output files for DSSAT-CSM, although no tools are provided for manipulating input files or running the model. While each of the above examples are valuable contributions, all are limited in scope and none constitute a generic framework for developing full crop modeling workflows with DSSAT-CSM. A generic framework would need to provide capabilities for manipulating the full range of required model inputs (e.g. crop management, soil properties, weather data, and genotype-specific parameters), running simulations and post-processing model outputs. Furthermore, none of these frameworks leverage the tidyverse, a set of R packages developed to enhance transparency and reproducibility of analysis based on a common design philosophy, grammar, and set of data structures [@tidyverseManifesto]. The DSSAT R package was developed to provide tools consistent with tidyverse principles that would facilitate preparing required model inputs, executing simulations, and processing and analyzing outputs for DSSAT-CSM. This application note demonstrates the use of the new DSSAT R package for building reproducible crop modeling workflows with DSSAT-CSM.

Installing and loading the DSSAT package {#install_load_sec}

The DSSAT package source code is hosted in an open-source project on Github (\url{https://github.com/palderman/DSSAT}). Source Code \ref{install_load} provides example R code for installing the DSSAT package from either the Github source code repository using the install_github() function or from the Comprehensive R Archive Network (CRAN; \url{https://cran.r-project.org}) using the install.packages() function. Installing from the Github source code repository requires installation of the devtools R package [@devtools]. Either option should install DSSAT and any required dependencies not already installed on the system. Once package installation is complete, the package can be loaded using the library() function (Source Code \ref{install_load}). Full use of the DSSAT package requires an installation of DSSAT-CSM, which can be obtained from the DSSAT Foundation (https://dssat.net/) or by compiling it from source code (https://github.com/DSSAT/dssat-csm-os). When the DSSAT package is loaded, it attempts to locate the local DSSAT-CSM installation and identify the proper executable name. It then prints a start up message indicating what file path was found (if any) and prompting the user to reset the path (by setting the value of the DSSAT.CSM option variable) if the located file path is incorrect. Source Code \ref{install_load} shows two examples for setting the file path to the DSSAT-CSM executable. The first is an example path for a Windows installation. The second example is compatible with a Unix-style operating system (e.g. macOS, Linux, etc). The following sections illustrate use of the most important functions available in the package. However, a complete list of functions can be found in the reference manual on CRAN (\url{https://cran.r-project.org/package=DSSAT}).

# Install DSSAT package from Github source using devtools package
devtools::install_github('https://github.com/palderman/DSSAT')
# Install DSSAT package from CRAN
install.packages('DSSAT')
# Load the DSSAT package
library(DSSAT)
# Example setting DSSAT-CSM path for Windows operating system
options(DSSAT.CSM = 'C:\\DSSAT47\\DSCSM047.EXE')
# Example setting DSSAT-CSM path for Unix-style operating system
options(DSSAT.CSM = '/DSSAT47/dscsm047')

Modifying DSSAT files

library(DSSAT)
library(tidyverse)

The DSSAT package implements a set of functions for reading and writing standard DSSAT file formats including files for cultivar (*.CUL), ecotype (*.ECO), soil (*.SOL), weather (*.WTH), experiment details (FileX), seasonal observed data (FileA), and time-series observed data (FileT). As an example, the function read_sol() reads soil profiles from the standard DSSAT soil file (*.SOL) format. Source Code \ref{modify_sol} shows the use of this function within an example workflow that creates a new soil profile from an existing one and appends it to an existing soil file. The first statement reads the entire contents of the soil file SOIL.SOL, while the second statement reads only the profile identified by the code IB00000001. The output of this function is an object of class DSSAT_tbl, which is an extension of the tibble[itself an extended version of the basic data frame with enhanced functionality defined in the tibble package; @RforDataScience; @tibble], with additional attributes used internally to store information about the original format of the file from which the data came. Some of the original data are converted into list-columns due to the one-to-many relationship between whole-profile and layer-specific data. For example, properties such as albedo (SALB) or runoff curve number (SLRO) have a single value for each profile, but other properties, such as saturation volumetric soil water (SSAT) or bulk density (SBDM), have values for each individual layer within the profile. Storing the layer-specific data as list-columns in the output from read_sol() facilitates reading and combining multiple soil profiles into a single combined tibble.

As an example, suppose one wanted to calculate a new value for SSAT as 95% of pore space estimated from SBDM. One could perform this calculation and replace the former values using the third statement in Source Code \ref{modify_sol}. For readers unfamiliar with the tidyverse-style of R programming [http://tidyverse.org; @tidyverseManual], this example uses the %>% pipe operator to pass the output from one line to the first argument of the function on the following line. Thus, the single_profile tibble is passed to mutate(), in which the PEDON column is assigned the code IBNEW00001 and SSAT column is assigned the new values calculated from SBDM. This example and all following examples presuppose that the tidyverse package has been loaded using the library() function. An alternative formulation that does not use tidyverse-style coding is provided just below (Source Code \ref{modify_sol}). Once these changes have been made, the new profile can be appended to the existing SOIL.SOL by calling the function write_sol() with the append argument set to TRUE (the default value), as shown in the fifth statement in Source Code \ref{modify_sol}. The write_sol() can also be used to write a new soil file or overwrite an existing soil file by setting append to FALSE. Thus, care should be taken to avoid unintentional loss of data.

# Reading all profiles in a file
all_profiles <- read_sol('SOIL.SOL')
# Reading a single profile
single_profile <- read_sol('SOIL.SOL',id_soil = 'IB00000001')
# Renaming the profile and replacing SSAT with new values
#     calculated from SBDM using tidyverse-style coding
new_profile <- single_profile %>% 
  mutate(PEDON='IBNEW00001',
         SSAT=0.95*(2.65-SBDM)/2.65)
# Renaming the profile and replacing SSAT with new values
#     calculated from SBDM without using tidyverse-style coding
new_profile <- single_profile
new_profile$PEDON[1] <- 'IBNEW00001'
new_profile$SSAT[[1]] <- 0.95*(2.65-single_profile$SBDM[[1]])/2.65
# Appending new profile to SOIL.SOL
write_sol(new_profile,'SOIL.SOL',append=TRUE)

Weather data can also be imported into R in a similar way using the read_wth() function. The output of this function is a tibble containing the daily weather data from the DSSAT format weather file (*.WTH). The tibble also contains an attribute called GENERAL in which the general information about the site is stored including, among other details, the long-term average temperature (TAV) and monthly temperature amplitude (AMP). Supposing one had a directory of weather files from multiple years at the same location that were missing the TAV and AMP values, one could calculate these values from the daily data, assign them to the GENERAL attribute for each year, and then re-write the weather data with the new TAV and AMP values. An example workflow for this process is provided in Source Code \ref{modify_wth}. Variations of this workflow could be used to modify values within daily weather data as well to fill missing-data gaps or combine variables from different data sources.

# Generate a list of the weather files
wth_file_list <- list.files(pattern='.WTH')
# Read all weather files into a list of tibbles
all_wth <- wth_file_list %>% 
  map(read_wth)
# Combine all years into a single tibble for summary calculations
combined_wth <- all_wth %>% 
  bind_rows()
# Calculate long-term average temperature (TAV)
tav <- combined_wth %>% 
  summarize(TAV=mean((TMAX+TMIN)/2))
# Calculate monthly temperature amplitude (AMP)
amp <- combined_wth %>% 
  # Extract month from DATE column
  mutate(month = month(DATE)) %>% 
  # Group data by month
  group_by(month) %>% 
  # Calculate monthly means
  summarize(monthly_avg = mean((TMAX+TMIN)/2)) %>% 
  # Calculate AMP as half the difference between minimum and
  #     maximum monthly temperature
  summarize(AMP = (max(monthly_avg)-min(monthly_avg))/2)
# Generate new general information table
general_new <- all_wth[[1]] %>% # use first year as template
  # Extract GENERAL table
  attr('GENERAL') %>% 
  # Replace TAV and AMP with new values
  mutate(TAV=tav$TAV,
         AMP=amp$AMP)
# Store new general information table within each year
for(i in 1:length(all_wth)){
  # Replace general information table
  attr(all_wth[[i]],'GENERAL') <- general_new
}
# Overwrite previous weather files with modified weather data
for(i in 1:length(all_wth)){
  # Write weather file i
  write_wth(all_wth[[i]],wth_file_list[i])
}

The experiment details file format (FileX) is one of the most complex of the DSSAT file formats because it contains a tree-like structure with multiple tables of data that are connected by a combination of one-to-one and one-to-many relationships. At present, no attempt has been made within the DSSAT package to construct a unified relational data structure. Thus, the output of the read_filex() function is a named list of tibbles each element of which corresponds to a section of the FileX. The names of the list correspond to the section names of the FileX. An example workflow for adding an additional irrigation event to the IRRIGATION AND WATER MANAGEMENT section of a FileX is given in Source Code \ref{modify_filex}. The function read_filex() works similarly to the other read_*() functions already discussed. In the second statement, a conditional mutate function mutate_cond() (provided by the DSSAT package) is used to modify only rows that meet the conditions provided in the second argument. In this case, only rows where I equals 1 will be modified. Due to the one-to-many relationship between irrigation level (I) and the application details (IDATE, IROP, and IRVAL), these details are stored as list-columns, hence the data for the new event must be appended using the concatenate function c(). The final statement in Source Code \ref{modify_filex} uses write_filex() to write out the modified experiment details using the same file name as the original file. By using the same name the original file will be replaced by the new file. If this behavior is not desired, a different file name for the FileX may be provided.

# Read in original FileX
file_x <- read_filex('KSAS8101.WHX')
# Add an additional 60 mm irrigation event on 4 May 1982
file_x$`IRRIGATION AND WATER MANAGEMENT` <- 
  # Extract the original IRRIGATION AND WATER MANAGEMENT section
  file_x$`IRRIGATION AND WATER MANAGEMENT` %>% 
  # Modify the IDATE, IROP, and IRVAL columns only where I equals 1
  mutate_cond(I==1,
              IDATE = c(IDATE,as.POSIXct('1982-05-04')),
              IROP  = c(IROP,"IR001"),
              IRVAL = c(IRVAL,60))
# Overwrite original FileX with new values
write_filex(file_x,'KSAS8101.WHX')

Although space considerations preclude providing examples for all file types, similar workflows could be constructed for other file types using the corresponding functions for reading/writing files for cultivar (read_cul() and write_cul()), ecotype (read_eco() and write_eco()), FileA (read_filea() and write_filea()), and FileT (read_filet() and write_filet()).

Running simulations and summarizing output

In addition to modifying input files, the DSSAT package also contains functions for generating simulation batch files, running the model, and reading simulated output. Once the option variable DSSAT.CSM has been set (see Section \ref{install_load_sec}), the user can generate a simulation batch file as illustrated in the third and fourth statements in Source Code \ref{run_dssat}. In the third statement, the user constructs a data frame or tibble with all the necessary columns specified including, among other details, the FileX name and treatment levels to be run. In the fourth statement, the user specifies as many of the columns as are needed to uniquely specify the set of simulations and the remaining columns will be filled with default values. If the file_name argument is not specified, the function will attempt to construct a file name based on the current value of DSSAT.CSM. Once the batch file has been generated, the model can be run using the run_dssat() function. Once simulations have completed, the simulated output can be read using the read_output() function as is demonstrated in Source Code \ref{run_dssat}.

# Generate a DSSAT batch file using a tibble
tibble(FILEX='KSAS8101.WHX', TRTNO=1:6, RP=1, SQ=0, OP=0, CO=0) %>% 
  write_dssbatch()
# Generate a DSSAT batch file with function arguments
write_dssbatch(filex='KSAS8101.WHX',trtno=1:6)
# Run DSSAT-CSM
run_dssat()
# Read seasonal output file
smry <- read_output('Summary.OUT')

The read_output() function can also be used to read daily simulated output and generate publication-quality graphics when combined with functions from the ggplot2 package [@ggplot2] as shown in Source Code \ref{visualize_output}. The first statement reads in the simulated output, converts treatment number (TRNO) to a discrete factor, and filters the output to include only treatments 4 to 6. The second statement reads in observed data from FileT format and subsets to the corresponding treatments. The final statement builds a publication-quality plot using the simulated and observed datasets, the output of which is shown in Figure \ref{fig:visualization}. Further explanation of the functions used to construct the plot can be found in the ggplot2 documentation [@ggplot2].

```r showing observed (points) and simulated (lines) leaf area index over time for 0, 60, and 180 kg N ha^-1^ fertilization rates.",message=FALSE,warning=FALSE,eval=TRUE}

Read daily simulated plant growth output

pgro <- read_output('PlantGro.OUT') %>% # Filter to treatments 4 to 6 filter(TRNO %in% 4:6) %>% # Convert TRNO to a factor and rename to Fertilization Rate mutate(Fertilization Rate=factor(TRNO,labels=c(0,60,180)))

Read time-series observed plant growth data from FileT

filet <- read_filet('KSAS8101.WHT') %>% # Filter to treatments 4 to 6 filter(TRNO %in% 4:6) %>% # Convert TRNO to a factor and rename to Fertilization Rate mutate(Fertilization Rate=factor(TRNO,labels=c(0,60,180))) %>% # Add days after planting (DAP) to observed data left_join(select(pgro,DATE,DAP))

Construct a combined plot with simulated and observed data

ggplot(data=pgro,aes(x=DAP,y=LAID,linetype=Fertilization Rate))+ # Add a line plot for simulated data geom_line()+ # Add observed data as points geom_point(data=filet,aes(shape=Fertilization Rate))+ # Add a custom y-axis label with units ylab(expression(Leaf~Area~Index~"("m^2~m^{-2}")"))+ # Add a custom x-axis label xlab("Days After Planting")+ # Set color theme to black and white theme_bw()+ # Reposition legend theme(legend.position=c(0.15,0.8)) ```

Summary and Future Directions

In summary, the DSSAT R package provides basic functions for reading and writing input files, executing simulations, and reading simulated output files for DSSAT-CSM. These functions can be combined with other R packages to develop robust, reproducible, scientific modeling workflows. The current version of the package provides a foundation for further development of higher-level functionality such as conducting automated sensitivity analysis and parameter estimation, filling gaps in weather data, and estimating soil parameters from pedotransfer functions. Future developments for the package might also include improving the interface for manipulating FileXs, speeding up read and write operations and extending capabilities to include reading and writing species parameter files.

\clearpage

Acknowledgements

This work was supported by the USDA National Institute of Food and Agriculture, Hatch project OKL03023.

References {#references .unnumbered}



palderman/DSSAT documentation built on Feb. 24, 2024, 4:38 a.m.