README.md

The soilc.ipcc package

Dr Alasdair Sykes 03/01/2021

Soil carbon modelling in a tidyverse environment

Soil carbon is currently one of the most studied and debated topics in the field of agriculture and climate change. Like much which occurs on the boundary of human management and biophysical systems, it’s also extremely challenging to measure or model accurately. To get practical answers to land managers, academics typically need to rely either on simple, empirically calibrated methods, or on highly complex process-based models. Both methods have drawbacks; the former is imprecise, the latter time-consuming, and both are difficult to extrapolate. Finding some middle ground, the Intergovernmental Panel on Climate Change recently (June 2019) released a simplified, globally calibrated three-pool process-based soil carbon model, along with accompanying empirical methods to estimate the required input parameters. The full details of this model are given on the homepage of the 2019 Refinement to the 2006 IPCC Guidelines for National Greenhouse Gas Inventories.

This package ties all of this information together in an easy-to-use, flexible model implementation which works from start to end of the modelling process — beginning with management information, ending with a soil carbon stock estimate — and which can be run as a seamless part of a tidyverse-based workflow.

This package was imagined and written by Dr Alasdair Sykes (@alasdairsykes). If you use this package in your work, please cite it as:

Sykes, A. J. (2020) soilc.ipcc: An R implementation of the 2019 IPCC Tier 2 steady-state method for soil carbon stock estimation in cropland. Version 1.1.2. Available at https://github.com/aj-sykes92/soilc.ipcc.

DOI

Disclaimer: This package is entirely the creation of its author, and no warranty is given or implied as to its use, or to its fidelity to the modelling framework it represents (with that said, my intention is to reproduce this framework faithfully, so if you notice any errors or omissions, please let me know)!

Installation

This package is hosted on Github and can be installed using the devtools package:

# install.packages("devtools")
devtools::install_github("aj-sykes92/soilc.ipcc")

Note: If this doesn’t work, update devtools, as Github recently changed the term it uses for the primary version of a source code repository.

Usage — run_model

The development process for package aimed to tread the line between making things ‘just work’ and giving the accomplished user enough flexibility to use and modify any part of the model independently. In the spirit of doing the fun stuff first, this section will demonstrate the run_model function — this is a complete, out-of-the-box soil carbon model which will give quick and painless carbon stock change estimates when passed data of the right kind.

To help out here, the package also contains a small dataset of exactly the right kind, called toy_input:

glimpse(toy_input)
## Rows: 10
## Columns: 7
## $ year        <int> 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
## $ climdata    <list> [<tbl_df[12 x 3]>, <tbl_df[12 x 3]>, <tbl_df[12 x 3]>, <…
## $ c_input     <dbl> 5.298971, 4.945811, 5.326737, 4.754702, 4.488324, 4.85921…
## $ lignin_frac <dbl> 0.005905677, 0.006083535, 0.006065450, 0.005974941, 0.005…
## $ n_frac      <dbl> 0.05013271, 0.04963280, 0.05039613, 0.05054946, 0.0500886…
## $ sand_frac   <dbl> 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43
## $ till_type   <chr> "full", "full", "full", "full", "full", "full", "full", "…

It’s an ordinary tibble, with one list column; this contains a list of nested climate data, each of dimensions 3 * 12:

glimpse(toy_input$climdata[[1]])
## Rows: 12
## Columns: 3
## $ temp   <dbl> 14.10027, 17.39389, 17.59273, 15.41833, 15.20850, 16.24473, 17…
## $ precip <dbl> 53.39583, 58.48306, 55.08518, 57.16217, 53.38465, 52.36533, 55…
## $ pet    <dbl> 46.52958, 43.81984, 42.76176, 44.95919, 44.06107, 40.87934, 44…

Passed a dataset of this type, the run_model function will output results with no further input required:

toy_input %>%
  run_model() %>%
  glimpse()
## Rows: 10
## Columns: 14
## $ year           <int> 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, …
## $ c_input        <dbl> 5.298971, 4.945811, 5.326737, 4.754702, 4.488324, 4.85…
## $ lignin_frac    <dbl> 0.005905677, 0.006083535, 0.006065450, 0.005974941, 0.…
## $ n_frac         <dbl> 0.05013271, 0.04963280, 0.05039613, 0.05054946, 0.0500…
## $ sand_frac      <dbl> 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, …
## $ till_type      <chr> "full", "full", "full", "full", "full", "full", "full"…
## $ tfac           <dbl> 0.5189817, 0.4939976, 0.4878877, 0.4758409, 0.4754930,…
## $ wfac           <dbl> 2.223240, 2.175849, 2.226038, 2.194289, 2.171587, 2.20…
## $ climdata       <list> [<tbl_df[12 x 3]>, <tbl_df[12 x 3]>, <tbl_df[12 x 3]>…
## $ active_y       <dbl> 0.1820583, 0.1823888, 0.1944128, 0.1805120, 0.1723055,…
## $ slow_y         <dbl> 1.694165, 1.692363, 1.768588, 1.705763, 1.634969, 1.69…
## $ passive_y      <dbl> 37.39614, 37.39293, 37.40793, 37.40198, 37.38430, 37.3…
## $ total_y        <dbl> 39.27237, 39.26768, 39.37093, 39.28825, 39.19157, 39.2…
## $ c_stock_change <dbl> -0.023740126, -0.004686108, 0.103255479, -0.082681955,…

A great deal of intermediate calculations are generated and dropped when run_model executes with its default arguments; for more information on the model run process, and a better ability to interrogate the results, it’s best to alter the default arguments a bit:

toy_input %>%
  run_model(drop_prelim = FALSE, drop_runin = FALSE) %>%
  glimpse()
## Rows: 11
## Columns: 22
## $ year           <int> NA, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 20…
## $ c_input        <dbl> 4.958704, 5.298971, 4.945811, 5.326737, 4.754702, 4.48…
## $ lignin_frac    <dbl> 0.006005288, 0.005905677, 0.006083535, 0.006065450, 0.…
## $ n_frac         <dbl> 0.05024811, 0.05013271, 0.04963280, 0.05039613, 0.0505…
## $ sand_frac      <dbl> 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, …
## $ till_type      <chr> "full", "full", "full", "full", "full", "full", "full"…
## $ tfac           <dbl> 0.4850891, 0.5189817, 0.4939976, 0.4878877, 0.4758409,…
## $ wfac           <dbl> 2.195574, 2.223240, 2.175849, 2.226038, 2.194289, 2.17…
## $ climdata       <list> [NULL, <tbl_df[12 x 3]>, <tbl_df[12 x 3]>, <tbl_df[12…
## $ tillfac        <dbl> 3.036, 3.036, 3.036, 3.036, 3.036, 3.036, 3.036, 3.036…
## $ alpha          <dbl> 2.528185, 2.701829, 2.521510, 2.715727, 2.424212, 2.28…
## $ k_a            <dbl> 13.69867, 14.84046, 13.82492, 13.96887, 13.42965, 13.2…
## $ k_s            <dbl> 0.6757991, 0.7321271, 0.6820271, 0.6891289, 0.6625271,…
## $ k_p            <dbl> 0.007338187, 0.007949827, 0.007405814, 0.007482929, 0.…
## $ active_y_ss    <dbl> 0.1845569, 0.1820583, 0.1823888, 0.1944128, 0.1805120,…
## $ slow_y_ss      <dbl> 1.711369, 1.687870, 1.691523, 1.802973, 1.673761, 1.59…
## $ passive_y_ss   <dbl> 37.40018, 36.89231, 36.96202, 39.39844, 36.58002, 34.9…
## $ active_y       <dbl> 0.1845569, 0.1820583, 0.1823888, 0.1944128, 0.1805120,…
## $ slow_y         <dbl> 1.711369, 1.694165, 1.692363, 1.768588, 1.705763, 1.63…
## $ passive_y      <dbl> 37.40018, 37.39614, 37.39293, 37.40793, 37.40198, 37.3…
## $ total_y        <dbl> 39.29611, 39.27237, 39.26768, 39.37093, 39.28825, 39.1…
## $ c_stock_change <dbl> 0.000000000, -0.023740126, -0.004686108, 0.103255479, …

You can also choose whether or not to deal with the nested climate data inside or outside of the run_model function. This is useful primarily because the role of climate data varies with model usage — where it might be important to have annually-specific climate datasets for one use-case, an annual average will do for another. This climate data is a multiplying factor for file size, so it’s useful to not be forced to over-inflate a file through repetitions. Here’s how this works:

# with automated climfac calculation
x <- run_model(toy_input) # default calculate_climfacs = TRUE

# with manual climfac calculation
y <- toy_input %>%
  mutate(
    tfac = map_dbl(climdata, ~tfac(.x$temp)),
    wfac = map_dbl(climdata, ~wfac(.x$precip, .x$pet))
    ) %>%
  run_model(calculate_climfacs = FALSE)

Here, x and y are identical datasets. Obviously, the x method is more efficient if you need every climdata object to be different, by the y method separates this operation from run_model in case you want to calculate tfac and wfac first, and repeat them over multiple datasets (say, perhaps, 10 different treatments in the same study site) before running the model.

The run_model function also contains default arguments naming each of the individual parameters it requires from the data object — it’s anticipated these will be ignored 99.9% of the time, but they provide flexibility so you can modify the function behaviour rather than changing your data if you so desire.

Usage — the input data setup functions

The toy_input dataset we’ve been working with thus far contains exactly what the soil carbon model needs to know in order to make an estimate of soil C stocks; however, the type of information contained (e.g. carbon inputs, lignin fractions; run ?toy_input for full descriptions of all the variables) is rarely readily available to most land managers. To help set up the model, this package also contains functions for processing the kind of information which a land manager is likely to know (e.g. crop yields, manure application rates) and converts it into an input that the model expects.

To make this happen, the package has a two-tiered set of functions. The highest tier contains just one function, build_soil_input — this function can take as many arguments as you care to pass it, and it expects the output of a set of second-tier helper functions to form these inputs. These functions are add_crop and add_manure. We will demonstrate these first:

crop1 <- add_crop(crop = "wheat", yield_tha = rep(7.2, 10), frac_remove = 0.3, frac_renew = 1)
manure1 <- add_manure(livestock_type = "beef_cattle", n_rate = rep(100, 10))

str(crop1)
## List of 4
##  $ crop       : chr "wheat"
##  $ yield_tha  : num [1:10] 7.2 7.2 7.2 7.2 7.2 7.2 7.2 7.2 7.2 7.2
##  $ frac_remove: num 0.3
##  $ frac_renew : num 1
##  - attr(*, "class")= chr [1:2] "crop" "list"
str(manure1)
## List of 2
##  $ livestock_type: chr "beef_cattle"
##  $ n_rate        : num [1:10] 100 100 100 100 100 100 100 100 100 100
##  - attr(*, "class")= chr [1:2] "manure" "list"

These list objects are admittedly not very human-readable, but they aren’t supposed to be. When passed to build_soil_input, the function maps an S3 method across the arguments list, and uses the information contained to make an estimate of the key parameters:

build_soil_input(crop1, manure1) %>%
  glimpse()
## Rows: 10
## Columns: 6
## $ om_input     <dbl> 15.71912, 15.71912, 15.71912, 15.71912, 15.71912, 15.719…
## $ c_input      <dbl> 6.675944, 6.675944, 6.675944, 6.675944, 6.675944, 6.6759…
## $ n_input      <dbl> 0.1790305, 0.1790305, 0.1790305, 0.1790305, 0.1790305, 0…
## $ lignin_input <dbl> 0.9939831, 0.9939831, 0.9939831, 0.9939831, 0.9939831, 0…
## $ n_frac       <dbl> 0.01138935, 0.01138935, 0.01138935, 0.01138935, 0.011389…
## $ lignin_frac  <dbl> 0.063234, 0.063234, 0.063234, 0.063234, 0.063234, 0.0632…

The output tibble is of correct format to provide the organic matter elements to a run_model input dataframe. Multiple manure applications — or complex management strategies like cover cropping — can be absorbed since build_soil_input has no expectations as to the number of arguments. Note also that the single-value (scalar) elements specified in add_crop and add_manure are recycled, and the output is 10 rows long, the same as the vector elements. If you pass vectors of different > 1 lengths to build_soil_input, it will pad the end of the shorter vectors with zeroes to enable it to continue, and warn you that you’ve probably/possibly made an error in your inputs.

Usage — levelling up to multiple model runs

Usually when you’re running a model of this type, you want to do it more than once. Whether it’s multiple years, multiple climate scenarios, multiple grid cells or multiple management scenarios, it’s very unlikely you’re going to be one-and-done with your modelling. With this in mind, this package has been designed to work as well as possible with tidyverse’s purrr, and the list-wise and nested-tibble operations this allows. All the functions — and particularly run_model — work well with lists and it is anticipated they will be used in this context.

This vignette is not intended to be a tutorial on purrr (go here for that), so the best way to demonstrate is by example. The header plot for this vignette was built with the toy_input used to demonstrate the package, along with some random sampling and iteration with purrr.

# function build 100-row dataset by replicating toy_input 10x
build_fake_data <- function(data) {
  map(1:10, ~data) %>%
    bind_rows() %>%
    mutate(year = 2001:2100,
           mod_fac = seq(from = 1, to = runif(1, 0.5, 2), length.out = 100) + rnorm(100, 0, 0.02),
           c_input = c_input * mod_fac,
           till_type = ifelse(year > sample(2001:2100, 1), "reduced", till_type)) %>%
    select(-mod_fac)
}

# build 10 fake datasets with random inputs
set.seed(2065)
fake_data <- map(1:10, ~build_fake_data(toy_input))

# map run model across list
output <- map(fake_data, run_model)

# plot
output %>%
  set_names(paste0("run_", 1:10)) %>%
  bind_rows(.id = "run") %>%
  ggplot(aes(x = year, y = total_y, group = run)) +
  geom_line(colour = "darkred",) +
  labs(x = "Year",
       y = expression("Soil C stocks, tonnes C ha"^{-1})) +
  theme_classic()

Package data

The toy_input data used in the above examples is just the tip of the iceberg for package data. This section covers the data utilised in package functions, which is also made available to the user for information and modification if desired.

Soil carbon model parameters

All functions which run under the wrapper function run_model, and this wrapper function itself, make use of a named list object containing model parameters. This object is passed to the final argument of each function, params. The default parameter list can be directly accessed by the user, and contains all parameters and accompanying descriptions:

head(soilc_params, 1)
## $tillfac_ft
## $tillfac_ft$be
## [1] 3.036
## 
## $tillfac_ft$sd
## [1] 0.579
## 
## $tillfac_ft$min
## [1] 1.4
## 
## $tillfac_ft$max
## [1] 4
## 
## $tillfac_ft$desc
## [1] "Tillage disturbance modifier for decay rates under full tillage"
names(soilc_params)
##  [1] "tillfac_ft" "tillfac_rt" "tillfac_nt" "wfacpar1"   "wfacpar2"  
##  [6] "wfacpar3"   "kfaca"      "k3par1"     "k3par2"     "kfacs"     
## [11] "kfacp"      "f1"         "f2"         "f2_ft"      "f2_rt"     
## [16] "f2_nt"      "f3"         "f4par1"     "f4par2"     "f5"        
## [21] "f6"         "f7"         "f8"         "ta"         "tb"        
## [26] "tmax"       "topt"       "sp1"        "sp2"

If the user wishes to modify the default parameters, this list can be used as the basis for a custom model parameter object, and this passed to run_model or any of its core functions.

Soil input function parameters

The setup for the model input data (build_soil_inputs) also requires a number of core datasets. These are made available to the user in the same way, and can be accessed under crop_agrc (above-ground residue coefficients), crop_bgrc (below-ground residue coefficients), crop_fractions (crop nitrogen and lignin fractions) and man_fractions (manure nitrogen and lignin fractions).

The following may also be useful in determining possible arguments for crop and livestock_type in the add_* functions:

names(crop_fractions)
##  [1] "alfalfa"             "barley"              "beans_and_pulses"   
##  [4] "generic_crops_nos"   "grains_nos"          "grass"              
##  [7] "grass_clover_mix"    "maize"               "millet"             
## [10] "n_fixing_forage"     "non_legume_hay"      "non_n_fixing_forage"
## [13] "oats"                "peanut"              "potato"             
## [16] "rice"                "rye"                 "sorghum"            
## [19] "soybean"             "spring_wheat"        "tubers"             
## [22] "wheat"               "winter_wheat"
names(man_fractions)
## [1] "dairy_cattle" "beef_cattle"  "poultry"      "swine"        "horses"      
## [6] "sheep"        "none"

Acknowledgements

Thanks to the Natural Environment Research Council, the Legumes Translated project, and the University of Edinburgh/Scotland’s Rural College MSc programmes for providing the context, projects and requirements which led to the development of much of the code in this package. Colin Gillespie has been an incredible and approachable mentor, and his input to my coding generally has undoubtedly influenced the development of this package. The package is based around tidyverse ideas and functions, so thanks go also to Hadley Wickham and the tidyverse team for building and maintaining this incredible environment.

Contribute

If you would like to contribute to this package, please file an issue, make a pull request on GitHub, or contact the author on Twitter @alasdair_sykes or by email at alasdair.sykes@sruc.ac.uk.



aj-sykes92/soilc.ipcc documentation built on March 19, 2021, 11:52 a.m.