soilc.ipcc
packageDr Alasdair Sykes 03/01/2021
tidyverse
environmentSoil carbon is currently one of the most studied and debated topics in the field of agriculture and climate change. Like much which occurs on the boundary of human management and biophysical systems, it’s also extremely challenging to measure or model accurately. To get practical answers to land managers, academics typically need to rely either on simple, empirically calibrated methods, or on highly complex process-based models. Both methods have drawbacks; the former is imprecise, the latter time-consuming, and both are difficult to extrapolate. Finding some middle ground, the Intergovernmental Panel on Climate Change recently (June 2019) released a simplified, globally calibrated three-pool process-based soil carbon model, along with accompanying empirical methods to estimate the required input parameters. The full details of this model are given on the homepage of the 2019 Refinement to the 2006 IPCC Guidelines for National Greenhouse Gas Inventories.
This package ties all of this information together in an easy-to-use,
flexible model implementation which works from start to end of the
modelling process — beginning with management information, ending with a
soil carbon stock estimate — and which can be run as a seamless part of
a tidyverse
-based workflow.
This package was imagined and written by Dr Alasdair Sykes (@alasdairsykes). If you use this package in your work, please cite it as:
Sykes, A. J. (2020) soilc.ipcc
: An R implementation of the 2019 IPCC
Tier 2 steady-state method for soil carbon stock estimation in cropland.
Version 1.1.2. Available at https://github.com/aj-sykes92/soilc.ipcc.
Disclaimer: This package is entirely the creation of its author, and no warranty is given or implied as to its use, or to its fidelity to the modelling framework it represents (with that said, my intention is to reproduce this framework faithfully, so if you notice any errors or omissions, please let me know)!
This package is hosted on Github and can be installed using the
devtools
package:
# install.packages("devtools")
devtools::install_github("aj-sykes92/soilc.ipcc")
Note: If this doesn’t work, update
devtools
, as Github recently changed the term it uses for the primary version of a source code repository.
run_model
The development process for package aimed to tread the line between
making things ‘just work’ and giving the accomplished user enough
flexibility to use and modify any part of the model independently. In
the spirit of doing the fun stuff first, this section will demonstrate
the run_model
function — this is a complete, out-of-the-box soil
carbon model which will give quick and painless carbon stock change
estimates when passed data of the right kind.
To help out here, the package also contains a small dataset of exactly
the right kind, called toy_input
:
glimpse(toy_input)
## Rows: 10
## Columns: 7
## $ year <int> 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
## $ climdata <list> [<tbl_df[12 x 3]>, <tbl_df[12 x 3]>, <tbl_df[12 x 3]>, <…
## $ c_input <dbl> 5.298971, 4.945811, 5.326737, 4.754702, 4.488324, 4.85921…
## $ lignin_frac <dbl> 0.005905677, 0.006083535, 0.006065450, 0.005974941, 0.005…
## $ n_frac <dbl> 0.05013271, 0.04963280, 0.05039613, 0.05054946, 0.0500886…
## $ sand_frac <dbl> 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43
## $ till_type <chr> "full", "full", "full", "full", "full", "full", "full", "…
It’s an ordinary tibble, with one list column; this contains a list of nested climate data, each of dimensions 3 * 12:
glimpse(toy_input$climdata[[1]])
## Rows: 12
## Columns: 3
## $ temp <dbl> 14.10027, 17.39389, 17.59273, 15.41833, 15.20850, 16.24473, 17…
## $ precip <dbl> 53.39583, 58.48306, 55.08518, 57.16217, 53.38465, 52.36533, 55…
## $ pet <dbl> 46.52958, 43.81984, 42.76176, 44.95919, 44.06107, 40.87934, 44…
Passed a dataset of this type, the run_model
function will output
results with no further input required:
toy_input %>%
run_model() %>%
glimpse()
## Rows: 10
## Columns: 14
## $ year <int> 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, …
## $ c_input <dbl> 5.298971, 4.945811, 5.326737, 4.754702, 4.488324, 4.85…
## $ lignin_frac <dbl> 0.005905677, 0.006083535, 0.006065450, 0.005974941, 0.…
## $ n_frac <dbl> 0.05013271, 0.04963280, 0.05039613, 0.05054946, 0.0500…
## $ sand_frac <dbl> 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, …
## $ till_type <chr> "full", "full", "full", "full", "full", "full", "full"…
## $ tfac <dbl> 0.5189817, 0.4939976, 0.4878877, 0.4758409, 0.4754930,…
## $ wfac <dbl> 2.223240, 2.175849, 2.226038, 2.194289, 2.171587, 2.20…
## $ climdata <list> [<tbl_df[12 x 3]>, <tbl_df[12 x 3]>, <tbl_df[12 x 3]>…
## $ active_y <dbl> 0.1820583, 0.1823888, 0.1944128, 0.1805120, 0.1723055,…
## $ slow_y <dbl> 1.694165, 1.692363, 1.768588, 1.705763, 1.634969, 1.69…
## $ passive_y <dbl> 37.39614, 37.39293, 37.40793, 37.40198, 37.38430, 37.3…
## $ total_y <dbl> 39.27237, 39.26768, 39.37093, 39.28825, 39.19157, 39.2…
## $ c_stock_change <dbl> -0.023740126, -0.004686108, 0.103255479, -0.082681955,…
A great deal of intermediate calculations are generated and dropped when
run_model
executes with its default arguments; for more information on
the model run process, and a better ability to interrogate the results,
it’s best to alter the default arguments a bit:
toy_input %>%
run_model(drop_prelim = FALSE, drop_runin = FALSE) %>%
glimpse()
## Rows: 11
## Columns: 22
## $ year <int> NA, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 20…
## $ c_input <dbl> 4.958704, 5.298971, 4.945811, 5.326737, 4.754702, 4.48…
## $ lignin_frac <dbl> 0.006005288, 0.005905677, 0.006083535, 0.006065450, 0.…
## $ n_frac <dbl> 0.05024811, 0.05013271, 0.04963280, 0.05039613, 0.0505…
## $ sand_frac <dbl> 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, 0.43, …
## $ till_type <chr> "full", "full", "full", "full", "full", "full", "full"…
## $ tfac <dbl> 0.4850891, 0.5189817, 0.4939976, 0.4878877, 0.4758409,…
## $ wfac <dbl> 2.195574, 2.223240, 2.175849, 2.226038, 2.194289, 2.17…
## $ climdata <list> [NULL, <tbl_df[12 x 3]>, <tbl_df[12 x 3]>, <tbl_df[12…
## $ tillfac <dbl> 3.036, 3.036, 3.036, 3.036, 3.036, 3.036, 3.036, 3.036…
## $ alpha <dbl> 2.528185, 2.701829, 2.521510, 2.715727, 2.424212, 2.28…
## $ k_a <dbl> 13.69867, 14.84046, 13.82492, 13.96887, 13.42965, 13.2…
## $ k_s <dbl> 0.6757991, 0.7321271, 0.6820271, 0.6891289, 0.6625271,…
## $ k_p <dbl> 0.007338187, 0.007949827, 0.007405814, 0.007482929, 0.…
## $ active_y_ss <dbl> 0.1845569, 0.1820583, 0.1823888, 0.1944128, 0.1805120,…
## $ slow_y_ss <dbl> 1.711369, 1.687870, 1.691523, 1.802973, 1.673761, 1.59…
## $ passive_y_ss <dbl> 37.40018, 36.89231, 36.96202, 39.39844, 36.58002, 34.9…
## $ active_y <dbl> 0.1845569, 0.1820583, 0.1823888, 0.1944128, 0.1805120,…
## $ slow_y <dbl> 1.711369, 1.694165, 1.692363, 1.768588, 1.705763, 1.63…
## $ passive_y <dbl> 37.40018, 37.39614, 37.39293, 37.40793, 37.40198, 37.3…
## $ total_y <dbl> 39.29611, 39.27237, 39.26768, 39.37093, 39.28825, 39.1…
## $ c_stock_change <dbl> 0.000000000, -0.023740126, -0.004686108, 0.103255479, …
You can also choose whether or not to deal with the nested climate data
inside or outside of the run_model
function. This is useful primarily
because the role of climate data varies with model usage — where it
might be important to have annually-specific climate datasets for one
use-case, an annual average will do for another. This climate data is a
multiplying factor for file size, so it’s useful to not be forced to
over-inflate a file through repetitions. Here’s how this works:
# with automated climfac calculation
x <- run_model(toy_input) # default calculate_climfacs = TRUE
# with manual climfac calculation
y <- toy_input %>%
mutate(
tfac = map_dbl(climdata, ~tfac(.x$temp)),
wfac = map_dbl(climdata, ~wfac(.x$precip, .x$pet))
) %>%
run_model(calculate_climfacs = FALSE)
Here, x
and y
are identical datasets. Obviously, the x
method is
more efficient if you need every climdata
object to be different, by
the y
method separates this operation from run_model
in case you
want to calculate tfac
and wfac
first, and repeat them over multiple
datasets (say, perhaps, 10 different treatments in the same study site)
before running the model.
The run_model
function also contains default arguments naming each of
the individual parameters it requires from the data
object — it’s
anticipated these will be ignored 99.9% of the time, but they provide
flexibility so you can modify the function behaviour rather than
changing your data if you so desire.
The toy_input
dataset we’ve been working with thus far contains
exactly what the soil carbon model needs to know in order to make an
estimate of soil C stocks; however, the type of information contained
(e.g. carbon inputs, lignin fractions; run ?toy_input
for full
descriptions of all the variables) is rarely readily available to most
land managers. To help set up the model, this package also contains
functions for processing the kind of information which a land manager is
likely to know (e.g. crop yields, manure application rates) and converts
it into an input that the model expects.
To make this happen, the package has a two-tiered set of functions. The
highest tier contains just one function, build_soil_input
— this
function can take as many arguments as you care to pass it, and it
expects the output of a set of second-tier helper functions to form
these inputs. These functions are add_crop
and add_manure
. We will
demonstrate these first:
crop1 <- add_crop(crop = "wheat", yield_tha = rep(7.2, 10), frac_remove = 0.3, frac_renew = 1)
manure1 <- add_manure(livestock_type = "beef_cattle", n_rate = rep(100, 10))
str(crop1)
## List of 4
## $ crop : chr "wheat"
## $ yield_tha : num [1:10] 7.2 7.2 7.2 7.2 7.2 7.2 7.2 7.2 7.2 7.2
## $ frac_remove: num 0.3
## $ frac_renew : num 1
## - attr(*, "class")= chr [1:2] "crop" "list"
str(manure1)
## List of 2
## $ livestock_type: chr "beef_cattle"
## $ n_rate : num [1:10] 100 100 100 100 100 100 100 100 100 100
## - attr(*, "class")= chr [1:2] "manure" "list"
These list objects are admittedly not very human-readable, but they
aren’t supposed to be. When passed to build_soil_input
, the function
maps an S3 method across the arguments list, and uses the information
contained to make an estimate of the key parameters:
build_soil_input(crop1, manure1) %>%
glimpse()
## Rows: 10
## Columns: 6
## $ om_input <dbl> 15.71912, 15.71912, 15.71912, 15.71912, 15.71912, 15.719…
## $ c_input <dbl> 6.675944, 6.675944, 6.675944, 6.675944, 6.675944, 6.6759…
## $ n_input <dbl> 0.1790305, 0.1790305, 0.1790305, 0.1790305, 0.1790305, 0…
## $ lignin_input <dbl> 0.9939831, 0.9939831, 0.9939831, 0.9939831, 0.9939831, 0…
## $ n_frac <dbl> 0.01138935, 0.01138935, 0.01138935, 0.01138935, 0.011389…
## $ lignin_frac <dbl> 0.063234, 0.063234, 0.063234, 0.063234, 0.063234, 0.0632…
The output tibble is of correct format to provide the organic matter
elements to a run_model
input dataframe. Multiple manure applications
— or complex management strategies like cover cropping — can be
absorbed since build_soil_input
has no expectations as to the number
of arguments. Note also that the single-value (scalar) elements
specified in add_crop
and add_manure
are recycled, and the output is
10 rows long, the same as the vector elements. If you pass vectors of
different > 1 lengths to build_soil_input
, it will pad the end of the
shorter vectors with zeroes to enable it to continue, and warn you that
you’ve probably/possibly made an error in your inputs.
Usually when you’re running a model of this type, you want to do it more
than once. Whether it’s multiple years, multiple climate scenarios,
multiple grid cells or multiple management scenarios, it’s very unlikely
you’re going to be one-and-done with your modelling. With this in mind,
this package has been designed to work as well as possible with
tidyverse
’s purrr
, and the list-wise and nested-tibble operations
this allows. All the functions — and particularly run_model
— work
well with lists and it is anticipated they will be used in this context.
This vignette is not intended to be a tutorial on purrr
(go
here for that), so the best way
to demonstrate is by example. The header plot for this vignette was
built with the toy_input
used to demonstrate the package, along with
some random sampling and iteration with purrr
.
# function build 100-row dataset by replicating toy_input 10x
build_fake_data <- function(data) {
map(1:10, ~data) %>%
bind_rows() %>%
mutate(year = 2001:2100,
mod_fac = seq(from = 1, to = runif(1, 0.5, 2), length.out = 100) + rnorm(100, 0, 0.02),
c_input = c_input * mod_fac,
till_type = ifelse(year > sample(2001:2100, 1), "reduced", till_type)) %>%
select(-mod_fac)
}
# build 10 fake datasets with random inputs
set.seed(2065)
fake_data <- map(1:10, ~build_fake_data(toy_input))
# map run model across list
output <- map(fake_data, run_model)
# plot
output %>%
set_names(paste0("run_", 1:10)) %>%
bind_rows(.id = "run") %>%
ggplot(aes(x = year, y = total_y, group = run)) +
geom_line(colour = "darkred",) +
labs(x = "Year",
y = expression("Soil C stocks, tonnes C ha"^{-1})) +
theme_classic()
The toy_input
data used in the above examples is just the tip of the
iceberg for package data. This section covers the data utilised in
package functions, which is also made available to the user for
information and modification if desired.
All functions which run under the wrapper function run_model
, and this
wrapper function itself, make use of a named list object containing
model parameters. This object is passed to the final argument of each
function, params
. The default parameter list can be directly accessed
by the user, and contains all parameters and accompanying descriptions:
head(soilc_params, 1)
## $tillfac_ft
## $tillfac_ft$be
## [1] 3.036
##
## $tillfac_ft$sd
## [1] 0.579
##
## $tillfac_ft$min
## [1] 1.4
##
## $tillfac_ft$max
## [1] 4
##
## $tillfac_ft$desc
## [1] "Tillage disturbance modifier for decay rates under full tillage"
names(soilc_params)
## [1] "tillfac_ft" "tillfac_rt" "tillfac_nt" "wfacpar1" "wfacpar2"
## [6] "wfacpar3" "kfaca" "k3par1" "k3par2" "kfacs"
## [11] "kfacp" "f1" "f2" "f2_ft" "f2_rt"
## [16] "f2_nt" "f3" "f4par1" "f4par2" "f5"
## [21] "f6" "f7" "f8" "ta" "tb"
## [26] "tmax" "topt" "sp1" "sp2"
If the user wishes to modify the default parameters, this list can be
used as the basis for a custom model parameter object, and this passed
to run_model
or any of its core functions.
The setup for the model input data (build_soil_inputs
) also requires a
number of core datasets. These are made available to the user in the
same way, and can be accessed under crop_agrc
(above-ground residue
coefficients), crop_bgrc
(below-ground residue coefficients),
crop_fractions
(crop nitrogen and lignin fractions) and
man_fractions
(manure nitrogen and lignin fractions).
The following may also be useful in determining possible arguments for
crop
and livestock_type
in the add_*
functions:
names(crop_fractions)
## [1] "alfalfa" "barley" "beans_and_pulses"
## [4] "generic_crops_nos" "grains_nos" "grass"
## [7] "grass_clover_mix" "maize" "millet"
## [10] "n_fixing_forage" "non_legume_hay" "non_n_fixing_forage"
## [13] "oats" "peanut" "potato"
## [16] "rice" "rye" "sorghum"
## [19] "soybean" "spring_wheat" "tubers"
## [22] "wheat" "winter_wheat"
names(man_fractions)
## [1] "dairy_cattle" "beef_cattle" "poultry" "swine" "horses"
## [6] "sheep" "none"
Thanks to the Natural Environment Research Council, the Legumes
Translated project, and the University of Edinburgh/Scotland’s Rural
College MSc programmes for providing the context, projects and
requirements which led to the development of much of the code in this
package. Colin Gillespie has been an
incredible and approachable mentor, and his input to my coding generally
has undoubtedly influenced the development of this package. The package
is based around tidyverse
ideas and functions, so thanks go also to
Hadley Wickham and the tidyverse
team for building and maintaining
this incredible environment.
If you would like to contribute to this package, please file an issue, make a pull request on GitHub, or contact the author on Twitter @alasdair_sykes or by email at alasdair.sykes@sruc.ac.uk.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.