budget.reco: Predict fluxes from GPP and Reco models and prepare for...
In flux: Flux Rate Calculation from Dynamic Closed Chamber Measurements

budget.reco

R Documentation

Predict fluxes from GPP and Reco models and prepare for summing them up to budgets.

Description

The functions predict fluxes from GPP and R_eco models and prepare the data for summing them up to budgets including feeding in set.back positions and values (e.g. to truncate the output or to acknowledge the harvesting of biomass) and several adjustments and corrections.

Usage

budget.reco(models, new.data, set.back = NULL, time.unit = "extract", 
adjust = TRUE, correct = list(thresh = "get", cvrm = TRUE, wndw = 12, 
intvl = 0.95), return.models = FALSE)

budget.gpp(models, new.data, set.back = NULL, time.unit = "extract", 
adjust = FALSE, correct = list(thresh = "get", cvrm = TRUE, wndw = 12, 
intvl = 0.95), PAR.correct = 100, return.models = FALSE)

Arguments

`models`	List model objects of class `"breco"` or `"bgpp"`
`new.data`	Data from a climate station or other logging facility including timestamps. For R_eco all temperature variables that have been used to construct the models are needed. For GPP PAR values are needed and the respective column must be named "PAR". The timestamp column (`POSIXt`) must be named "timestamp".
`set.back`	Data.frame with two columns, the first a `timestamp` and the second either a `value` of R_eco or GPP, to which fluxes should be set at the date and time in timestamp. Additionally, the second column can contain a value of `-999` or `-9999` for the start or end of the data series during budgeting.
`time.unit`	By default the time intervals between data points in `new.data` are extracted from the `timestamp` vector. However, by setting `time.unit` you can additionally give a number that represents the interval in seconds. The function uses this information to linearly interpolate values via `lips` when some are missing.
`adjust`	Logical. When `TRUE`, predicted fluxes are adjusted by fitting a linear model to the relationship of predicted to measured fluxes and extracting the slope. This slope value is then used to adjust all predicted fluxes by calculating `flux` / `slope`. This is applied to the finished time series of fluxes. For `budget.reco` default is `TRUE` whereas for `budget.gpp` default is `FALSE`. See details.
`correct`	This triggers a further correction: Unreasonably high fluxes or spikes are identified and eliminated. This is applied to the finished time series of fluxes. The argument requires a named list with the entries `thresh`, `cvrm` (logical), `wndw`, and `intvl`. When no correction is wanted set `correct` = `NULL`. See details.
`PAR.correct`	Numeric representing the value of PAR (e.g. PPFD in micromole per squaremeter per second) below which GPP is assumed to be 0. This is applied to the finished time series of fluxes. All predicted GPP fluxes are corrected to 0 according to this value. See details.
`return.models`	Logical. Shall models be returned?

Details

How does it work? Both functions take a list of model objects (of class "breco" for R_eco models and of class "bgpp" for GPP) and predict fluxes via predict.nls. The required variables are taken from new.data. Fluxes are predicted for each model either for the whole time interval in new.data or when set.back is given with start and end configuration (values -999 and -9999) for the defined time period. If the integration period is defined with set.back, the models that are temporally closest to the start and end times are duplicated and get the respective timestamps. With all models, fluxes are predicted forward from the date and time the models are hooked onto (argument hook in reco.bulk and gpp.bulk) up to the next model's date and time as well as backward down to the previous model's date and time. The resulting two flux vectors for the time period between two consecutive models (one based on the first, the other based on the second model) are linearly interpolated (see below). The same is done for the model errors. Errors are extracted from the model objects and assigned to each interpolated flux from one model to the next and to the previous. Then they are linearly interpolated (see below).

Further, IDs are assigned to the weighted fluxes to identify the periods between two models. This is needed later for model error propagation. The fluxes in the time series between the first and the second model in the whole list get ID = 1, the fluxes in the time series between the second and the third model in the whole list get ID = 2, and so on.

How does interpolation work? Linear interpolation of the two vectors (one based on the first model, the other based on the second model) is achieved by calculating weighted means of the integrated fluxes for each date and time with the weights the distances in time to the corresponding model timestamp. Thus, close to the one model timestamp the weighted mean of the two fluxes is almost entirely determined by that model, in the middle of the time series between two model dates both fluxes contribute equally to the mean and close to the other model timestamp the weighted mean of the two fluxes is almost entirely determined by that other model.

Why adjust? It may happen that in sum the various models in a seasonal, annual or even bigger dat set tend to over- or underestimate the measured fluxes. To correct for this, the predicted fluxes can be adjusted as explained above. This typically leads to better overall modelling performance.

Why correct? Especially with locally fitted R_eco models it happens - most often with winter data - that predicted fluxes are much higher than supported by the measurements because of the exponential element in fitting R_eco models. The model itself is fine but may have been fitted to temperature data spanning a relatively small range. If the temperatures in new.data are much higher (in relative terms), then unreasonable high fluxes may result. Such fluxes are identified and eliminated by correct.

How is corrected? thresh specifies a maximum predicted flux allowed. When set to "get" (default) it is determined based on the data that were used to construct the models and represents the highest flux ever measured across all campaigns of the series. cvrm (logical) triggers whether further despiking should be done with wndw the width of the moving window in which the coefficient of variation (cv) is calculated and intvl the probability of the quantile against which cv should be tested. All fluxes with corresponding cv > quantile(cv, intvl) are eliminated.

Why PAR.correct? Because the function not only predicts with all the provided models using new.data but also interpolates linearly between models, it happens that unreasonable GPP values are predicted reflecting photosynthesis under no or very low light conditions. These are just set to plant physiologically sensible 0. PAR.correction is done after inserting set.back(s) for cutting(s).

All data gaps resulting from corrections are then filled with linear interpolation from the values adjacent to the gap via lips.

How set.back is used to factor in cut dates or the like:. When further set.back values are given in addition to the definition of the start and end dates, e.g. to acknowledge for biomass removal when predicting GPP fluxes, two things happen. First, the time series of fluxes resulting from all of the above is changed like this: At the date and time of a cut the flux is set to the value given in set.back and then fluxes are linearly interpolated to the next proper model date by weighted means in the same manner as described above. Second, cut models are defined as linear models and integrated into the model list according to their timestamp. Thus, when run with return.models = TRUE, the updated model list can be used with tbl8 to extract the relevant model parameters.

Value

Both functions return a data.frame (called tbl) containing the predicted values, timestamps, etc. and optionally an object of class "breco" or "bgpp" (called models) containing the final models including the start and end models and any set.back models when set.backs were specified.

For budget.reco tbl has 4 columns

`reco.flux`	Predicted and interpolated fluxes
`reco.se`	Model errors spawned across predicted and interpolated fluxes
`reco.id`	ID that identifies the model periods
`timestamp`	Timestamps

For budget.gpp tbl has 4 columns

`gpp.flux`	Predicted and interpolated fluxes
`gpp.se`	Model errors spawned across predicted and interpolated fluxes
`gpp.id`	ID that identifies the model periods
`timestamp`	Timestamps

Author(s)

Gerald Jurasinski, gerald.jurasinski@uni-rostock.de,

with ideas by Sascha Beetz, sascha.beetz@uni-rostock.de

References

Beetz S, Liebersbach H, Glatzel S, Jurasinski G, Buczko U, Hoper H (2013) Effects of land use intensity on the full greenhouse gas balance in an Atlantic peat bog. Biogeosciences 10:1067-1082

Examples

### The examples are consecutive and are a suggestion
### how to run the whole process from bulk modelling
### over corrections and checks to full budgets including
### propagated model and interpolation error terms.

## The whole examples section is marked as
## not run because parts take longer than
## accepted by CRAN incoming checks.
## Remove first hash in each line to run them.

## Not run ##
## load data
#data(amd)
#data(amc)
## set global conversion factor
## All fluxes are in micromole * m-2 * s-1,
## thus they are transformed to g CO2-C * m-2 *1/2h
#uf <- 12*60*30/1000000
#
#### Reco ### (for details see reco.bulk)
## extract opaque (dark) chamber measurements 
#amr <- amd[amd$kind=="D",]
## fit reco models
#r.models <- reco.bulk(flux ~ t.air + t.soil2 + t.soil5 + 
#t.soil10 + timestamp, amr, amr$campaign, window=3, 
#remove.outliers=TRUE, method="arr", min.dp=2)
## adjust models
#r.models <- modjust(r.models, alpha=0.1, min.dp=3)
#
### prepare Reco budget (predict half hourly values for two years)
## define set.back with start and end dates only
## (typically you would take this from read in table)
#set.back <- data.frame(timestamp = c("2010-01-01 00:30", "2011-12-31 23:30"), 
#value = c(-999, -9999))
#set.back$timestamp <- strptime(set.back$timestamp, format="%Y-%m-%d %H:%M", tz="GMT")
## run budget function with defaults
#r.bdgt <- budget.reco(r.models, amc, set.back)
## prepare for quality check (global model of predicted ~ measured)
#r.check <- checkm(r.bdgt, amr)
## have a look
#par(pty="s")
#lims <- range(r.check$reco.flux, r.check$flux)
#plot(reco.flux ~ flux, data=r.check, xlim=lims, ylim=lims)
#abline(coef=c(0,1), lty=3)
#mf1 <- lm(reco.flux ~ flux, r.check)
#abline(mf1)
#summary(mf1)
## calculate daily values (better for plotting)
#r.bdgt$day <- format(r.bdgt$timestamp, format="%Y-%m-%d")
#r.daily <- data.frame(day = unique(r.bdgt$day))
#r.daily$day <- as.POSIXct(strptime(r.daily$day, format="%Y-%m-%d", 
#tz="GMT"), tz="GMT")
## in addition to summing up per day, change unit
#r.daily$reco <- tapply(r.bdgt$reco.flux*uf, r.bdgt$day, sum)
## same for the model error terms
#r.daily$se <- tapply(r.bdgt$reco.se*uf, r.bdgt$day, sum)
#
#
#### GPP ### (for details see gpp.bulk)
#g.models <- gpp.bulk(flux ~ PAR + timestamp + kind, amd, amd$campaign, 
#method="Falge", min.dp=5)
#
### prepare GPP budget (predict half hourly values for two years)
## define set.back with start, end, and cut dates
## (typically you would take this from read in table)
#set.back <- data.frame(timestamp = c("2010-01-01 00:30", "2011-12-31 23:30", 
#"2010-07-22 12:00", "2010-09-03 12:00", "2010-10-13 12:00", "2011-06-29 12:00", 
#"2011-08-11 12:00", "2011-10-21 12:00"), value = c(-999, -9999, rep(-0.0001, 6)))
#set.back$timestamp <- strptime(set.back$timestamp, format="%Y-%m-%d %H:%M", tz="GMT")
## have a look at the resulting data.frame
#set.back
## run budget function with correct = NULL and return the models
#g.bdgt <- budget.gpp(g.models, amc, set.back, correct=NULL, return.models=TRUE)
## the cut models are also returned:
#tbl8(g.bdgt$models)
## extract the half hourly values to proceed
#g.bdgt <- g.bdgt$tbl
## make daily values (better for plotting)
#g.bdgt$day <- format(g.bdgt$timestamp, format="%Y-%m-%d")
#g.daily <- data.frame(day = unique(g.bdgt$day))
#g.daily$day <- as.POSIXct(strptime(g.daily$day, format="%Y-%m-%d", tz="GMT"), tz="GMT")
## in addition to summing up per day, change unit
#g.daily$gpp <- tapply(g.bdgt$gpp.flux*uf, g.bdgt$day, sum)
## same for the model error terms
#g.daily$se <- tapply(g.bdgt$gpp.se*uf, g.bdgt$day, sum)
#
#
#### Budgets ###
### doing the actual budgeting
## first bring Reco and GPP budget data together
## because of different handling data sets
## may be of different length, therefore use merge
#r.bdgt$ts <- as.character(r.bdgt$timestamp)
#g.bdgt$ts <- as.character(g.bdgt$timestamp)
#bdgt <- merge(r.bdgt, g.bdgt[, -c(ncol(g.bdgt)-2, ncol(g.bdgt)-1)], 
#by.x="ts", by.y="ts", all.x=TRUE)
## calculate NEE
#bdgt$nee.flux <- bdgt$reco.flux + bdgt$gpp.flux
## error propagation 
#bdgt$nee.se <- sqrt(bdgt$reco.se^2 + bdgt$gpp.se^2)/sqrt(2)
## define unique id that spans across reco and gpp ids
#bdgt$nee.id <- paste(bdgt$reco.id, bdgt$gpp.id, sep=".")
## do budgets of fluxes (sum and use global conversion factor, see above) and error terms
## the model errors are summed up per model id and resulting
## sums are combined following error propagation
#with(bdgt, {c(
#  reco = sum(reco.flux*uf, na.rm=TRUE), 
#	reco.me = sqrt(sum(tapply(reco.se, reco.id, sum)^2))*uf, 
#	gpp = sum(gpp.flux*uf, na.rm=TRUE), 
#	gpp.me = sqrt(sum(tapply(gpp.se, gpp.id, sum)^2))*uf, 
#	nee = sum(nee.flux*uf, na.rm=TRUE), 
#	nee.me = sqrt(sum(tapply(nee.se, nee.id, sum, na.rm=TRUE)^2))*uf
#)})
#
### annual budget incl. interpolation error
#set.back <- data.frame(timestamp = c("2010-01-01 00:30", 
#"2010-12-31 23:30"), value = c(-999, -9999))
#set.back$timestamp <- strptime(set.back$timestamp, format="%Y-%m-%d %H:%M", tz="GMT")
### reco.ie
## redoing budget with annual bounds
#r.bdgt.2010 <- budget.reco(r.models, amc, set.back, return.models=TRUE)
## run budget.ie with lo = 3 and it = 10 (default of 100 is advisable but slow)
#r.ie2010 <- budget.ie(r.bdgt.2010, lo=3, it=10) 
## summary statistic and factor in the uf
#r.ie2010 <- sd(r.ie2010*uf) 
## gpp.ie
## redoing budget with annual bounds
#g.bdgt.2010 <- budget.gpp(g.models, amc, set.back, correct=NULL, return.models=TRUE)
#g.ie2010 <- budget.ie(g.bdgt.2010, lo=3, it=10) 
#g.ie2010 <- sd(g.ie2010*uf) 
#
## do the actual budgeting
#tmp <- bdgt[(bdgt$timestamp >= set.back$timestamp[1]) & 
#(bdgt$timestamp <= set.back$timestamp[2]),]
#with(tmp, {c(
#  reco = sum(reco.flux*uf, na.rm=TRUE), 
#	reco.me = sqrt(sum(tapply(reco.se, reco.id, sum)^2))*uf,
#	reco.ie = r.ie2010,
#	gpp = sum(gpp.flux*uf, na.rm=TRUE), 
#	gpp.me = sqrt(sum(tapply(gpp.se, gpp.id, sum)^2))*uf,
#	gpp.ie = g.ie2010, 
#	nee = sum(nee.flux*uf, na.rm=TRUE), 
#	nee.me = sqrt(sum(tapply(nee.se, nee.id, sum, na.rm=TRUE)^2))*uf,
#	nee.ie = sqrt(r.ie2010^2 + g.ie2010^2)
#)})
#
## End not run ##

flux documentation built on June 26, 2022, 9:05 a.m.