reco.bulk: Bulk fitting of Reco and GPP models

reco.bulk    R Documentation

Bulk fitting of Reco and GPP models

Description

The function allows for bulk fitting of R_eco and GPP models with the respective functions reco and gpp. This is often appropriate because data are gathered over a season, a year or longer...

Usage

reco.bulk(formula, data, INDEX, window = 1, hook = "mean", remove.outliers = FALSE, 
fall.back = TRUE, ...)

gpp.bulk(formula, data, INDEX, window = 1, hook = "mean", oot.id = c("D", "T"), 
min.dp = 5, Reco.m = NULL, ts.Reco = NULL, fall.back = TRUE, ...)

Arguments

formula

An object of class "formula" (or one that can be coerced to that class): a symbolic description of the terms that are used in bulk R_eco and GPP model fitting. Choices of terms are more restricted than typically (see details). For instance, a timestamp always has to be provided. Also, temperature variables are required for gpp.bulk if R_eco values are predicted from models.

data

A data frame (or an object that can be coerced to that class by as.data.frame) containing at least all the 'model' terms specified in formula.

INDEX

A vector of length nrow(data) that is used to extract and compile data, for instance according to measurement campaign in the field. Internally split is used with f = INDEX to create a list of data.frames of which each contains all flux measurements for one model.

window

Number of adjacent INDEX values across which each model is fitted (moving window). Both functions support this. Windows wider than 1 are not advisable for GPP, whereas R_eco modelling often profits because more data points tend to give better models.

hook

Character string specifying the summary statistic used to fix the date and time to which the fitted model refers. Currently this is done by applying one of the following summary statistics to the timestamps: mean, min, max or median.

remove.outliers

Logical. If TRUE the function searches for outliers in the data points of the R_eco models and eliminates them. Per model the boxplot.stats of the residuals are obtained and if outliers are present they are eliminated and the model is fitted again. This is done twice. If the function fails in fitting the model to the new data set it falls back to the original data.

oot.id

Vector of length 2 that specifies which of the flux values derive from opaque (first value, i.e. R_eco measurements) and which derive from transparent (second value, i.e. NEE measurements) chamber measurements when data contains both. This is one of several approaches to GPP modeling here. See details.

min.dp

Numeric. Specifies the minimum number of data points that are accepted per model. Defaults to 5 which is already quite a small number.

Reco.m

Either an object of class "reco" resulting from reco with one R_eco model or an object of class "breco" resulting from reco.bulk with several R_eco models or a vector with estimated half hourly or hourly (or whatever interval you have) R_eco values. In case of the latter ts.Reco has to be specified as well because it is also used as a switch between internal R_eco modeling and assigning existing R_eco values. See details.

ts.Reco

POSIXlt or POSIXct vector with timestamps of the fluxes in Reco.m. Further, the default (ts.Reco = NULL) lets the function expect model object(s) in Reco.m.

fall.back

Logical. When TRUE the function falls back to linear mean models when the nonlinear approach fails (for reco.bulk: the slope of the linear relationship between Reco and temperature is < 0; for gpp.bulk: either no model could be fit or the starting slope parameter alpha is > 0). For the fall-back, a virtual data set is created with 50 random GPP values that have the same mean and sd as the original data and with a sequence of 50 PAR values spanning from 0 to 2000. A linear model is fit to these data with lm(GPP ~ PAR).

...

Further arguments passed to reco or gpp e.g., the method for fitting the model when not using the respective defaults.
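The interplay of INDEX, window and hook can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the toy data, the centred window and the choice to apply the hook to the pooled data are all assumptions made for this sketch.

```r
# Toy data: 4 campaigns with 3 measurements each (all values made up)
set.seed(1)
dat <- data.frame(
  timestamp = as.POSIXct("2010-05-01 10:00", tz = "UTC") +
    rep(0:3, each = 3) * 14 * 86400 + rep(c(0, 1800, 3600), 4),
  flux = runif(12, 1, 5),
  campaign = rep(1:4, each = 3)
)

# INDEX: one data.frame per campaign, as split() with f = INDEX gives
per.campaign <- split(dat, f = dat$campaign)

# window = 3 (ASSUMED to be centred): pool each campaign with its neighbours
window <- 3
half <- (window - 1) / 2
ids <- seq_along(per.campaign)
pooled <- lapply(ids, function(i) {
  sel <- ids[ids >= i - half & ids <= i + half]
  do.call(rbind, per.campaign[sel])
})
names(pooled) <- names(per.campaign)

# hook = "mean": one reference timestamp per pooled data set (assumption:
# the statistic is taken over the pooled timestamps)
hook <- "mean"
ts.ref <- lapply(pooled, function(x) do.call(hook, list(x$timestamp)))

# number of data points available per model
sapply(pooled, nrow)
```

Campaigns at either end of the series have fewer neighbours, so their pooled data sets are smaller than those in the middle.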

Details

Models are - comparable to regression models - specified symbolically. Accordingly, the basic form is response ~ terms with response always referring to CO2 exchange rates. For terms requirements differ between the two methods. In contrast to other formulae the response and all terms have to be in data.

reco.bulk expects a formula of the form Reco ~ T1 + ... + timestamp, with Reco referring to CO2 fluxes estimated from opaque chamber measurements (for instance with flux) and T1 referring to a temperature reading relevant for Reco (e.g. air temperature) taken during the corresponding chamber measurements. The ... symbolizes that any number of further temperature readings can be specified if available (e.g. soil temperature at 2 cm). When more than one temperature is specified, a model is fit for each temperature, the best one is determined via AIC, and it is reported together with the name of the corresponding temperature variable. Finally, timestamp refers to the POSIXt timestamps that represent the dates and times of the corresponding measurements; it always has to be specified as the last term of the formula. Models are fit using reco.
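The AIC-based choice among several temperature variables can be illustrated with a small base R sketch. It assumes made-up data and uses a log-linear lm fit as a stand-in for the models fitted by reco; the variable names t.air and t.soil2 are borrowed from the examples further down, everything else is hypothetical.

```r
set.seed(42)
n <- 30
toy <- data.frame(
  t.air   = runif(n, 5, 25),
  t.soil2 = runif(n, 5, 20)
)
# Made-up fluxes that depend on t.soil2 only
toy$flux <- 0.5 * exp(0.08 * toy$t.soil2) + rnorm(n, sd = 0.05)

# One stand-in model per temperature variable (NOT the reco() model itself)
temps <- c("t.air", "t.soil2")
fits <- lapply(temps, function(tv) {
  lm(as.formula(paste("log(flux) ~", tv)), data = toy)
})

# Keep the model with the lowest AIC and remember which temperature it used
aics <- sapply(fits, AIC)
names(aics) <- temps
which.Temp <- names(which.min(aics))
best <- fits[[which.min(aics)]]
which.Temp
```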

gpp.bulk expects a formula of the form NEE ~ PAR + timestamp + ... with NEE referring to CO2 fluxes estimated based on transparent chamber measurements (for instance with flux), PAR referring to readings of the photosynthetically active radiation relevant for NEE and taken during the corresponding chamber measurements. The ... symbolizes that several more terms can or have to be specified. This depends on the approach to the R_eco part of the GPP modeling (see gpp).

Approaches to estimate GPP values from measured NEE data using corresponding R_eco values:

Approach 1: Extract corresponding R_eco fluxes from the provided data that are assigned to corresponding NEE values via their timestamp: For this approach data has to contain both NEE and R_eco fluxes and the model formula is specified as NEE ~ PAR + timestamp + oot with the latter referring to a variable that indicates whether the respective fluxes were measured as NEE (transparent chamber) or Reco (opaque chamber or low PAR). In addition oot.id may have to be changed accordingly. gpp2 is used for fitting the models.

Approach 2: Provide measured R_eco fluxes that are assigned to corresponding NEE values via their timestamp: To do this, supply the timestamps in ts.Reco (i.e. ts.Reco is not NULL) and a vector of R_eco fluxes in Reco.m, and specify the model with: NEE ~ PAR + timestamp. gpp is used for fitting the models.

Approach 3: Provide one R_eco model to predict R_eco fluxes at the time of the NEE measurements using the same temperature variable that was used to construct the model (with reco). Specify model with: NEE ~ PAR + timestamp + temperature. gpp is used for fitting the models.

Approach 4: Provide several R_eco models to predict R_eco fluxes at the time of the NEE measurements using the same temperature variables that were used to construct the models (with reco.bulk). The corresponding models are assigned to the NEE data via the timestamps that they carry. Specify model with: NEE ~ PAR + timestamp + temperature1 + temperature2 + temperature3 + .... All temperatures that may have been used for fitting the R_eco models (see above) should be given. gpp is used for fitting the models.
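To make the timestamp-based assignment in Approach 2 concrete, here is a hedged base R sketch. It assumes nearest-timestamp matching and the sign convention GPP = NEE - R_eco; neither is confirmed by this page, and all values are made up.

```r
# Half-hourly R_eco series (made up), as it could be passed via Reco.m/ts.Reco
ts.Reco <- seq(as.POSIXct("2010-06-01 00:00", tz = "UTC"),
               by = "30 min", length.out = 48)
Reco.m <- 2 + sin(seq(0, 2 * pi, length.out = 48))

# NEE fluxes from transparent chamber measurements (made up)
nee <- data.frame(
  timestamp = as.POSIXct(c("2010-06-01 09:10", "2010-06-01 12:40",
                           "2010-06-01 15:05"), tz = "UTC"),
  NEE = c(-3.1, -5.4, -2.2)
)

# ASSUMPTION: each NEE value gets the R_eco value closest in time
nearest <- sapply(as.numeric(nee$timestamp), function(t) {
  which.min(abs(as.numeric(ts.Reco) - t))
})
nee$Reco <- Reco.m[nearest]

# ASSUMPTION: GPP is derived as NEE minus R_eco
nee$GPP <- nee$NEE - nee$Reco
nee
```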

remove.outliers may result in better R_eco models. One should be careful with this and watch out for cases in which too many data points are eliminated. The function returns the number of skipped outliers per model to do just that.
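The outlier procedure described for remove.outliers (boxplot.stats on the residuals, refit, at most two passes, keep the previous fit if refitting fails) can be sketched as follows. The data are made up and a log-linear lm fit stands in for the reco model, so this is an illustration of the idea rather than the package's implementation.

```r
set.seed(7)
temp <- runif(40, 5, 25)
flux <- 0.4 * exp(0.09 * temp) * exp(rnorm(40, sd = 0.1))
flux[c(5, 20)] <- flux[c(5, 20)] * 4      # two artificial outliers
d <- data.frame(temp, flux)

fit <- lm(log(flux) ~ temp, data = d)      # stand-in for a reco() fit
n.out <- 0
for (pass in 1:2) {                        # the search is done twice
  out <- boxplot.stats(residuals(fit))$out
  if (length(out) == 0) break              # nothing left to remove
  keep <- !(residuals(fit) %in% out)
  refit <- try(lm(log(flux) ~ temp, data = d[keep, ]), silent = TRUE)
  if (inherits(refit, "try-error")) break  # keep the previous fit and data
  n.out <- n.out + sum(!keep)
  d <- d[keep, ]
  fit <- refit
}
n.out                                      # number of skipped data points
```

Comparing n.out with the original number of data points is a quick way to spot models for which too much was removed.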

If fall.back = TRUE no failed model fits are reported. That's quite useful when further bulk methods like budget.reco or budget.gpp are used to get annual or seasonal budgets.
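The linear mean model used as a fall-back for GPP (see the fall.back argument) can be mimicked in base R. The sketch below assumes that the 50 random values are rescaled to match the mean and sd of the original data exactly; the page does not say how the package generates them, and the observed values here are made up.

```r
set.seed(3)
gpp.obs <- c(-4.2, -6.1, -3.8, -5.5, -4.9)  # made-up GPP values

# 50 random values, rescaled to the mean and sd of the observations
z <- rnorm(50)
z <- (z - mean(z)) / sd(z)
virt <- data.frame(
  GPP = mean(gpp.obs) + sd(gpp.obs) * z,
  PAR = seq(0, 2000, length.out = 50)
)

# Linear model on the virtual data set
fb <- lm(GPP ~ PAR, data = virt)
coef(fb)
```

Because the virtual GPP values are random with respect to PAR, the fitted line is close to flat and predictions stay near the mean of the original data.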

Value

Both functions return complex list structures with models.

Output of reco.bulk: Object of class "breco", a list with length(unique(INDEX)) elements, each containing 3 elements:

ts

Timestamp of the model.

mod

Has itself two elements. The first contains the model object as returned by reco and is named according to the method used. The second, n.out, is optional (only reported when remove.outliers = TRUE and there were indeed outliers identified and skipped) and gives the number of omitted data points.

which.Temp

Character string that identifies the temperature variable that was finally used for constructing the best model.

Output of gpp.bulk: Object of class "bgpp", a list with length(unique(INDEX)) elements each containing itself 2 entries:

ts

Timestamp of the model.

mod

Either an object of class "gpp" or of class "gpp2" depending on the approach used. Approach 1 is fitted with gpp2 and returns "gpp2" objects; Approaches 2 to 4 are fitted with gpp and return "gpp" objects. See gpp and gpp2 for details.
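The nested list layout described above can be explored with ordinary list tools. The following sketch builds a small mock object that only imitates the documented "breco" layout (ts, mod, which.Temp); the element name arr, the lm placeholder models and all values are hypothetical.

```r
# Mock object imitating the documented structure (NOT real reco.bulk output)
mock <- list(
  "1" = list(
    ts = as.POSIXct("2010-05-01 10:30", tz = "UTC"),
    mod = list(arr = lm(dist ~ speed, data = cars), n.out = 2),
    which.Temp = "t.air"
  ),
  "2" = list(
    ts = as.POSIXct("2010-05-15 11:00", tz = "UTC"),
    mod = list(arr = lm(dist ~ speed, data = cars)),  # no outliers skipped
    which.Temp = "t.soil2"
  )
)

# Collect one row per model; n.out is optional, so treat a missing value as 0
overview <- data.frame(
  ts = do.call(c, lapply(mock, `[[`, "ts")),
  which.Temp = sapply(mock, `[[`, "which.Temp"),
  n.out = sapply(mock, function(x) {
    if (is.null(x$mod$n.out)) 0 else x$mod$n.out
  })
)
overview
```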

Author(s)

Gerald Jurasinski, gerald.jurasinski@uni-rostock.de,

with suggestions by Sascha Beetz, sascha.beetz@uni-rostock.de

References

Beetz S, Liebersbach H, Glatzel S, Jurasinski G, Buczko U, Höper H (2013) Effects of land use intensity on the full greenhouse gas balance in an Atlantic peat bog. Biogeosciences 10:1067-1082

See Also

reco, gpp, gpp2, fluxx, modjust

Examples

## Whole example is consecutive and largely marked as
## not run because parts take longer than
## accepted by CRAN incoming checks.
## Remove first hash in each line to run them.
data(amd)
data(amc)

### Reco ###
## do reco models with 3 campaign wide window and 
## outlier removal (outliers according to models)
# first extract opaque (dark) chamber measurements 
amr <- amd[amd$kind=="D",]

## Not run ##
## do bulk fitting of reco models (all specified temperatures 
## are tested and the best model (per campaign) is finally stored)
#r.models <- reco.bulk(flux ~ t.air + t.soil2 + t.soil5 + 
#t.soil10 + timestamp, amr, amr$campaign, window=3, 
#remove.outliers=TRUE, method="arr", min.dp=2)
#
## adjust models (BEWARE: stupid models with t1 >= 20 are skipped 
## within the function, this can be changed)
#r.models <- modjust(r.models, alpha=0.1, min.dp=3)
#
## make data.frame (table) for overview of model parameters
## the temperature with which the best model could be fit is reported
## this information also resides in the model objects in r.models
#tbl8(r.models)
#
#### GPP ###
### fit GPP models using method = Falge and min.dp = 5
### and take opaque (dark, i.e. reco) measurements from data
## the function issues a warning because some campaigns have
## not enough data points
#g.models <- gpp.bulk(flux ~ PAR + timestamp + kind, amd, amd$campaign, 
#method="Falge", min.dp=5)
#tbl8(g.models)
#
### alternative approaches to acknowledge reco when fitting GPP models
## we need only fluxes based on transparent chamber measurements (NEE)
#amg <- amd[amd$kind=="T",]
## fit gpp models and predict reco from models
#g.models.a1 <- gpp.bulk(flux ~ PAR + timestamp + t.air + t.soil2 + 
#t.soil5 + t.soil10, amg, amg$campaign, method="Falge", min.dp=5, 
#Reco.m=r.models)
#tbl8(g.models.a1)
## have a look at the model fits (first 10)
#par(mfrow=c(5,6))
## select only non linear fits
#sel <- sapply(g.models.a1, function(x) class(x$mod$mg)=="nls")
#lapply(g.models.a1[sel][1:10], function(x) plot(x$mod, single.pane=FALSE))
#
## fit gpp models with providing reco data
## to do so, rerun budget.reco with other start and end points
#set.back <- data.frame(timestamp = c("2009-09-01 00:30", "2011-12-31 23:30"), 
#value = c(-999, -9999))
#set.back$timestamp <- strptime(set.back$timestamp, format="%Y-%m-%d %H:%M")
#r.bdgt.a2 <- budget.reco(r.models, amc, set.back)
## now fit the models
#g.models.a2 <- gpp.bulk(flux ~ PAR + timestamp, amg, amg$campaign, 
#method="Falge", units = "30mins", min.dp=5, Reco.m=r.bdgt.a2$reco.flux, 
#ts.Reco = r.bdgt.a2$timestamp)
#tbl8(g.models.a2)
#
## End not run ##


flux documentation built on June 26, 2022, 9:05 a.m.