filter.data: Basic Eddy Covariance Data Filtering
In bigleaf: Physical and Physiological Ecosystem Properties from Eddy Covariance Data

Description

Filters time series of EC data for high-quality values and specified meteorological conditions.

Usage

filter.data(
  data,
  quality.control = TRUE,
  filter.growseas = FALSE,
  filter.precip = FALSE,
  filter.vars = NULL,
  filter.vals.min,
  filter.vals.max,
  NA.as.invalid = TRUE,
  vars.qc = NULL,
  quality.ext = "_qc",
  good.quality = c(0, 1),
  missing.qc.as.bad = TRUE,
  GPP = "GPP",
  doy = "doy",
  year = "year",
  tGPP = 0.5,
  ws = 15,
  min.int = 5,
  precip = "precip",
  tprecip = 0.01,
  precip.hours = 24,
  records.per.hour = 2,
  filtered.data.to.NA = TRUE,
  constants = bigleaf.constants()
)

Arguments

`data`	Data.frame or matrix containing all required input variables in half-hourly or hourly resolution, including year, month, and day information.
`quality.control`	Should quality control be applied? Defaults to `TRUE`.
`filter.growseas`	Should data be filtered for growing season? Defaults to `FALSE`.
`filter.precip`	Should precipitation filtering be applied? Defaults to `FALSE`.
`filter.vars`	Additional variables to be filtered. Vector of type character.
`filter.vals.min`	Minimum values of the variables to be filtered. Numeric vector of the same length as `filter.vars`. Set to `NA` to be ignored.
`filter.vals.max`	Maximum values of the variables to be filtered. Numeric vector of the same length as `filter.vars`. Set to `NA` to be ignored.
`NA.as.invalid`	If `TRUE` (the default) missing data are filtered out (applies to all variables).
`vars.qc`	Character vector indicating the variables for which quality filter should be applied. Ignored if `quality.control = FALSE`.
`quality.ext`	The extension to the variables' names that marks them as quality control variables. Ignored if `quality.control = FALSE`.
`good.quality`	Which values indicate good quality (i.e. not to be filtered) in the quality control (qc) variables? Ignored if `quality.control = FALSE`.
`missing.qc.as.bad`	If quality control variable is `NA`, should the corresponding data point be treated as bad quality? Defaults to `TRUE`. Ignored if `quality.control = FALSE`.
`GPP`	Gross primary productivity (umol m-2 s-1); Ignored if `filter.growseas = FALSE`.
`doy`	Day of year; Ignored if `filter.growseas = FALSE`.
`year`	Year; Ignored if `filter.growseas = FALSE`.
`tGPP`	GPP threshold (fraction of 95th percentile of the GPP time series). Must be between 0 and 1. Ignored if `filter.growseas` is `FALSE`.
`ws`	Window size used for GPP time series smoothing. Ignored if `filter.growseas = FALSE`.
`min.int`	Minimum time interval in days for a given state of growing season. Ignored if `filter.growseas = FALSE`.
`precip`	Precipitation (mm time-1)
`tprecip`	Precipitation threshold used to identify a precipitation event (mm). Ignored if `filter.precip = FALSE`.
`precip.hours`	Number of hours removed following a precipitation event (h). Ignored if `filter.precip = FALSE`.
`records.per.hour`	Number of observations per hour, e.g. 2 for half-hourly data.
`filtered.data.to.NA`	Logical. If `TRUE` (the default), all variables in the input data.frame/matrix are set to `NA` for the time steps where ANY of the `filter.vars` were beyond their acceptable range (as determined by `filter.vals.min` and `filter.vals.max`). If `FALSE`, values are not filtered, and an additional column 'valid' is added to the data.frame/matrix, indicating whether all values of a row did (1) or did not (0) fulfill the filter criteria.
`constants`	frac2percent - conversion between fraction and percent
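
As a rough illustration of how `tGPP`, `ws`, and `min.int` interact, the following base R sketch flags growing-season days in a synthetic daily GPP series. It is a simplified sketch under stated assumptions, not the package's implementation: the synthetic data, the centered moving average, the treatment of the NA edges, and the rule for merging short runs are all choices made for this example.

```r
# Synthetic daily GPP with a summer peak (made-up values, umol m-2 s-1)
doy       <- 1:365
GPP_daily <- pmax(0, 10 * sin(pi * (doy - 90) / 200))

tGPP    <- 0.5   # fraction of the 95th percentile
ws      <- 15    # smoothing window (days)
min.int <- 5     # minimum length of a growing-season state (days)

# Threshold: tGPP times the 95th percentile of the series
threshold <- tGPP * unname(quantile(GPP_daily, probs = 0.95, na.rm = TRUE))

# Centered moving average; the first and last 7 days are NA
smoothed <- as.numeric(stats::filter(GPP_daily, rep(1 / ws, ws), sides = 2))

# 1 = growing season, 0 = outside (NA edges are treated as outside here)
growseas <- as.integer(!is.na(smoothed) & smoothed >= threshold)

# Assumed rule: a run shorter than min.int inherits the preceding state
runs <- rle(growseas)
for (k in seq_along(runs$lengths)) {
  if (k > 1 && runs$lengths[k] < min.int) runs$values[k] <- runs$values[k - 1]
}
growseas <- inverse.rle(runs)

range(doy[growseas == 1])  # first and last growing-season day
```

Lowering `tGPP` widens the flagged period, while a larger `ws` smooths out short dips in GPP before the threshold is applied.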

Details

This routine consists of two parts:

1) Quality control: All variables included in vars.qc are filtered for good-quality data. For each of these variables, a corresponding quality variable named after the variable plus the extension specified in quality.ext must be provided. For time steps where the value of the quality indicator is not included in the argument good.quality, i.e. the quality is not considered 'good', the value of the variable is set to NA.

2) Meteorological filtering: Under certain conditions (e.g. low ustar), the assumptions of the EC method are not fulfilled. Further, some data analyses require certain meteorological conditions, such as periods without rainfall, or active vegetation (growing season, daytime). The filter applied in this second step excludes time periods that do not fulfill the criteria specified in the arguments. More specifically, time periods where one of the variables is lower than its minimum threshold (filter.vals.min) or higher than its maximum threshold (filter.vals.max) are set to NA for all variables. If a threshold is set to NA, it is ignored.
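
The two parts described above can be sketched in base R on a small toy data set. This is a minimal illustration of the logic, not the code used inside filter.data: the data values are made up, and applying NA.as.invalid only to the variables in filter.vars is an assumption of this sketch.

```r
# Toy half-hourly records (all values made up for illustration)
dat <- data.frame(
  Tair  = c(4, 12, 15, 18),
  PPFD  = c(500, 150, 800, 900),
  ustar = c(0.3, 0.4, 0.2, NA),
  LE    = c(30, 60, 110, 140),
  LE_qc = c(0, 2, 1, NA)
)

## Part 1: quality control
vars.qc           <- "LE"
quality.ext       <- "_qc"
good.quality      <- c(0, 1)
missing.qc.as.bad <- TRUE

for (v in vars.qc) {
  qc  <- dat[[paste0(v, quality.ext)]]
  bad <- !(qc %in% good.quality)            # an NA flag is never in good.quality
  if (!missing.qc.as.bad) bad[is.na(qc)] <- FALSE
  dat[[v]][bad] <- NA                       # only this variable is set to NA
}

## Part 2: meteorological filtering
filter.vars     <- c("Tair", "PPFD", "ustar")
filter.vals.min <- c(5, 200, 0.2)
filter.vals.max <- c(NA, NA, NA)            # NA thresholds are ignored
NA.as.invalid   <- TRUE                     # assumed to act on filter.vars only

valid <- rep(TRUE, nrow(dat))
for (i in seq_along(filter.vars)) {
  x <- dat[[filter.vars[i]]]
  too_low  <- !is.na(filter.vals.min[i]) & !is.na(x) & x < filter.vals.min[i]
  too_high <- !is.na(filter.vals.max[i]) & !is.na(x) & x > filter.vals.max[i]
  valid <- valid & !too_low & !too_high
  if (NA.as.invalid) valid <- valid & !is.na(x)
}

filtered <- dat
filtered[!valid, ] <- NA                    # whole rows are set to NA
valid
```

In this toy data only the third record passes: the first fails on Tair, the second on PPFD, and the fourth has a missing ustar. The third record's ustar equals its minimum threshold exactly and is therefore kept.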

Value

If filtered.data.to.NA = TRUE (default), the input data.frame/matrix is returned with all observations that did not fulfill the filter criteria set to NA. If filtered.data.to.NA = FALSE, the input data.frame/matrix is returned with an additional column "valid", which indicates whether all the data of a time step fulfill the filtering criteria (1) or not (0).

Note

The thresholds set with filter.vals.min and filter.vals.max filter all data that are smaller than ("<"), or greater than (">") the specified thresholds. That means if a variable has exactly the same value as the threshold, it will not be filtered. Likewise, tprecip filters all data that are greater than tprecip.
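
The strict comparison for tprecip, and the way precip.hours and records.per.hour combine into a number of removed records, can be sketched as follows. This is an illustrative base R sketch with made-up data; whether the event record itself is removed and exactly how the window is counted are assumptions here, not statements about the package internals.

```r
# Made-up half-hourly precipitation series (mm per record)
precip           <- c(0, 0, 0.1, 0.5, 0, 0, 0, 0)
tprecip          <- 0.1
precip.hours     <- 1
records.per.hour <- 2

n      <- length(precip)
events <- which(precip > tprecip)            # strict ">": the 0.1 record is no event
window <- precip.hours * records.per.hour    # records removed after each event

# Assumption: the event record and the following `window` records are invalid
invalid <- unique(unlist(lapply(events, function(i) i:min(i + window, n))))
valid   <- !(seq_len(n) %in% invalid)
valid
```

Here only the fourth record counts as an event, so records four to six are flagged; the third record, which equals tprecip exactly, does not trigger the filter.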

Variables considered to be of bad quality (as specified by the corresponding quality control variables) will be set to NA by this routine. Data that do not fulfill the filtering criteria are set to NA if filtered.data.to.NA = TRUE. Note that with this option *all* variables of the same time step are set to NA. Alternatively, if filtered.data.to.NA = FALSE, data are not set to NA, and a new column "valid" is added to the data.frame/matrix, indicating whether all values of a row did (1) or did not (0) fulfill the filter criteria.

Examples

# Example of data filtering; data are for a month within the growing season,
# hence growing season is not filtered.
# If filtered.data.to.NA=TRUE, all values of a row are set to NA if one filter
# variable is beyond its bounds. 
DE_Tha_Jun_2014_2 <- filter.data(DE_Tha_Jun_2014,quality.control=FALSE,
                                 vars.qc=c("Tair","precip","H","LE"),
                                 filter.growseas=FALSE,filter.precip=TRUE,
                                 filter.vars=c("Tair","PPFD","ustar"),
                                 filter.vals.min=c(5,200,0.2),
                                 filter.vals.max=c(NA,NA,NA),NA.as.invalid=TRUE,
                                 quality.ext="_qc",good.quality=c(0,1),
                                 missing.qc.as.bad=TRUE,GPP="GPP",doy="doy",
                                 year="year",tGPP=0.5,ws=15,min.int=5,precip="precip",
                                 tprecip=0.1,precip.hours=24,records.per.hour=2,
                                 filtered.data.to.NA=TRUE)

## same, but with filtered.data.to.NA=FALSE
DE_Tha_Jun_2014_3 <- filter.data(DE_Tha_Jun_2014,quality.control=FALSE,
                                 vars.qc=c("Tair","precip","H","LE"),
                                 filter.growseas=FALSE,filter.precip=TRUE,
                                 filter.vars=c("Tair","PPFD","ustar"),
                                 filter.vals.min=c(5,200,0.2),
                                 filter.vals.max=c(NA,NA,NA),NA.as.invalid=TRUE,
                                 quality.ext="_qc",good.quality=c(0,1),
                                 missing.qc.as.bad=TRUE,GPP="GPP",doy="doy",
                                 year="year",tGPP=0.5,ws=15,min.int=5,precip="precip",
                                 tprecip=0.1,precip.hours=24,records.per.hour=2,
                                 filtered.data.to.NA=FALSE)
                                 
# note the additional column 'valid' in DE_Tha_Jun_2014_3.
# To set time steps marked as filtered out (i.e. 0 values in column 'valid') to NA:
DE_Tha_Jun_2014_3[DE_Tha_Jun_2014_3[,"valid"] == 0,] <- NA

bigleaf documentation built on Aug. 22, 2022, 9:09 a.m.
