calculate_statstable: Calculates stats described in a table

View source: R/statstable.R

calculate_statstable    R Documentation

Calculates stats described in a table

Description

Calculates many statistics defined in a table, with the following caveats:

  • series are always padded to complete years

  • the same data threshold is used for all calculations

  • to h8gl: only the mean from h1 is supported

  • max_gap is only applied in calculations to y1 and is defined in days

  • the use of default_statistic and _inputs_ can lead to surprises in the result

All statistics are defined in a table with the columns "parameter", "statistic", "from" and "to". Each row contains one statistic for one parameter with a basis interval ("from") and a target interval ("to"). The rows are grouped first by "from", then by "to" and then by "parameter". This results in a list of statistics for each parameter. This list is compatible with resample(). If no default_statistic is defined, default_statistic = "drop" is added.

Multi-step statistics are possible. _inputs_ can be used as a substitute for default_statistic in multi-step calculations if the input in "from" already contains calculated statistics.

The statstable can be written in a compact form with comma-separated values in each cell. The table is expanded so that every value gets its own row. See statstable_expand().
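The expansion and grouping described above can be pictured with a small base R sketch. It is only an illustration under assumptions: expand_column() is a hypothetical helper, not the package's statstable_expand(), and the nested list built with split() is one possible representation of "a list of statistics for each parameter", not necessarily the structure passed to resample().

```r
# Hypothetical helper (not part of rOstluft): give every comma-separated
# value in one column its own row.
expand_column <- function(tbl, column, sep = "\\s*,\\s*") {
  values <- strsplit(tbl[[column]], sep)
  out <- tbl[rep(seq_len(nrow(tbl)), lengths(values)), , drop = FALSE]
  out[[column]] <- unlist(values)
  rownames(out) <- NULL
  out
}

# A compact table including one multi-step statistic (input -> h1 -> y1)
compact <- data.frame(
  parameter = c("SO2, NO2", "O3", "O3"),
  statistic = c("mean, perc95", "mean", "n>120"),
  from = c("input", "input", "h1"),
  to = c("y1", "h1", "y1"),
  stringsAsFactors = FALSE
)

# Expand the parameter column first, then the statistic column
expanded <- expand_column(expand_column(compact, "parameter"), "statistic")

# Group by "from", then "to", then "parameter";
# each leaf holds the statistics for one parameter
grouped <- lapply(split(expanded, expanded$from), function(by_from) {
  lapply(split(by_from, by_from$to), function(by_to) {
    split(by_to$statistic, by_to$parameter)
  })
})

grouped$input$y1
grouped$h1$y1
```

Here the row "SO2, NO2" with "mean, perc95" becomes four rows, one per combination of parameter and statistic. Whether statstable_expand() combines the values in exactly this way is an assumption of the sketch.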

Usage

calculate_statstable(
  data,
  statstable,
  sep = "\\s*,\\s*",
  keep_input = FALSE,
  data_thresh = 0.8,
  max_gap = 10,
  order = c("input", "h1", "h8gl", "d1", "m1", "y1")
)

Arguments

data

input data in rolf format

statstable

description of statistics to calculate in table form

sep

separator for combined values in statstable

keep_input

should the input data be kept in the returned list as the item "input"? Default FALSE

data_thresh

minimum data capture threshold to use, between 0 and 1.0. Default 0.8

max_gap

maximum gap in days. Only used in calculations to y1. Set to NULL to disable. Default 10 days

order

defines the order of calculation in the from column
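As a rough illustration of what a data capture threshold such as data_thresh means, the following base R sketch returns a mean only if a sufficient fraction of the values is present. mean_with_threshold() is a hypothetical helper, not the package's implementation. How rOstluft counts the expected number of values (for example after padding) and how it treats the exact boundary are not stated here and are assumptions of the sketch.

```r
# Hypothetical helper (not part of rOstluft): mean of x, but only if the
# fraction of non-missing values reaches the threshold.
mean_with_threshold <- function(x, data_thresh = 0.8) {
  capture <- mean(!is.na(x))  # fraction of available values
  if (capture < data_thresh) {
    return(NA_real_)
  }
  mean(x, na.rm = TRUE)
}

enough <- mean_with_threshold(c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA))  # 90 % available
too_few <- mean_with_threshold(c(1, 2, NA, NA, NA))              # 40 % available
```

With the default threshold of 0.8, the first call yields a mean and the second yields NA, because only 40 % of its values are available.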

Value

list with one item for every "to" interval

Examples

# calculate LRV statistics
lrv_table <- tibble::tribble(
  ~parameter, ~statistic, ~from, ~to,
  "SO2, NO2, PM10", "mean", "input", "y1",
  "SO2, NO2", "perc95", "input", "y1",
  "O3", "perc98", "input", "m1",
  "O3", "mean", "input", "h1",
  "O3", "n>120", "h1", "y1",
  "SO2, NO2, CO, PM10", "mean", "input", "d1",
  "SO2", "n>100", "d1", "y1",
  "NO2", "n>80", "d1", "y1",
  "CO", "n>8", "d1", "y1",
  "PM10", "n>50", "d1", "y1"
)

fn <- system.file("extdata", "Zch_Stampfenbachstrasse_min30_2017.csv",
                   package = "rOstluft.data", mustWork = TRUE)

data <- read_airmo_csv(fn)

# convert volume concentrations to mass concentrations
data <- calculate_mass_concentrations(data)

stats <- calculate_statstable(data, lrv_table)

# we are only interested in the m1 and y1 results
stats <- dplyr::bind_rows(stats$y1, stats$m1)
stats

# calculate climate indicators
clima_table <- tibble::tribble(
   ~parameter, ~statistic, ~from, ~to,
   "T", "mean", "input", "h1",
   "T", "max, min", "h1", "d1",
   "T_max_h1", "Sommertage, Hitzetage, Eistage", "d1", "y1",
   "T_min_h1", "Tropennächte, Frosttage", "d1", "y1",
)
clima_stats <- calculate_statstable(data, clima_table)
clima_stats$y1

Ostluft/rOstluft documentation built on Feb. 6, 2024, 1:26 a.m.