aggregateSolute: Aggregate loads by the time periods specified by the user


Description

This function aggregates total loads or mean concentrations over each aggregation interval specified by agg.by. The interval can be "unit", "day", "month", "water year", "calendar year", or "total", or the name of a column in custom that can be used to group the data.

Usage

aggregateSolute(preds, metadata, format = c("conc", "flux rate"),
  agg.by = c("unit", "day", "month", "water year", "calendar year", "total",
  "mean water year", "mean calendar year", "[custom]"), se.preds, dates,
  custom = NA, cormat.function = cormat1DayBand, ci.agg = TRUE,
  level = 0.95, deg.free = NA, ci.distrib = c("lognormal", "normal"),
  se.agg = TRUE, na.rm = FALSE, attach.units = FALSE, min.n = 0)

Arguments

preds

Either a vector of predicted instantaneous fluxes or concentrations, or a data.frame containing the columns "fit", "se.pred", and "date".

metadata

A metadata object describing the model

format

character. The desired format of the aggregated values. If "conc", preds is assumed to already be formatted as "conc". If "flux" or "flux rate", preds is assumed to already be formatted as "flux rate". If preds has a "units" attribute, that attribute is checked for consistency with format and metadata, but if preds has no "units" attribute then no such checks can be made.

agg.by

character. The date interval or grouping column by which to aggregate. If agg.by="unit", values will be returned unaggregated but in the standard post-aggregation format. If agg.by is one of "day", "month", "water year", or "calendar year", the dates vector will be split into periods corresponding to those intervals, and the flux or concentration will be computed for each period. If agg.by="total", dates will be ignored and the entire vector preds will be aggregated. If agg.by="[custom]", aggregation will occur for each unique value in dates.

se.preds

A vector of standard errors of prediction for instantaneous flux or concentration predictions. This data may also be given as a column named "se.pred" in preds when preds is a data.frame.

dates

A vector, of the same length as preds, containing the dates to aggregate over. This data may also be given as a column named "date" in preds when preds is a data.frame.

custom

An optional data.frame of one or more columns each containing factors or other labels on which to aggregate. The columns to be used are set by agg.by.

cormat.function

A function that takes a vector of datetimes (Date, POSIXct, chron, etc.) and returns a Matrix indicating the assumed/estimated correlation between prediction errors on each pair of datetimes. See correlations-2D for predefined options.

ci.agg

logical. Should confidence intervals for the aggregate predictions be returned?

level

numeric. The confidence level for the confidence intervals, e.g., 0.95 for 95% intervals.

deg.free

numeric. The degrees of freedom to use in calculating confidence intervals from SEPs. If NA, a normal distribution is used rather than the more standard t distribution.

ci.distrib

character. The distribution to assume for uncertainty in the aggregate flux or concentration distribution. The default is "lognormal".

se.agg

logical. Should standard errors of the aggregate predictions be returned?

na.rm

logical. Should NA values be ignored during aggregation (TRUE), or should NA be returned for intervals that contain one or more NA predictions (FALSE)?

attach.units

logical. If TRUE, units will be attached as an attribute of the second column of the returned data.frame.

min.n

numeric. The minimum number of observations required for an agg.by value (e.g., a year) to be considered complete. Groups with fewer observations are considered incomplete and are discarded.
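To make the roles of agg.by and min.n concrete, the following is a minimal base-R sketch of grouping daily predictions by month and dropping incomplete groups. It is an illustration only, not the package's implementation: the variable names (fit, month, min_n) and the simulated data are assumptions made for the example.

```r
# Hypothetical daily concentration predictions spanning parts of four months
dates <- seq(as.Date("2018-05-15"), as.Date("2018-08-10"), by = "day")
set.seed(1)
fit <- abs(rnorm(length(dates), mean = 5, sd = 2))

# Label each prediction with its month, similar in spirit to agg.by = "month"
month <- format(dates, "%Y-%m")

# Mean concentration and number of predictions in each month
agg <- data.frame(
  Month = sort(unique(month)),
  conc = as.numeric(tapply(fit, month, mean)),
  n = as.numeric(tapply(fit, month, length)),
  stringsAsFactors = FALSE
)

# Discard groups with fewer than min_n predictions
# (min_n is a hypothetical stand-in for the min.n argument)
min_n <- 20
agg_complete <- agg[agg$n >= min_n, ]
print(agg_complete)
```

In this sketch the partial months at either end of the series (May and August) fall below the threshold and are dropped, which is the kind of incomplete period that min.n is meant to filter out.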

Details

This function also calculates the uncertainty in the sum over a regular time series (loads) with known standard errors (loadsSEs) for each short-term load estimate.

The general equation for propagation of error in a sum of potentially autocorrelated values is:

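\mathrm{Var}\left(\sum_t x_t\right) = \sum_t \mathrm{Var}(x_t) + 2 \sum_{a<b} \mathrm{Cov}(x_a, x_b)

where a and b index pairs of distinct time steps, and the covariance terms follow from an assumed correlation structure for the prediction errors (see cormat.function).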
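As a worked illustration of the general equation only, the base-R sketch below computes the variance of a sum of five load estimates under an assumed correlation structure. The data, the value rho = 0.5, and the rule that errors are correlated only at a lag of one time step are all assumptions chosen for the example; they are not taken from the package.

```r
# Hypothetical short-term load estimates and their standard errors
loads <- c(10, 12, 9, 11, 13)
loadsSEs <- c(1, 1.5, 0.8, 1.2, 1)

# Assumed correlation structure: errors are correlated (rho = 0.5) only
# when two predictions are exactly one time step apart
rho <- 0.5
lag <- abs(outer(seq_along(loads), seq_along(loads), "-"))
cormat <- ifelse(lag == 0, 1, ifelse(lag == 1, rho, 0))

# Covariance matrix: cov(x_a, x_b) = cor(a, b) * se_a * se_b
covmat <- cormat * outer(loadsSEs, loadsSEs)

# The variance of the sum is the sum of all covariance-matrix entries
var_sum <- sum(covmat)
se_sum <- sqrt(var_sum)

# The same quantity written as the two terms of the equation:
# the variances plus twice the covariances over pairs a < b
var_terms <- sum(loadsSEs^2) + 2 * sum(covmat[upper.tri(covmat)])

# Standard error of the sum if the errors were independent, for comparison
se_independent <- sqrt(sum(loadsSEs^2))
```

With positive correlation between neighboring errors, se_sum is larger than se_independent, which is why the choice of correlation assumption matters for the uncertainty of an aggregate.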

However, we will deviate from the above equation to accommodate the lognormal distribution of each flux prediction.
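The exact lognormal procedure is not spelled out here. As a rough sketch of what a lognormal assumption implies for a confidence interval, the base-R example below matches a lognormal distribution to a given mean and standard error by the method of moments and compares the resulting interval with a normal one. The moment-matching approach and the numbers used (agg_mean, agg_se) are assumptions for illustration and may differ from what the function does internally.

```r
# Hypothetical aggregate estimate, its standard error, and the confidence level
agg_mean <- 100
agg_se <- 15
level <- 0.95

# Method of moments: lognormal parameters whose mean and standard
# deviation equal agg_mean and agg_se
sdlog <- sqrt(log(1 + (agg_se / agg_mean)^2))
meanlog <- log(agg_mean) - 0.5 * sdlog^2

# Lower and upper tail probabilities for a two-sided interval
probs <- c((1 - level) / 2, 1 - (1 - level) / 2)

# Interval under the lognormal assumption (always positive, asymmetric)
ci_lognormal <- qlnorm(probs, meanlog = meanlog, sdlog = sdlog)

# Interval under the normal assumption (symmetric about the mean)
ci_normal <- qnorm(probs, mean = agg_mean, sd = agg_se)
```

The lognormal interval cannot extend below zero and reaches further above the mean than below it, which suits flux and concentration estimates that are strictly positive.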

Value

A data.frame with two columns. The first contains the aggregation period or custom aggregation unit and is named after the value of agg.by. The second contains the aggregate flux or concentration estimates and is named after the value of format. The values in the second column will be in the units specified by metadata.

Examples

## Not run: 
data(eg_metadata)
metadata_example <- updateMetadata(eg_metadata, dates="date")
preds_example <- data.frame(fit=abs(rnorm(365, 5, 2)), se.pred=abs(rnorm(365, 1, 0.2)), 
  date=seq(as.Date("2018-05-15"), as.Date("2019-05-14"), by=as.difftime(1, units="days")))
aggregateSolute(preds_example, metadata=metadata_example, format="conc", agg.by="month")

# with a custom aggregation group
preds_regrouped <- transform(preds_example, simple.season=ordered(
  c("winter","spring","summer","fall")[floor(((as.numeric(strftime(date, "%m"))+0)%%12)/3)+1], 
  c("winter","spring","summer","fall")))
aggregateSolute(preds_example, metadata=metadata_example, format="conc", 
                agg.by="simple.season", custom=preds_regrouped)

# with a custom prediction error correlation matrix
new_correlation_assumption <- getCormatFirstOrder(rho=0.9, 
time.step=as.difftime(1, units="days"), max.tao=as.difftime(10, units="days"))
aggregateSolute(preds_example, metadata=metadata_example, format="conc", agg.by="month",
                cormat.function=new_correlation_assumption)

## End(Not run)

McDowellLab/loadflex documentation built on May 8, 2019, 9:48 a.m.