aggregateSolute: Aggregate loads by the time periods specified by the user

View source: R/aggregateSolute.R

aggregateSolute    R Documentation

Aggregate loads by the time periods specified by the user

Description

Aggregates predictions into total loads or mean concentrations for each aggregation interval, as specified by agg.by. The interval can be "unit", "day", "month", "water year", "calendar year", or "total", or agg.by can name a column in custom by which to group the data.

Usage

aggregateSolute(preds, metadata, format = c("conc", "flux rate"),
  agg.by = c("unit", "day", "month", "water year", "calendar year",
  "total", "[custom]"), dates, custom = NA, na.rm = FALSE,
  attach.units = FALSE, agg.cols = TRUE, count = TRUE, ...)

Arguments

preds

Either a vector of predicted instantaneous fluxes or concentrations or a data.frame containing the columns "fit", "se.pred", and "date"

metadata

A metadata object describing the model

format

character. The desired format of the aggregated values. If "conc", preds is assumed to already be formatted as "conc". If "flux" or "flux rate", preds is assumed to already be formatted as "flux rate". If preds has a "units" attribute, that attribute is checked for consistency with format and metadata, but if preds has no "units" attribute then no such checks can be made.

agg.by

character. The date interval or grouping column by which to aggregate. If agg.by="unit", values will be returned unaggregated but in the standard post-aggregation format. If agg.by is one of "day", "month", "water year", or "calendar year", the dates vector will be split into periods corresponding to those intervals, and the flux or concentration will be computed for each period. If agg.by="total", dates will be ignored and the entire vector preds will be aggregated. If agg.by="[custom]", aggregation will occur for each unique value in dates.
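
To make the idea of splitting dates into periods concrete, here is a minimal base-R sketch. It is not loadflex's internal code. The data are made up, and the water-year rule is an assumption based on the USGS convention (October 1 to September 30, labelled by the calendar year in which the period ends).

```r
# Illustrative only: base-R grouping, not loadflex's internal code.
dates <- seq(as.Date("2018-09-29"), as.Date("2018-10-02"), by = "day")
conc  <- c(2, 4, 6, 8)  # made-up concentrations

# Month labels such as "2018-09"
month_id <- format(dates, "%Y-%m")

# Water year, ASSUMING the USGS convention (Oct 1 to Sep 30,
# labelled by the calendar year in which it ends)
yr <- as.integer(format(dates, "%Y"))
mo <- as.integer(format(dates, "%m"))
water_year <- ifelse(mo >= 10, yr + 1L, yr)

# Mean concentration per period; the value column is named "x"
by_month <- aggregate(conc, by = list(month = month_id), FUN = mean)
by_wy    <- aggregate(conc, by = list(water_year = water_year), FUN = mean)
```

In this sketch the four days straddle both a month boundary and a water-year boundary, so each grouping yields two rows.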

dates

A vector, of the same length as preds, containing the dates to aggregate over. This data may also be given as a column named "date" in preds when preds is a data.frame.

custom

An optional data.frame of one or more columns each containing factors or other labels on which to aggregate. The columns to be used are set by agg.by.

na.rm

logical. Should NA values be ignored during aggregation (TRUE), or should NA be returned for intervals that contain one or more NA predictions (FALSE)?
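
The two choices can be illustrated with a small base-R analogue. This sketch uses tapply() and mean() on made-up values. It only mirrors the behavior described above and is not taken from loadflex.

```r
# Illustrative only: base-R analogue of the na.rm choice.
group <- c("a", "a", "b", "b")  # hypothetical aggregation groups
fit   <- c(1, NA, 3, 5)         # made-up predictions; one is missing

# na.rm = FALSE: a group containing any NA aggregates to NA
keep_na <- tapply(fit, group, mean, na.rm = FALSE)

# na.rm = TRUE: NA values are dropped before aggregating
drop_na <- tapply(fit, group, mean, na.rm = TRUE)
```

Here group "a" aggregates to NA in keep_na but to 1 in drop_na, while group "b" is 4 in both.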

attach.units

logical. If TRUE, units will be attached as an attribute of the second column of the returned data.frame.

agg.cols

logical. Should the output data.frame include a column or columns specifying the aggregation group/s for each row? TRUE is recommended.

count

logical. Should a count of the number of observations per group be included? For most values of agg.by, when count=TRUE there will be a new column called count.

...

Defunct and ignored arguments. Defunct arguments include 'se.preds', 'ci.agg', 'deg.free', 'ci.distrib', 'se.agg', and 'cormat.function'.

Details

This function also calculates the uncertainty in the sum over a regular time series (loads) with known standard errors (loadsSEs) for each short-term load estimate.

The general equation for propagation of error in a sum of potentially autocorrelated values is:

$$\mathrm{Var}\left(\sum_t x_t\right) = \sum_t \mathrm{Var}(x_t) + 2 \sum_{a<b} \mathrm{Cov}(x_a, x_b)$$

where we will assume something about the covariance matrix.

However, we will deviate from the above equation to accommodate the lognormal distribution of each flux prediction.
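
As a rough numerical illustration of the general propagation-of-error equation only, the sketch below sums made-up standard errors under an AR(1) correlation structure. The AR(1) form, the value of rho, and all variable names are assumptions chosen for this example. It does not include the lognormal adjustment mentioned above and should not be read as the function's actual method.

```r
# Illustrative only: AR(1) correlation is an ASSUMPTION made for this sketch.
se  <- c(1.0, 2.0, 1.5, 0.5)  # made-up standard errors of each load estimate
rho <- 0.6                    # hypothetical lag-1 autocorrelation

n   <- length(se)
lag <- abs(outer(seq_len(n), seq_len(n), "-"))  # |a - b| for every pair
cor_mat <- rho^lag                              # AR(1) correlation matrix
cov_mat <- cor_mat * outer(se, se)              # covariance matrix

# Sum of variances plus twice the sum of the upper-triangle covariances
var_sum <- sum(diag(cov_mat)) + 2 * sum(cov_mat[upper.tri(cov_mat)])
se_sum  <- sqrt(var_sum)

# For comparison: the standard error if the estimates were independent
se_indep <- sqrt(sum(se^2))
```

Because the covariance matrix is symmetric, var_sum equals the sum of all its entries. With positive autocorrelation, se_sum is larger than se_indep, which is why ignoring autocorrelation tends to understate the uncertainty of an aggregated load.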

Value

A data.frame with 2+ columns. The first column or set of columns contains the aggregation period or custom aggregation unit and is named after the value of agg.by. The second contains the aggregate flux or concentration estimates and is named after the value of format. The values in the second column will be in the units specified by metadata.

Examples

## Not run: 
library(loadflex)

# Example metadata shipped with the package; use "date" as the dates column
data(eg_metadata)
metadata_example <- updateMetadata(eg_metadata, dates="date")

# One year of synthetic daily concentration predictions
preds_example <- data.frame(fit=abs(rnorm(365, 5, 2)),
  date=seq(as.Date("2018-05-15"), as.Date("2019-05-14"), by=as.difftime(1, units="days")))

# Mean concentration for each month
aggregateSolute(preds_example, metadata=metadata_example, format="conc", agg.by="month")

# with a custom aggregation group: label each row by season
# (Dec-Feb winter, Mar-May spring, Jun-Aug summer, Sep-Nov fall)
preds_regrouped <- transform(preds_example, simple.season=ordered(
  c("winter","spring","summer","fall")[floor(((as.numeric(strftime(date, "%m"))+0)%%12)/3)+1],
  c("winter","spring","summer","fall")))
aggregateSolute(preds_example, metadata=metadata_example, format="conc",
                agg.by="simple.season", custom=preds_regrouped)

## End(Not run)

USGS-R/loadflex documentation built on July 26, 2023, 9:54 p.m.