aggregateSolute: Aggregate loads by the time periods specified by the user
In USGS-R/loadflex: Models and Tools for Watershed Flux Estimates

Description

This will aggregate the total loads or mean concentrations per aggregation interval, specified by agg.by. The time frame specified by agg.by can be "unit", "day", "month", "water year", "calendar year", "total", or the name of a column in custom that can be used to group the data.

Usage

aggregateSolute(preds, metadata, format = c("conc", "flux rate"),
  agg.by = c("unit", "day", "month", "water year", "calendar year",
  "total", "[custom]"), dates, custom = NA, na.rm = FALSE,
  attach.units = FALSE, agg.cols = TRUE, count = TRUE, ...)

Arguments

`preds`	Either a vector of predicted instantaneous fluxes or concentrations or a data.frame containing the columns "fit", "se.pred", and "date"
`metadata`	A metadata object describing the model
`format`	character. The desired format of the aggregated values. If "conc", preds is assumed to already be formatted as "conc". If "flux" or "flux rate", preds is assumed to already be formatted as "flux rate". If preds has a "units" attribute, that attribute is checked for consistency with `format` and `metadata`, but if preds has no "units" attribute then no such checks can be made.
`agg.by`	character. The date interval or grouping column by which to aggregate. If agg.by="unit", values will be returned unaggregated but in the standard post-aggregation format. If agg.by is one of "day", "month", "water year", or "calendar year", the dates vector will be split into periods corresponding to those intervals, and the flux or concentration will be computed for each period. If agg.by="total", `dates` will be ignored and the entire vector `preds` will be aggregated. If agg.by="[custom]", aggregation will occur for each unique value in `dates`.
`dates`	A vector, of the same length as preds, containing the dates to aggregate over. This data may also be given as a column named "date" in preds when preds is a data.frame.
`custom`	An optional data.frame of one or more columns each containing factors or other labels on which to aggregate. The columns to be used are set by `agg.by`.
`na.rm`	logical. Should NA values be ignored during aggregation (TRUE), or should NA be returned for intervals that contain one or more NA predictions (FALSE)?
`attach.units`	logical. If TRUE, units will be attached as an attribute of the second column of the returned data.frame.
`agg.cols`	logical. Should the output data.frame include a column or columns specifying the aggregation group/s for each row? TRUE is recommended.
`count`	logical. Should a count of the number of observations per group be included? For most values of agg.by, when count=TRUE there will be a new column called count.
`...`	Defunct and ignored arguments. Defunct arguments include 'se.preds', 'ci.agg', 'deg.free', 'ci.distrib', 'se.agg', and 'cormat.function'.
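
The interaction between `agg.by` and `na.rm` can be illustrated with a minimal base-R sketch. This is not the loadflex implementation: `period_label` is a hypothetical helper written only for this illustration, a plain arithmetic mean stands in for the concentration aggregation, and the water year is assumed to follow the USGS convention (1 October through 30 September, labeled by the calendar year in which it ends).

```r
# Hypothetical helper (not part of loadflex): label each date with its period.
period_label <- function(dates, agg.by) {
  dates <- as.Date(dates)
  yr <- as.integer(format(dates, "%Y"))
  mo <- as.integer(format(dates, "%m"))
  switch(agg.by,
    "day" = format(dates, "%Y-%m-%d"),
    "month" = format(dates, "%Y-%m"),
    "calendar year" = as.character(yr),
    # Assumed USGS convention: October-December belong to the next water year
    "water year" = as.character(ifelse(mo >= 10, yr + 1L, yr)),
    "total" = rep("total", length(dates)),
    stop("unsupported agg.by: ", agg.by))
}

# Made-up predictions, one of which is missing
dates <- as.Date(c("2018-09-30", "2018-10-01", "2018-10-02", "2019-01-15"))
fit <- c(2, 4, NA, 6)

wy <- period_label(dates, "water year")

# na.rm = FALSE: a period containing any NA aggregates to NA
mean_keep_na <- tapply(fit, wy, mean, na.rm = FALSE)
# na.rm = TRUE: NA values are dropped before aggregating
mean_drop_na <- tapply(fit, wy, mean, na.rm = TRUE)
```

Here the first date falls in water year 2018 and the remaining three in water year 2019, so the single NA affects only the 2019 group.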

Details

This function also calculates the uncertainty in the sum over a regular time series (loads) with known standard errors (loadsSEs) for each short-term load estimate.

The general equation for propagation of error in a sum of potentially autocorrelated values is:

var(sum_t(x[t])) = sum_t(var(x[t])) + 2*sum_{a<b}(cov(x[a], x[b]))

where we will assume something about the covariance matrix.

However, we will deviate from the above equation to accommodate the lognormal distribution of each flux prediction.
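
As a rough numerical illustration of the general equation only (not of the lognormal adjustment, and not of what the function itself computes), the sketch below builds a covariance matrix from made-up standard errors. The AR(1)-style correlation `rho^|lag|` and all values are assumptions chosen for this example.

```r
# Made-up standard errors for four consecutive short-term load estimates
se <- c(0.5, 0.4, 0.6, 0.3)
# Assumed lag-1 autocorrelation (illustrative only)
rho <- 0.5

n <- length(se)
lag <- abs(outer(seq_len(n), seq_len(n), "-"))  # |a - b| for every pair
cormat <- rho^lag                               # assumed AR(1) correlation matrix
covmat <- cormat * outer(se, se)                # cov(x[a], x[b])

# Sum of variances plus twice the sum of covariances over pairs a < b
var_sum_terms <- sum(se^2) + 2 * sum(covmat[upper.tri(covmat)])
# Equivalent: the sum of every element of the covariance matrix
var_sum_matrix <- sum(covmat)

se_sum <- sqrt(var_sum_terms)       # standard error of the sum with autocorrelation
se_sum_indep <- sqrt(sum(se^2))     # standard error if the estimates were independent
```

With positive autocorrelation the covariance terms are positive, so `se_sum` exceeds `se_sum_indep`; treating autocorrelated estimates as independent would understate the uncertainty of the aggregate.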

Value

A data.frame with 2+ columns. The first column or set of columns contains the aggregation period or custom aggregation unit and is named after the value of agg.by. The second contains the aggregate flux or concentration estimates and is named after the value of format. The values in the second column will be in the units specified by metadata.

Examples

## Not run: 
data(eg_metadata)
metadata_example <- updateMetadata(eg_metadata, dates="date")
preds_example <- data.frame(fit=abs(rnorm(365, 5, 2)),
  date=seq(as.Date("2018-05-15"), as.Date("2019-05-14"), by=as.difftime(1, units="days")))
aggregateSolute(preds_example, metadata=metadata_example, format="conc", agg.by="month")

# with a custom aggregation group
preds_regrouped <- transform(preds_example, simple.season=ordered(
  c("winter","spring","summer","fall")[floor(((as.numeric(strftime(date, "%m"))+0)%%12)/3)+1],
  c("winter","spring","summer","fall")))
aggregateSolute(preds_example, metadata=metadata_example, format="conc",
                agg.by="simple.season", custom=preds_regrouped)

## End(Not run)

USGS-R/loadflex documentation built on Dec. 10, 2024, 10:35 p.m.