knn_space_time_disagg: Spatial and temporal disaggregation of flow data

View source: R/knn_space_time_disagg.R

knn_space_time_disagg    R Documentation

Spatial and temporal disaggregation of flow data

Description

knn_space_time_disagg() disaggregates annual flow data spatially and temporally (to monthly values), using the spatial and monthly flow pattern of a selected "index year". The index year is selected using a k-nearest-neighbor approach (Nowak et al., 2010).

Usage

knn_space_time_disagg(
  ann_flow,
  ann_index_flow,
  mon_flow,
  start_month,
  nsim = 1,
  scale_sites = TRUE,
  index_years = NULL,
  k_weights = knn_params_default(nrow(ann_index_flow)),
  random_seed = NULL
)

Arguments

ann_flow

A 2-column matrix, with years in column 1, and an annual flow in column 2. This is the annual flow that needs to be disaggregated.

ann_index_flow

A 2-column matrix, with years in column 1, and an annual flow in column 2. This is the index gage that the flows in ann_flow will be compared to. After the comparison, the nearest neighbor years are selected.

mon_flow

Monthly natural flow, used to disaggregate ann_flow spatially and temporally based on the index year selected from ann_index_flow by knn_get_index_year(). Each column represents a different site, and the annual flow at the index gage is disaggregated to each of these sites at the monthly level. For example, if this matrix has three columns, the values in ann_flow are disaggregated to three sites. mon_flow should cover the same years as ann_index_flow and should therefore contain 12 times as many rows as ann_index_flow. No check is performed to confirm that the years match, since mon_flow is expected to be a dimensionless matrix. mon_flow can have named or unnamed columns. If they are named, the names are preserved in the disaggregated output. If they are unnamed, the columns are renamed S1-SN, where N is the number of columns.
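The shape requirement described for mon_flow can be checked before calling the function. The following is a minimal base R sketch; the helper name check_mon_flow is hypothetical and not part of knnstdisagg, and the column naming only mimics the S1-SN convention described above.

```r
# Hypothetical helper (not part of knnstdisagg): confirm that mon_flow has
# 12 rows for every year in ann_index_flow, and name unnamed columns S1..SN.
check_mon_flow <- function(mon_flow, ann_index_flow) {
  stopifnot(is.matrix(mon_flow), is.matrix(ann_index_flow))
  if (nrow(mon_flow) != 12 * nrow(ann_index_flow)) {
    stop("mon_flow must have 12 rows for every row of ann_index_flow")
  }
  if (is.null(colnames(mon_flow))) {
    colnames(mon_flow) <- paste0("S", seq_len(ncol(mon_flow)))
  }
  mon_flow
}

# Made-up inputs: 80 index years and 3 sites of monthly data
ind_flow <- cbind(1901:1980, rep(1500, 80))
mon_flow <- matrix(1, nrow = 80 * 12, ncol = 3)
checked <- check_mon_flow(mon_flow, ind_flow)
colnames(checked)
```

A check like this only verifies the row count; it cannot verify that the rows actually correspond to the same years as ann_index_flow.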

start_month

The start month of mon_flow, as an integer: 1 = January, 2 = February, etc. Used to correctly label the output data.

nsim

Number of times to repeat the space/time disaggregation.

scale_sites

TRUE/FALSE, or a numeric vector. If TRUE, all sites are scaled. If FALSE, no sites are scaled. Otherwise, a numeric vector specifying the site numbers (column indices) that will scale the index year's volume based on the annual flow being disaggregated. The remaining sites will select the index year's values directly. See Details.

index_years

Optional. If specified, these index years will be used instead of selecting years based on weighted sampling via knn_get_index_year().

k_weights

A knn_params() object. By default, it uses knn_params_default().
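To illustrate what a set of k-nearest-neighbor weights can look like, the sketch below uses a heuristic that is common in k-nearest-neighbor resampling: k = floor(sqrt(n)) neighbors, with weights proportional to 1/rank. This is an assumption made for illustration only; it is not a statement of what knn_params_default() returns, and sketch_knn_weights is a hypothetical name.

```r
# Hypothetical illustration (not the package's knn_params_default()):
# k = floor(sqrt(n)) neighbors, weights proportional to 1 / rank.
sketch_knn_weights <- function(n) {
  k <- floor(sqrt(n))
  w <- 1 / seq_len(k)
  list(k = k, weights = w / sum(w))  # normalize so the weights sum to 1
}

p <- sketch_knn_weights(80)
p$k
round(p$weights, 3)
```

With weights of this form, the nearest neighbor is the most likely to be sampled and the probability decreases with each further neighbor.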

random_seed

A single integer or NULL. If an integer, it is passed to set.seed() so that results are reproducible.
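The effect of fixing a seed can be shown with base R alone. This sketch assumes nothing about the package internals beyond the use of set.seed() mentioned above; draw_index_years is a hypothetical stand-in for any random selection step.

```r
# Hypothetical stand-in for a random selection step: draw n years at random,
# optionally fixing the seed first so the draw can be repeated.
draw_index_years <- function(years, n, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sample(years, n, replace = TRUE)
}

a <- draw_index_years(1901:1980, 3, seed = 42)
b <- draw_index_years(1901:1980, 3, seed = 42)
identical(a, b)  # same seed, same draw
```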

Details

The method is described in detail in Nowak et al. (2010). The methodology disaggregates annual flow data (ann_flow) by selecting an index year from ann_index_flow using knn_get_index_year(). After the index year is selected, values from ann_flow are disaggregated spatially and temporally based on mon_flow. The spatial pattern is captured by including different sites as columns in mon_flow, and the monthly pattern in mon_flow is used to disaggregate the data temporally. This method preserves summability if the values selected from mon_flow are scaled and if the columns (or a subset of columns) in mon_flow sum to ann_index_flow.
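The steps in this paragraph can be sketched in base R for a single annual value. This is a simplified illustration, not the package's implementation: the data are made up, k and the 1/rank weights are assumptions, and the scaling factor (annual flow divided by the index year's annual flow) is an assumed reading of how the index year's volume is scaled.

```r
# One annual flow to disaggregate, and a tiny made-up index record (3 years)
ann_value <- 1400
ann_index_flow <- cbind(2001:2003, c(1200, 1500, 1800))

# Monthly flow at 2 sites, 12 rows per index year. The two sites are built
# so that they sum to the index gage's annual flow in every year.
mon_flow <- rbind(
  matrix(c(rep(40, 12), rep(60, 12)), ncol = 2),  # 2001: 480 + 720 = 1200
  matrix(c(rep(50, 12), rep(75, 12)), ncol = 2),  # 2002: 600 + 900 = 1500
  matrix(c(rep(60, 12), rep(90, 12)), ncol = 2)   # 2003: 720 + 1080 = 1800
)

# Step 1: rank index years by distance to ann_value, then sample one of the
# k nearest with weights proportional to 1 / rank (assumed weighting).
ord <- order(abs(ann_index_flow[, 2] - ann_value))
k <- 2
w <- (1 / seq_len(k)) / sum(1 / seq_len(k))
set.seed(1)
pick <- ord[sample.int(k, 1, prob = w)]
index_year <- ann_index_flow[pick, 1]

# Step 2: extract the 12 monthly rows belonging to the selected index year
rows <- ((pick - 1) * 12 + 1):(pick * 12)
pattern <- mon_flow[rows, , drop = FALSE]

# Step 3: scale the pattern so the result matches the annual flow
scale_factor <- ann_value / ann_index_flow[pick, 2]
disagg <- pattern * scale_factor

# Summability: all sites and months add up to the annual value
all.equal(sum(disagg), ann_value)
```

The final check holds only because both sites are scaled and the columns of mon_flow sum to the index gage's annual flow, which mirrors the summability condition stated above.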

scale_sites: In some cases, it is desirable to select monthly flow directly, instead of scaling it. This can be done by scaling only certain sites, using scale_sites. scale_sites should be a boolean or a numeric vector. If TRUE, all sites are scaled. If FALSE, all sites' monthly values are selected directly. Otherwise, scale_sites should be a vector of the sites that should be scaled, based on their column index in mon_flow. For example, if mon_flow is a matrix with 4 columns, then setting scale_sites to c(2, 3) will scale the values in sites 2 and 3 (columns 2 and 3), while selecting flow values directly in sites 1 and 4.
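The three forms of scale_sites can be illustrated with a short base R sketch. The helper resolve_scale_sites is hypothetical and not part of knnstdisagg, and the scaling factor of 1.1 is a made-up value standing in for the ratio between the annual flow and the index year's flow.

```r
# Hypothetical helper: convert scale_sites (TRUE, FALSE, or column indices)
# into the integer vector of columns that should be scaled.
resolve_scale_sites <- function(scale_sites, n_sites) {
  if (is.logical(scale_sites)) {
    if (isTRUE(scale_sites)) seq_len(n_sites) else integer(0)
  } else {
    stopifnot(all(scale_sites %in% seq_len(n_sites)))
    as.integer(scale_sites)
  }
}

# Made-up monthly pattern for one index year at 4 sites (12 months x 4 sites);
# every row is 10, 20, 30, 40.
pattern <- matrix(c(10, 20, 30, 40), nrow = 12, ncol = 4, byrow = TRUE)
scale_factor <- 1.1  # assumed ratio of annual flow to index-year flow

to_scale <- resolve_scale_sites(c(2, 3), ncol(pattern))
out <- pattern
out[, to_scale] <- pattern[, to_scale] * scale_factor
out[1, ]
```

In this sketch, columns 2 and 3 are multiplied by the scaling factor, while columns 1 and 4 keep the index year's values unchanged.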

Value

A knnst object.

Author(s)

Ken Nowak

References

Nowak, K., Prairie, J., Rajagopalan, B., Lall, U. (2010). A nonparametric stochastic approach for multisite disaggregation of annual to daily streamflow. Water Resources Research.

See Also

knnst, knn_get_index_year()

Examples


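library(knnstdisagg)

# fix the seed so the made-up data below are reproducible
set.seed(1)
# a sample of three years of annual flow data (year, flow)
flow_mat <- cbind(c(2000, 2001, 2002), c(1400, 1567, 1325))
# made-up historical data to use as index years (80 years)
ind_flow <- cbind(1901:1980, rnorm(80, mean = 1500, sd = 300))
# made-up monthly flow for two sites (12 rows per index year)
mon_flow <- cbind(
  rnorm(80 * 12, mean = 20, sd = 5),
  rnorm(80 * 12, mean = 120, sd = 45)
)
# disaggregate, starting in January and scaling both sites
knn_space_time_disagg(flow_mat, ind_flow, mon_flow, 1, scale_sites = 1:2)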


rabutler-usbr/knnstdisagg documentation built on Sept. 14, 2023, 2:47 p.m.