aggregate: Aggregate indicators


View source: R/coin_aggregate.R

Description

Hierarchically aggregates indicator data according to the index structure specified in IndMeta. The aggregation method is set by agtype and can be different for each level of aggregation (see agtype_bylevel).

Usage

aggregate(
  COIN,
  agtype = "arith_mean",
  agweights = NULL,
  dset = NULL,
  agtype_bylevel = NULL,
  agfunc = NULL,
  avail_limit = NULL,
  out2 = NULL
)

Arguments

COIN

COIN object

agtype

The type of aggregation method. One of:

  • "arith_mean" - weighted arithmetic mean

  • "median" - weighted median

  • "geom_mean" - weighted geometric mean

  • "harm_mean" - weighted harmonic mean

  • "copeland" - weighted Copeland method

  • "custom" - a custom function - see agfunc

  • "mixed" - a different aggregation method for each level. In this case, aggregation methods are specified as any of the previous options using the agtype_bylevel argument.
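To make the difference between the weighted means listed above concrete, the following base-R sketch computes them for a single unit. The formulas are the standard textbook definitions and the values of x and w are made up for illustration; this is not COINr's internal code.

```r
# Weighted aggregation of one unit's indicator values x with weights w.
# Illustrative textbook definitions only; not COINr's internal implementation.
x <- c(40, 60, 80)  # hypothetical indicator values for one unit
w <- c(1, 1, 2)     # hypothetical weights

arith <- sum(w * x) / sum(w)            # weighted arithmetic mean
geom  <- exp(sum(w * log(x)) / sum(w))  # weighted geometric mean (requires x > 0)
harm  <- sum(w) / sum(w / x)            # weighted harmonic mean (requires x != 0)

c(arith = arith, geom = geom, harm = harm)
```

For positive values the harmonic mean is never larger than the geometric mean, which in turn is never larger than the arithmetic mean, so the choice of method affects how strongly low indicator values pull down the aggregate.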

agweights

The weights to use in the aggregation. This can be one of:

  • NULL, in which case the weights that were attached to IndMeta and AggMeta in assemble() are used (if they exist).

  • A character string corresponding to a named list of weights stored in .$Parameters$Weights. These can be added manually or through rew8r(). For example, agweights = "Original" uses the original weights read in on assembly, which is equivalent to agweights = NULL.

  • A data frame of weights to use in the aggregation.

dset

The data set (contained in the COIN object) to aggregate.

agtype_bylevel

A character vector with one aggregation type for each level. Note that if this is specified, agtype must be set to agtype = "mixed", otherwise agtype_bylevel will be ignored.

agfunc

A custom function to use for aggregation if agtype = "custom", of the form y = f(x, w), where y is a scalar aggregated value and x and w are vectors of indicator values and weights respectively. Ensure that the function handles NAs (e.g. by setting na.rm = TRUE) if your data has missing values.
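As an illustration of the y = f(x, w) form, here is a minimal sketch of a custom aggregation function that handles missing values itself. The name quad_mean and the choice of a weighted quadratic mean are hypothetical, picked only to show the required shape of the function.

```r
# Hypothetical custom aggregation function of the form y = f(x, w):
# a weighted quadratic mean that drops missing values and their weights.
quad_mean <- function(x, w) {
  keep <- !is.na(x) & !is.na(w)     # positions with both a value and a weight
  if (!any(keep)) return(NA_real_)  # nothing to aggregate: return NA
  sqrt(sum(w[keep] * x[keep]^2) / sum(w[keep]))
}

# Quick check on plain vectors, with one missing indicator value
quad_mean(c(3, NA, 4), c(1, 1, 1))
```

Following the usage shown above, such a function would be supplied as agfunc = quad_mean together with agtype = "custom".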

avail_limit

A data availability threshold, below which aggregation returns NA. This parameter is the fraction of data availability needed in a given aggregation group to return an aggregated score. Specified as either NULL (default, aggregated values are always returned if possible) or a value between 0 and 1 (below this value of data availability, NA will be returned). See Details.

out2

Where to output the results. If "COIN" (the default for COIN input), the results are appended to the updated COIN; if "df", they are returned as a data frame.

Details

This function aggregates indicators according to the index structure specified in IndMeta. It will either use a single aggregation method for all aggregation levels (by specifying agtype) or can use a different aggregation method for each level of the index (see agtype_bylevel). Aggregation methods are typically weighted (e.g. weighted means), and the weights for the aggregation are specified using the agweights argument.

By default, this function will aggregate wherever possible - generally this means that if at least one value is available for a given unit inside an aggregation group, it will return an aggregated score. Optionally, you can also specify a data availability threshold which will instead return NA if the data availability (within group and for each unit) falls below the threshold. For example, you may have four indicators inside a group, and you might want to only produce an aggregated score if data availability is at least 50% - this would be specified by avail_limit = 0.5. It is also possible to specify different data availability thresholds for different levels of the index, by specifying avail_limit as a vector which has one value for every aggregation level (the first value gives the threshold for the first aggregation, and so on up to the final level).
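The threshold rule described above can be sketched in base R for a single unit and a single aggregation group. The helper agg_with_limit is hypothetical and uses a weighted arithmetic mean for simplicity; it illustrates the rule only and is not how COINr implements it.

```r
# Hypothetical helper: aggregate one group for one unit, returning NA when
# the fraction of non-missing indicators falls below avail_limit.
agg_with_limit <- function(x, w, avail_limit = NULL) {
  avail <- mean(!is.na(x))          # fraction of indicators with data
  if (avail == 0) return(NA_real_)  # no data at all: nothing to aggregate
  if (!is.null(avail_limit) && avail < avail_limit) return(NA_real_)
  weighted.mean(x, w, na.rm = TRUE)
}

x <- c(50, NA, NA, NA)  # one of four indicators available (25%)
w <- rep(1, 4)

agg_with_limit(x, w)                                     # no threshold: aggregates
agg_with_limit(x, w, avail_limit = 0.5)                  # 25% < 50%: NA
agg_with_limit(c(50, 70, NA, NA), w, avail_limit = 0.5)  # 50% meets the threshold
```

In this sketch a group whose availability exactly equals the threshold is still aggregated, matching the description that NA is returned only when availability falls below the threshold.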

Value

An updated COIN containing the new aggregated data set at .$Data$Aggregated, or a data frame if out2 = "df".

Examples

# assemble a COIN first
ASEM <- assemble(IndData = ASEMIndData, IndMeta = ASEMIndMeta, AggMeta = ASEMAggMeta)
# normalise the data
ASEM <- normalise(ASEM, dset = "Raw")
# aggregate the data
ASEM <- COINr::aggregate(ASEM, agtype = "arith_mean", dset = "Normalised")
# check aggregated data set exists
stopifnot(!is.null(ASEM$Data$Aggregated))

COINr documentation built on Nov. 30, 2021, 9:06 a.m.