aggregateGrid: Flexible grid aggregation along selected dimensions
In SantanderMetGroup/transformeR: A climate4R package for general climate data manipulation and transformation

aggregateGrid

R Documentation

Description

Aggregates a grid along the target dimensions using user-defined functions.

Usage

aggregateGrid(
  grid,
  aggr.mem = list(FUN = NULL),
  aggr.d = list(FUN = NULL),
  aggr.m = list(FUN = NULL),
  aggr.y = list(FUN = NULL),
  aggr.s = list(FUN = NULL, season = NULL),
  aggr.spatial = list(FUN = NULL),
  aggr.lat = list(FUN = NULL),
  weight.by.lat = TRUE,
  aggr.lon = list(FUN = NULL),
  aggr.loc = list(FUN = NULL),
  parallel = FALSE,
  max.ncores = 16,
  ncores = NULL
)

Arguments

`grid`	A grid or multigrid to be aggregated.
`aggr.mem`	Same as `aggr.d`, but indicating the function for computing the member aggregation.
`aggr.d`	Daily aggregation function (for sub-daily data only). A list giving the name of the aggregation function first, followed by any optional arguments to be passed to that function. To be on the safe side, the function in `FUN` should always be given as a character string. See the examples.
`aggr.m`	Same as `aggr.d`, but indicating the monthly aggregation function.
`aggr.y`	Same as `aggr.d`, but indicating the annual aggregation function.
`aggr.s`	Same as `aggr.d`, but indicating the seasonal aggregation function. The season can be indicated as shown in this example: aggr.s = list(FUN = list("mean", na.rm = TRUE), season = c(12,1,2))
`aggr.spatial`	Same as `aggr.d`, but indicating the aggregation function used when a rectangular domain is to be aggregated into a single time series grid (or multimember time series grid).
`aggr.lat`	Same as `aggr.d`, indicating the aggregation function to be applied along latitude only.
`weight.by.lat`	Logical. Should latitudinal averages be weighted by the cosine of latitude? Defaults to `TRUE`. Ignored if no `aggr.lat` or `aggr.spatial` function is indicated, or if a function other than `"mean"` is applied.
`aggr.lon`	Same as `aggr.lat`, but for longitude.
`aggr.loc`	Same as `aggr.d`, indicating the aggregation function to be applied along the loc dimension.
`parallel`	Logical. Should parallel execution be used?
`max.ncores`	Integer. Upper bound for user-defined number of cores.
`ncores`	Integer. Number of cores used in parallel computation. A self-selected number of cores is used when `ncores = NULL` (the default), or when `max.ncores` exceeds the default `ncores` value.
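As an illustration of what `weight.by.lat = TRUE` implies for a latitudinal mean, the following base R sketch weights each latitude by the cosine of its value in radians. The vectors `lat` and `values` are made-up toy data, and normalizing by the sum of the weights is an assumption about the usual convention, not a transcription of the package internals.

```r
# Toy data (hypothetical): one value per latitude band
lat    <- c(0, 30, 60)      # degrees north
values <- c(10, 20, 30)

# Cosine-of-latitude weights: grid cells shrink towards the poles
w <- cos(lat * pi / 180)

# Weighted mean, normalized by the sum of the weights (assumed convention)
weighted   <- sum(w * values) / sum(w)
unweighted <- mean(values)

# High latitudes count less, so the weighted mean is pulled
# towards the low-latitude values
print(c(weighted = weighted, unweighted = unweighted))
```

With these toy numbers the weighted mean is lower than the plain mean, because the largest value sits at the highest latitude and therefore receives the smallest weight.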

Details

Aggregation function definition

The aggregation functions are specified as a named list of the form list(FUN = "function", ...), where ... are further arguments passed to FUN. This allows a flexible definition of the aggregation functions, which are internally passed to tapply. Note that the name of the function is given as a character string.
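To make the list convention concrete, here is a minimal base R sketch of how a specification such as list(FUN = "mean", na.rm = TRUE) can be handed to tapply. The toy series, the `month.index` grouping vector and the use of do.call are assumptions made for illustration; the package's internal code may be organized differently.

```r
# A specification in the same shape as the aggr.* arguments
aggr.m <- list(FUN = "mean", na.rm = TRUE)

# Toy daily series (hypothetical): January and February 2000, one missing value
dates  <- seq(as.Date("2000-01-01"), as.Date("2000-02-29"), by = "day")
values <- c(rep(1, 31), rep(3, 29))
values[5] <- NA

# Grouping index: one label per month
month.index <- format(dates, "%Y-%m")

# The list elements become arguments of tapply: FUN is matched by name,
# and na.rm travels through '...' to mean()
arg.list <- c(list(X = values, INDEX = month.index), aggr.m)
monthly  <- do.call("tapply", arg.list)
print(monthly)
```

Because tapply resolves FUN with match.fun, a character string such as "mean" works just as well as the function object itself.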

Member aggregation

The function preserves the metadata associated with member information (i.e. initialization dates and member names) after aggregation. In addition, an attribute indicating the member aggregation function is added to the Variable component.
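Conceptually, member aggregation collapses the member dimension of the data array while keeping all the others. The base R sketch below shows this on a random toy array; the `dimensions` attribute naming the axes is an assumption about how the data array is labelled, and the snippet does not reproduce the metadata handling described above.

```r
set.seed(1)

# Toy array (hypothetical): 3 members x 4 time steps x 2 latitudes x 2 longitudes
data <- array(rnorm(3 * 4 * 2 * 2), dim = c(3, 4, 2, 2))
attr(data, "dimensions") <- c("member", "time", "lat", "lon")

# Locate the member axis and keep every other axis as a margin
mem.ind <- match("member", attr(data, "dimensions"))
keep    <- setdiff(seq_along(dim(data)), mem.ind)

# Apply the aggregation function across members for each time/lat/lon cell
ens.mean <- apply(data, MARGIN = keep, FUN = "mean", na.rm = TRUE)

print(dim(ens.mean))  # time x lat x lon
```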

Temporal aggregation

Monthly and annual aggregations are requested through the aggr.m and aggr.y arguments, respectively. Aggregations need to be specified from bottom to top: for instance, if the data in the grid are sub-daily and aggr.d is not specified, attempting a monthly or annual aggregation raises an error. Similarly, annual aggregations require the daily and monthly aggregations to be specified first, when applicable. Special attributes in the Variable component indicate the aggregation undertaken.

In order to preserve the information of the season in annual aggregations, the attribute season is added to the Dates component.
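The bottom-to-top order can be pictured as a chain in which each step consumes the output of the previous one. The following base R sketch uses a made-up 6-hourly series and arbitrary functions (daily maximum, then monthly mean); it illustrates the ordering only and is not the package's own implementation.

```r
# Toy 6-hourly series (hypothetical) covering two days
times  <- seq(as.POSIXct("2000-01-01 00:00", tz = "UTC"),
              by = "6 hours", length.out = 8)
values <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Step 1: sub-daily -> daily (here, the daily maximum)
daily     <- tapply(values, format(times, "%Y-%m-%d"), FUN = "max")
day.dates <- as.Date(names(daily))

# Step 2: daily -> monthly (here, the mean of the daily maxima).
# This step only makes sense once step 1 has produced daily values.
monthly <- tapply(as.vector(daily), format(day.dates, "%Y-%m"), FUN = "mean")

print(daily)
print(monthly)
```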

Value

A grid or multigrid aggregated along the chosen dimension(s).

Parallel Processing

Parallel processing is enabled using the parallel package. Parallelization is undertaken by a FORK-type parallel socket cluster formed by ncores. If ncores is not specified (default), ncores will be one less than the autodetected number of cores. The maximum number of cores used for parallel processing can be set with the max.ncores argument, although this will be reset to the auto-detected number of cores minus 1 if this number is exceeded. Note that only some critical loops within the function are parallelized, not all of the code.

In practice, parallelization does not always result in shorter execution times, due to the parallel overhead. However, parallel computing can provide a significant speedup in the particular case of large multimember datasets or large grids.

Parallel computing is currently not available for Windows machines.
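The core-selection rules described in this section can be summarized in a few lines of base R. The helper `pick_ncores` below is hypothetical (it is not a function of the package), and it assumes that the default number of cores is also capped by max.ncores; it is a reading of the text above rather than the actual internal logic.

```r
# Hypothetical helper: choose the number of workers from ncores and max.ncores
pick_ncores <- function(ncores = NULL, max.ncores = 16,
                        detected = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to a safe value
  if (is.na(detected)) detected <- 2L
  # Never use more than the autodetected number of cores minus one
  available <- max(1L, detected - 1L)
  # max.ncores is itself reset if it exceeds what is available
  upper <- min(max.ncores, available)
  # Default: all available cores but one (assumed to be capped by 'upper')
  if (is.null(ncores)) ncores <- available
  min(ncores, upper)
}

# 'detected' is passed explicitly so the results do not depend on the machine
print(pick_ncores(detected = 8))                # default request
print(pick_ncores(ncores = 4, detected = 8))    # explicit request
print(pick_ncores(ncores = 32, detected = 8))   # request above the limit
```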

Author(s)

M. Iturbide, M. de Felice, J. Bedia

Examples


require(climate4R.datasets)
data("CFS_Iberia_tas")
## Aggregating members
# Ensemble mean
mn <- aggregateGrid(grid = CFS_Iberia_tas, aggr.mem = list(FUN = "mean", na.rm = TRUE))
require(visualizeR)
spatialPlot(climatology(mn, by.member = FALSE),
            backdrop.theme = "coastline", main = "Ensemble mean tmax climatology")
# Ensemble 90th percentile
# (FUN is given as a character string, as recommended in the Arguments section)
ens90 <- aggregateGrid(grid = CFS_Iberia_tas,
                       aggr.mem = list(FUN = "quantile", probs = 0.9, na.rm = TRUE))
spatialPlot(climatology(ens90, by.member = FALSE),
            backdrop.theme = "coastline", main = "Ensemble 90th percentile tmax climatology")

## Monthly aggregation
monthly.mean <- aggregateGrid(CFS_Iberia_tas, aggr.m = list(FUN = "mean", na.rm = TRUE))
spatialPlot(climatology(monthly.mean), backdrop.theme = "coastline",
            main = "Mean tmax climatology")

## Several dimensions can be aggregated in one go:
mm.mean <- aggregateGrid(CFS_Iberia_tas,
                         aggr.mem = list(FUN = "mean", na.rm = TRUE),
                         aggr.m = list(FUN = "mean", na.rm = TRUE))

SantanderMetGroup/transformeR documentation built on Nov. 25, 2024, 1:25 p.m.