aggregateGrid: Flexible grid aggregation along selected dimensions

View source: R/aggregateGrid.R

aggregateGrid    R Documentation

Description

Aggregates a grid along the target dimensions using user-defined functions.

Usage

aggregateGrid(
  grid,
  aggr.mem = list(FUN = NULL),
  aggr.d = list(FUN = NULL),
  aggr.m = list(FUN = NULL),
  aggr.y = list(FUN = NULL),
  aggr.s = list(FUN = NULL, season = NULL),
  aggr.spatial = list(FUN = NULL),
  aggr.lat = list(FUN = NULL),
  weight.by.lat = TRUE,
  aggr.lon = list(FUN = NULL),
  aggr.loc = list(FUN = NULL),
  parallel = FALSE,
  max.ncores = 16,
  ncores = NULL
)

Arguments

grid

a grid or multigrid to be aggregated.

aggr.mem

Same as aggr.d, but indicating the function for computing the member aggregation.

aggr.d

Daily aggregation function (for sub-daily data only). A list giving the name of the aggregation function first, followed by any optional arguments to be passed to it. To be on the safe side, the function in FUN should always be given as a character string. See the examples.

aggr.m

Same as aggr.d, but indicating the monthly aggregation function.

aggr.y

Same as aggr.d, but indicating the annual aggregation function.

aggr.s

Same as aggr.d, but indicating the seasonal aggregation function. The season can be indicated as shown in this example: aggr.s = list(FUN = list("mean", na.rm = TRUE), season = c(12,1,2))

aggr.spatial

Same as aggr.d, but indicating the aggregation function used when a rectangular domain is aggregated into a single time series grid (or multimember time series grid).

aggr.lat

Same as aggr.d, indicating the aggregation function to be applied along latitude only.

weight.by.lat

Logical. Should latitudinal averages be weighted by the cosine of latitude? Defaults to TRUE. Ignored if no aggr.lat or aggr.spatial function is indicated, or if a function other than "mean" is applied.

aggr.lon

Same as aggr.lat, but for longitude.

aggr.loc

Same as aggr.d, indicating the aggregation function to be applied along the loc dimension.

parallel

Logical. Should parallel execution be used?

max.ncores

Integer. Upper bound for user-defined number of cores.

ncores

Integer number of cores used in parallel computation. A self-selected number of cores is used when ncores = NULL (the default), or when max.ncores exceeds the default ncores value.
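As an illustration of what the weight.by.lat argument refers to, the following base R sketch compares a cosine-weighted latitudinal mean with a plain mean. The latitudes and values are synthetic, and the code only shows the general weighting idea; it is not taken from the package internals.

```r
# Hypothetical values at three latitudes (degrees north); synthetic data
lat <- c(0, 30, 60)
x   <- c(10, 20, 30)

# Weights proportional to the cosine of latitude (degrees converted to radians)
w <- cos(lat * pi / 180)

# Weighted and unweighted latitudinal means
weighted_mean   <- sum(w * x) / sum(w)
unweighted_mean <- mean(x)
```

Because grid cells shrink towards the poles, the high-latitude value contributes less to weighted_mean than to unweighted_mean.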

Details

Aggregation function definition

The aggregation functions are specified in the form of a named list of the type FUN = "function", ..., where ... are further arguments passed to FUN. This allows for a flexible definition of aggregation functions, which are internally passed to tapply. Note that the name of the function is indicated as a character string.
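The list-based specification can be pictured with a small base R sketch. The helper apply_aggr below is hypothetical (it is not part of transformeR) and only assumes that the FUN element is resolved to a function and that the remaining list elements are forwarded to tapply as extra arguments.

```r
# Hypothetical helper: apply an aggregation spec such as
# list(FUN = "mean", na.rm = TRUE) to a vector split by a grouping index
apply_aggr <- function(x, index, aggr) {
  fun   <- match.fun(aggr$FUN)           # resolves "mean" (or mean) to a function
  extra <- aggr[names(aggr) != "FUN"]    # remaining elements, e.g. na.rm = TRUE
  do.call(tapply, c(list(X = x, INDEX = index, FUN = fun), extra))
}

# Synthetic data with a missing value and two groups
x <- c(1, NA, 3, 4, 6)
g <- c("a", "a", "a", "b", "b")
res <- apply_aggr(x, g, list(FUN = "mean", na.rm = TRUE))
```

Here res holds one aggregated value per group, with na.rm = TRUE passed through to mean.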

Member aggregation

The function preserves the metadata associated with member information (i.e. initialization dates and member names) after aggregation. In addition, an attribute indicating the member aggregation function is added to the Variable component.
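Conceptually, member aggregation collapses the member dimension of the data array while keeping the others. The sketch below uses a synthetic base R array with an assumed dimension order of member, time, lat, lon; it does not reproduce the metadata handling described above.

```r
# Synthetic array: 2 members x 3 time steps x 2 latitudes x 2 longitudes
arr <- array(1:24, dim = c(2, 3, 2, 2))

# Ensemble mean: apply the function over every dimension except the first (member)
ens_mean <- apply(arr, MARGIN = c(2, 3, 4), FUN = mean, na.rm = TRUE)

# The member dimension is gone; time, lat and lon remain
dim(ens_mean)
```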

Temporal aggregation

To aggregate data monthly or annually, the aggr.m and/or aggr.y functions are specified. Aggregations need to be specified from bottom to top. For instance, if the data in the grid is sub-daily and aggr.d is not specified, an error will be given for monthly or annual aggregation attempts. Similarly, annual aggregations require a previous specification of daily and monthly aggregation, when applicable. Special attributes in the Variable component indicate the aggregation undertaken.

In order to preserve the information of the season in annual aggregations, the attribute season is added to the Dates component.
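The bottom-to-top order can be illustrated with plain tapply calls on a synthetic 6-hourly series: the data are first reduced to daily values, and only then to monthly values. The variable names and data are hypothetical, and the sketch does not use the package itself.

```r
# Synthetic 6-hourly series for January and February 2000 (UTC);
# a numeric 'by' in seq() is interpreted as seconds
times <- seq(as.POSIXct("2000-01-01 00:00", tz = "UTC"),
             as.POSIXct("2000-02-29 18:00", tz = "UTC"),
             by = 6 * 3600)
values <- seq_along(times)

# Step 1: sub-daily -> daily (analogous to specifying aggr.d)
day   <- format(times, "%Y-%m-%d")
daily <- tapply(values, day, mean)

# Step 2: daily -> monthly (analogous to specifying aggr.m)
month   <- substr(names(daily), 1, 7)
monthly <- tapply(daily, month, mean)
```

Each step works on the output of the previous one, which is why the lower-level aggregation has to be defined before the higher-level one.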

Value

A grid or multigrid aggregated along the chosen dimension(s).

Parallel Processing

Parallel processing is enabled using the parallel package. Parallelization is undertaken by a FORK-type parallel socket cluster formed by ncores. If ncores is not specified (default), ncores will be one less than the autodetected number of cores. The maximum number of cores used for parallel processing can be set with the max.ncores argument, although this will be reset to the auto-detected number of cores minus 1 if this number is exceeded. Note that not all code, but just some critical loops within the function are parallelized.

In practice, parallelization does not always result in smaller execution times, due to the parallel overhead. However, parallel computing may potentially provide a significant speedup for the particular case of large multimember datasets or large grids.

Parallel computing is currently not available for Windows machines.
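One possible reading of the core selection rules described in this section is sketched below. The function resolve_ncores is hypothetical and is not the package's internal code; the detected argument stands in for the auto-detected number of cores so that the logic can be checked with fixed numbers.

```r
# Hypothetical sketch of the core selection logic described above
resolve_ncores <- function(ncores = NULL, max.ncores = 16,
                           detected = parallel::detectCores()) {
  available <- max(detected - 1, 1)        # auto-detected cores minus one
  limit     <- min(max.ncores, available)  # max.ncores cannot exceed 'available'
  if (is.null(ncores)) ncores <- available # default when ncores is not given
  min(ncores, limit)
}

resolve_ncores(detected = 8)               # default on an 8-core machine
resolve_ncores(ncores = 4, detected = 8)   # user-defined value within the limit
resolve_ncores(ncores = 12, detected = 8)  # request above the available cores
```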

Author(s)

M. Iturbide, M. de Felice, J. Bedia

Examples


require(climate4R.datasets)
data("CFS_Iberia_tas")
## Aggregating members
# Ensemble mean
mn <- aggregateGrid(grid = CFS_Iberia_tas, aggr.mem = list(FUN = "mean", na.rm = TRUE))
require(visualizeR)
spatialPlot(climatology(mn, by.member = FALSE),
            backdrop.theme = "coastline", main = "Ensemble mean tas climatology")
# Ensemble 90th percentile (FUN given as a character string, as recommended in Arguments)
ens90 <- aggregateGrid(grid = CFS_Iberia_tas,
                       aggr.mem = list(FUN = "quantile", probs = 0.9, na.rm = TRUE))
spatialPlot(climatology(ens90, by.member = FALSE),
            backdrop.theme = "coastline", main = "Ensemble 90th percentile tas climatology")

## Monthly aggregation
monthly.mean <- aggregateGrid(CFS_Iberia_tas, aggr.m = list(FUN = "mean", na.rm = TRUE))
spatialPlot(climatology(monthly.mean), backdrop.theme = "coastline",
            main = "Mean tas climatology")

## Several dimensions can be aggregated in one go:
mm.mean <- aggregateGrid(CFS_Iberia_tas,
                         aggr.mem = list(FUN = "mean", na.rm = TRUE),
                         aggr.m = list(FUN = "mean", na.rm = TRUE))


SantanderMetGroup/transformeR documentation built on Nov. 25, 2024, 1:25 p.m.