prepareData: Configuration of data for downscaling

View source: R/prepareData.R

prepareData    R Documentation

Configuration of data for downscaling

Description

Configures predictor and predictand data for the flexible definition of downscaling experiments.

Usage

prepareData(
  x,
  y,
  global.vars = NULL,
  combined.only = TRUE,
  spatial.predictors = NULL,
  local.predictors = NULL,
  extended.predictors = NULL
)

Arguments

x

A grid (usually a multigrid) of predictor fields

y

A grid (usually a stations grid, but not necessarily) of observations (predictands)

global.vars

An optional character vector with the short names of the variables of the input x multigrid to be retained as global predictors (use the getVarNames helper if unsure about variable names). This argument just produces a call to subsetGrid, but it is included here for greater flexibility in downscaling experiments (predictor screening...). For instance, it allows using some specific variables contained in x as local predictors and the remaining ones, specified in global.vars, either as raw global predictors or to construct the combined PC.

combined.only

Optional, and only used if spatial.predictors is passed. Should the combined PC be used as the only global predictor? Defaults to TRUE. Otherwise, the combined PC constructed with the which.combine argument of prinComp is appended to the PCs of the remaining variables within the grid.

spatial.predictors

Defaults to NULL (not used). Otherwise, a named list of arguments in the form argument = value, with the arguments to be passed to prinComp to perform Principal Component Analysis of the predictors grid (x). See Details for more on principal component analysis of predictors.

local.predictors

Defaults to NULL (not used). Otherwise, a named list of arguments in the form argument = value, with the following arguments:

  • vars: names of the variables in x to be used as local predictors

  • fun: Optional. Aggregation function for the selected local neighbours. It is specified as a list whose first element is the name of the aggregation function (as a character string), followed by any optional arguments to be passed to that function. For instance, to compute the average skipping missing values: fun = list(FUN = "mean", na.rm = TRUE). Defaults to NULL, meaning that no aggregation is performed.

  • n: Integer. Number of nearest neighbours to use. If a single value is supplied and there is more than one variable in vars, the same value is used for all variables. Otherwise, this should be a vector of the same length as vars, giving the number of nearest neighbours for each variable.
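The following base R sketch illustrates the idea behind vars, n and fun on toy data. It is not the package's internal code: the coordinates, the field values and the use of plain Euclidean distance on longitude/latitude are all assumptions made for illustration.

```r
# Hypothetical toy grid: 5 grid boxes and one station location
grid.coords <- cbind(lon = c(-4, -3, -2, -4, -3),
                     lat = c(40, 40, 40, 41, 41))
station <- c(lon = -3.2, lat = 40.3)

# Toy predictor field: 3 time steps (rows) x 5 grid boxes (columns)
field <- matrix(c(1,  2, 3, 4, 5,
                  2,  3, 4, 5, 6,
                  3, NA, 5, 6, 7), nrow = 3, byrow = TRUE)

# A single 'n' is reused for every variable in 'vars'
vars <- c("hus850", "psl", "ta850")
n <- 4
n.per.var <- rep(n, length.out = length(vars))
names(n.per.var) <- vars

# Indices of the n nearest grid boxes (Euclidean distance is an assumption)
d <- sqrt((grid.coords[, "lon"] - station[["lon"]])^2 +
          (grid.coords[, "lat"] - station[["lat"]])^2)
nn.idx <- order(d)[seq_len(n.per.var[["psl"]])]

# Without aggregation (fun = NULL), each neighbour is kept as a separate column
local.raw <- field[, nn.idx, drop = FALSE]

# With fun = list(FUN = "mean", na.rm = TRUE), neighbours are aggregated per time step
fun <- list(FUN = "mean", na.rm = TRUE)
extra.args <- fun[names(fun) != "FUN"]
local.agg <- apply(local.raw, 1, function(v) do.call(fun$FUN, c(list(v), extra.args)))
```

Here local.raw has one column per neighbour, while local.agg collapses the neighbours into a single series per time step.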

extended.predictors

This parameter relates to the extreme learning machine (ELM) and reservoir computing framework, in which the input data are randomly projected into a new space of size n. Defaults to NULL (not used). Otherwise, a named list of arguments in the form argument = value, with the following arguments:

  • n: A numeric value. Indicates the size of the random nonlinear dimension where the input data is projected.

  • module: Optional. A numeric value indicating the size of the mask's module. It applies to a specific type of ELM called RF-ELM.
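As a rough illustration of what a random projection into a space of size n means, the base R sketch below builds a random nonlinear feature matrix from a toy predictor matrix. The uniform weight distribution, the bias term and the tanh activation are assumptions chosen for the example, not the package's documented behaviour, and the module (RF-ELM mask) option is not illustrated.

```r
set.seed(1)

# Hypothetical predictor matrix: 20 time steps x 3 predictors
X <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)

# Size of the random nonlinear space (plays the role of 'n')
n <- 8

# Random input weights and biases (the distribution is an assumption)
W <- matrix(runif(ncol(X) * n, min = -1, max = 1), nrow = ncol(X), ncol = n)
b <- runif(n, min = -1, max = 1)

# Nonlinear projection: one column per random hidden unit
H <- tanh(sweep(X %*% W, 2, b, "+"))
```

The resulting H has as many rows as X and n columns, and would replace the raw predictors as input to a downstream model.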

Details

Temporal consistency: Note that x (predictors) and y (predictands) are checked for temporal consistency prior to downscaling. In case of partial temporal overlap, both are internally intersected to obtain an exact temporal match.
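The effect of this intersection can be sketched in base R with two toy daily series that only partially overlap. The dates and values are hypothetical, and the package operates on grid objects rather than on plain vectors as done here.

```r
# Hypothetical daily time axes for predictors (x) and predictands (y)
x.dates <- seq(as.Date("2000-01-01"), as.Date("2000-01-10"), by = "day")
y.dates <- seq(as.Date("2000-01-06"), as.Date("2000-01-15"), by = "day")
x.vals <- seq_along(x.dates)
y.vals <- seq_along(y.dates) * 10

# Keep only the time steps present in both series
keep.x <- x.dates %in% y.dates
keep.y <- y.dates %in% x.dates
x.sub <- x.vals[keep.x]
y.sub <- y.vals[keep.y]
common.dates <- x.dates[keep.x]
```

After subsetting, x.sub and y.sub have the same length and refer to exactly the same dates (6 to 10 January 2000 in this toy case).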

Principal Component Analysis: Whenever spatial.predictors is used, a combined PC is returned (unless a single predictor is used, in which case no combination is possible). Note that the variables of the predictor grid used to construct the combined PC can be flexibly controlled through the optional argument global.vars.
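To make the difference between per-variable PCs and a combined PC concrete, here is a minimal base R sketch using prcomp on random toy fields. It only mirrors the ideas behind n.eofs, v.exp and combined.only; prinComp may standardize, weight or combine the fields differently, so none of this should be read as the package's implementation.

```r
set.seed(42)
n.time <- 50

# Hypothetical fields: 50 time steps x 6 grid boxes for each variable
psl   <- matrix(rnorm(n.time * 6), nrow = n.time)
ta850 <- matrix(rnorm(n.time * 6), nrow = n.time)

# Per-variable PCA retaining a fixed number of components (analogous to n.eofs)
pcs.psl <- prcomp(psl, center = TRUE, scale. = TRUE)$x[, 1:3]

# Combined PCA: join the standardized fields column-wise, then retain the
# leading PCs needed to reach 95% explained variance (analogous to v.exp)
combined <- cbind(scale(psl), scale(ta850))
pca <- prcomp(combined)
cum.var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
n.keep <- which(cum.var >= 0.95)[1]
pcs.combined <- pca$x[, seq_len(n.keep), drop = FALSE]

# Combined PCs only (as with combined.only = TRUE) ...
x.global.combined <- pcs.combined
# ... or combined PCs joined with per-variable PCs (as with combined.only = FALSE)
x.global.all <- cbind(pcs.combined, pcs.psl)
```

In both cases the result is a 2D matrix with one row per time step, which matches the shape described for the x.global component.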

Value

A named list with components y (the predictand), x.global (global predictors, 2D matrix), x.local (local predictors, a list) and pca (prinComp output), and other attributes. See Examples.

Author(s)

J. Bedia, D. San-Martín and J.M. Gutiérrez

See Also

downscaleR Wiki for preparing predictors for downscaling and seasonal forecasting.

Other downscaling.helpers: predictor.nn.indices(), predictor.nn.values(), prepareNewData()

Examples


require(transformeR)
# Loading data
require(climate4R.datasets)
data("VALUE_Iberia_tas")
y <- VALUE_Iberia_tas 
data("NCEP_Iberia_hus850", "NCEP_Iberia_psl", "NCEP_Iberia_ta850")
x <- makeMultiGrid(NCEP_Iberia_hus850, NCEP_Iberia_psl, NCEP_Iberia_ta850)
# Raw data
data <- prepareData(x = x, y = y)
# Using PCs as predictors. Number of EOFs: 10, 5, 5 for the 3 input variables
data <- prepareData(x = x, y = y, spatial.predictors = list(n.eofs = c(10, 5, 5)))
# Using combined PCs as predictors. Explained variance: 95%
data <- prepareData(x = x, y = y,
                    spatial.predictors = list(v.exp = 0.95, which.combine = getVarNames(x)))
# Using local predictors: the 4 closest gridboxes
data <- prepareData(x = x, y = y, local.predictors = list(n = 4, vars = getVarNames(x)))
# Using combined PCs and local predictors: the 4 closest gridboxes
data <- prepareData(x = x, y = y,
                    local.predictors = list(n = 4, vars = getVarNames(x)),
                    spatial.predictors = list(v.exp = 0.95, which.combine = getVarNames(x)))


SantanderMetGroup/downscaleR documentation built on Nov. 16, 2024, 1:35 a.m.