View source: R/downscaleTrain.R
downscaleTrain | R Documentation |
Calibration of downscaling methods. Currently analogs, generalized linear models (GLM) and Neural Networks (NN) are available.
downscaleTrain(
obj,
method,
condition = NULL,
threshold = NULL,
model.verbose = TRUE,
predict = TRUE,
simulate = FALSE,
...
)
obj |
The object as returned by |
method |
Character string indicating the type of method/transfer function. Currently accepted values are |
condition |
Inequality operator to be applied to the given |
threshold |
Numeric value. Threshold used as reference for the condition. Default is NULL. If a threshold value is supplied with no specification of the argument |
model.verbose |
A logical value. Indicates whether the information stored about the fitted model is limited to the essentials (model.verbose = FALSE) or is more detailed (model.verbose = TRUE, the default). Setting it to FALSE is recommended when you want to save memory. Only operates for GLM. |
predict |
A logical value. Should the prediction on the training set be returned? Default is TRUE. |
simulate |
A logical value indicating whether to simulate from the GLM distributional parameters when predicting on the train set. Only relevant when predicting with a GLM. Default is FALSE. |
... |
Optional parameters. These depend on the method selected. Every parameter has a default value, set in the atomic functions, that is used when none is supplied.
Everything concerning these parameters is explained in the section |
The function can downscale in both global and local mode, though not simultaneously. If there is perfect collinearity among predictors, the matrix will not be invertible and the downscaling will fail. We recommend removing NaN/NA values before calling the function.
Analogs
The optional parameters of this method are:
n.analogs
An integer. Number of analogs. Default is 4.
sel.fun
A string. Function applied to the analogs selected for a given observation. Options are
"mean", "wmean" (i.e., weighted mean), "max", "min", "median", "prcXX"
(e.g., "prc85" is the 85th percentile of the distribution of analog values). Default is "mean".
window
An integer. Window of days removed when selecting analogs.
If window = 7, then 7 days after the observation date and the 7 days before the observation date are removed. Default is 0.
n.random
An integer. Choose N random analogs among the closest n.analogs. Default is NULL.
More information can be found in analogs.train.
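To make the roles of n.analogs, window and sel.fun concrete, here is a minimal base R sketch of an analog search on simulated data. It is an illustration under stated assumptions, not the code used by analogs.train: the data, the helper name analog_predict, the Euclidean distance, and the choice to exclude the target day only when window > 0 are all assumptions made for this example.

```r
# Toy analog search on simulated data (NOT downscaleR's implementation).
set.seed(1)
n.days  <- 100
x.train <- matrix(rnorm(n.days * 3), ncol = 3)  # predictors: days x features
y.train <- rnorm(n.days)                        # predictand at a single site

# analog_predict is a hypothetical helper written for this sketch.
analog_predict <- function(i, n.analogs = 4, window = 0, sel.fun = mean) {
  # Euclidean distance between day i and every training day
  d <- sqrt(rowSums(sweep(x.train, 2, x.train[i, ])^2))
  # Assumption: when window > 0, drop the target day and the
  # `window` days before and after it from the candidate analogs
  if (window > 0) d[abs(seq_len(n.days) - i) <= window] <- Inf
  idx <- order(d)[seq_len(n.analogs)]           # closest n.analogs days
  sel.fun(y.train[idx])                         # summarize their observations
}

pred.mean <- analog_predict(50, n.analogs = 4, window = 7)
pred.max  <- analog_predict(50, n.analogs = 4, window = 7, sel.fun = max)
```

In this sketch, with window = 0 and n.analogs = 1 the closest analog of a training day is the day itself, which is why a non-zero window can matter when predicting on the training set.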
Generalized Linear Models (GLM)
The optional parameters depend on the value of the fitting parameter:
fitting
A string indicating the types of objective functions and how to fit the linear model.
fitting = NULL
In this case the generalized linear model uses the glm function to fit the linear model. This is the default option.
The optional parameters when fitting = NULL are:
family
A string indicating a description of the error distribution. Options are
family = c("gaussian","binomial","Gamma","inverse.gaussian","poisson","quasi","quasibinomial","quasipoisson").
Links can also be specified; the available ones are listed in family.
na.action
A function which indicates what should happen when the data contain NAs.
The default is set by the na.action setting of options, and is na.fail if that is unset.
The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
fitting = "stepwise"
Indicates a stepwise regression via glm and step.
The optional parameters are the same as for fitting = NULL. The stepwise search can be performed backward or forward, and the number of steps can be limited. Both are controlled by the additional optional parameter stepwise.arg, a list containing two parameters that belong to step: steps and direction. An example would be: stepwise.arg = list(steps = 5, direction = "backward"). Default is NULL, which indicates an unlimited forward stepwise search.
fitting = c("L1","L2","L1L2","gLASSO")
These four options refer to lasso regression (L1 penalty), ridge regression (L2 penalty), elastic-net regression (L1L2 penalty) and group lasso regression (group L2 penalty). The model is fitted via glmnet and the corresponding penalties are found via cv.glmnet. By default glmnet standardizes the predictors; here it is called with standardize = FALSE, so standardization should be done prior to the downscaling process.
The optional parameters when fitting = c("L1","L2","L1L2","gLASSO") are:
family
A string indicating a description of the error distribution. Options are
family = c("gaussian","binomial","Gamma","inverse.gaussian","poisson","quasi","quasibinomial","quasipoisson").
Links cannot be specified, as glmnet has not been programmed to handle them. However, the default ones can be found in family. If fitting = "gLASSO" then family must be "mgaussian".
offset
A vector of length nobs that is included in the linear predictor (a nobs x nc matrix for the "multinomial" family).
Useful for the "poisson" family (e.g. log of exposure time), or for refining a model by starting at a current fit.
Default is NULL. If supplied, then values must also be supplied to the predict function.
There are two things to consider: 1) If family = "binomial" then type = "response" when predicting values. 2) For every fitting option except fitting = "MP", the parameter site must be TRUE; the one exception is gLASSO, for which site must be FALSE.
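The fitting = NULL and fitting = "stepwise" options above rely on glm and step. As a rough illustration of what those two base R functions do, the sketch below fits a Gamma GLM with an explicit log link and then runs a backward search limited to 5 steps, mirroring stepwise.arg = list(steps = 5, direction = "backward"). The simulated data and variable names are assumptions made for this example, and the code calls glm and step directly rather than going through downscaleTrain.

```r
# Simulated data (hypothetical): positive response with a log-linear mean.
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)                                 # x3 carries no signal
mu <- exp(0.5 + 0.8 * x1 - 0.4 * x2)
y  <- rgamma(n, shape = 2, rate = 2 / mu)      # Gamma response with mean mu
df <- data.frame(y, x1, x2, x3)

# Plain glm() fit with a family and an explicitly chosen link
fit.full <- glm(y ~ x1 + x2 + x3, family = Gamma(link = "log"), data = df)

# Backward stepwise search, limited to 5 steps, with printing switched off
fit.step <- step(fit.full, direction = "backward", steps = 5, trace = 0)

coef(fit.step)
```

A backward search can only drop terms from the starting model, so fit.step never has more coefficients than fit.full.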
Neural Networks
The neural network is based on the deepnet library. The optional parameters correspond to those in nn.train and are: initW = NULL, initB = NULL, hidden = c(10), activationfun = "sigm", learningrate = 0.001, momentum = 0.5, learningrate_scale = 1, output = "sigm", numepochs = 5000, batchsize = 100, hidden_dropout = 0, visible_dropout = 0. The values indicated are the defaults.
Help
If doubts about the optional parameters remain after reading the description here, further details can be found in the atomic functions: analogs.train, glm.train and nn.train.
A list of objects that contains the prediction on the train dataset and the model.
pred: An object with the same structure as the predictands input parameter, but with pred$Data being the predictions and not the observations.
model: A list with the information of the model: method, coefficients, fitting ...
downscaleR Wiki for downscaling seasonal forecasting and climate projections.
J. Bano-Medina
Other downscaling.functions: downscaleCV(), downscaleChunk(), downscalePredict(), downscale()
# Loading data
require(transformeR)
require(climate4R.datasets)
data("VALUE_Iberia_tas")
y <- VALUE_Iberia_tas
data("NCEP_Iberia_hus850", "NCEP_Iberia_psl", "NCEP_Iberia_ta850")
x <- makeMultiGrid(NCEP_Iberia_hus850, NCEP_Iberia_psl, NCEP_Iberia_ta850)
# Preparing the predictors
data <- prepareData(x = x, y = y, spatial.predictors = list(v.exp = 0.95))
# Training downscaling methods
model.analogs <- downscaleTrain(data, method = "analogs", n.analogs = 1)
model.regression <- downscaleTrain(data, method = "GLM", family = gaussian)
model.nnets <- downscaleTrain(data, method = "NN", hidden = c(10,5), output = "linear")
# Plotting the results for station 5
plot(y$Data[,5],model.analogs$pred$Data[,5], xlab = "obs", ylab = "pred")