# ProbForecastGOP: Probabilistic Weather Forecasts using the GOP method In ProbForecastGOP: Probabilistic weather forecast using the GOP method

## Description

This function generates probabilistic forecasts of weather fields using the Geostatistical Output Perturbation method as described by Gel, Raftery and Gneiting (2004).

## Usage

 1 2 3 4 ProbForecastGOP(day, obs, forecast, id, coord1, coord2, cut.points=NULL, max.dist=NULL, nbins=300, variog.model="exponential", max.dist.fit=NULL, init.val=NULL, fix.nugget=FALSE, coord1.grid, coord2.grid, forecast.grid, n.sim=99, out="SIM", n.displ=4, qt.displ=c(10,50,90)) 

## Arguments

 day numeric vector containing the day of observation. obs numeric vector containing the observed weather quantity. forecast numeric vector containing the forecasted weather quantity. id vector with the id of the metereological stations. coord1 vector containing the longitudes of the metereological stations. coord2 vector containing the latitudes of the metereological stations. cut.points numeric vector containing the cutpoints used for variogram binning. max.dist a numerical value giving the upper bound for the distance considered in the variogram computation. nbins a numerical value giving the number of bins for variogram binning. variog.model character string giving the name of the parametric model to fit to the empirical variogram. Implemented models are exponential, spherical, gauss, gencauchy, and matern. max.dist.fit number giving the maximum distance considered when fitting the variogram. init.val numeric vector giving the initial values for the parameters in the variogram model. The number of initial values to be entered depends on the parametric model specified. If the variog.model specified is exponential, spherical or gauss, then the parameters required are, in order, the nugget effect, the variance and the range. If the variog.model specified is gencauchy, the parameters required are, in order, the nugget effect, the variance, the range, the smoothness parameter a and the long-range parameter b. If the variog.model specified is matern the parameters required are, in order, the nugget effect, the variance, the range, and the smoothness parameter a. For more details on the valid range for the parameters and for the equation of the variogram models listed above, look below at the section "Details". fix.nugget logical field indicating whether the nugget should be considered fixed or not. If TRUE the nugget effect will be assumed to be constant, and a value for the fixed nugget effect can be also provided. If the value provided is different from the one entered in the init.val field, then the value of the nugget effect is taken to be the one entered in the init.val field. If FALSE the nugget effect will be estimated along with the other parameters in the variogram model. coord1.grid numeric vector containing the longitudes of the grid points for the forecast field. coord2.grid numeric vector containing the latitudes of the grid points for the forecast field. forecast.grid numeric vector containing the forecast grid. n.sim number of realizations to be simulated. out character string indicating the output produced by the function. Possible values are VARIOG, FIT and SIM. If VARIOG is specified, the function computes the empirical variogram of the forecast errors averaged over time and displays a plot of the empirical variogram. If FIT is specified, the function, in addition, fits a parametric model to the empirical variogram, returns estimates of the parameters and display both the empirical and the fitted variogram. If SIM is specified, the function simulates also realizations of the weather random field using the parametric variogram model specified. The function displays the simulated weather fields and some specified percentiles. n.displ number of realizations to be displayed on screen. qt.displ numeric vector containing the quantiles to be displayed.

## Details

This function generates probabilistic weather field forecasts by generating ensemble members using the Geostatistical Output Perturbation method presented by Gel, Raftery and Gneiting (2004).

The Geostatistical Output Perturbation method consists of the following steps:

1. determination of the residuals, by regressing the forecasts on the observed weather quantity and computation of their empirical variogram.

2. estimation of the parameters of the variogram model fitted to the empirical variogram.

3. simulation of weather random fields using the estimated parametric variogram model.

- Computation of the empirical variogram of the residuals

The empirical variogram of the residuals is calculated by determining, for each day, the distance among all pairs of stations that have been observed in the same day and by calculating for each day the sum of all the squared differences in the residuals within each bin. These sums are then averaged over time, with weights for each bin given by the sum over time of the number of pairs of stations within the bin.

The formula used is:

γ(h) = ∑_d \frac{1}{2N_{(h,d)}} (∑_h (Y(i+h,d)-Y(i,d))^2)

- Estimation of the parameters of the parametric variogram model fitted to the empirical variogram.

The function estimates the parameters in the variogram model by minimizing the weighted least-square loss function

LOSS(θ) = ∑_k n_k [ ≤ft( \frac{\hat{γ_k} - γ_k(θ)}{γ_k(θ)} \right)^2 ]

where n_k is the number of pairs contributing to the variogram computation in the kth bin, \hat{γ_k} is the value of the empirical variogram in the kth bin and γ_k(θ) is the value of the estimated parametric variogram model at the midpoint of the kth bin.

- Parametric variogram models -

The parametric variogram models implemented are: exponential, spherical, gaussian, generalized Cauchy and Whittle-Matern.

- exponential

The equation of the exponential variogram with parameters the nugget effect, ρ, the variance, σ^{2}, and the range, r, is given by:

γ(d) = ρ+σ^{2} \cdot (1-exp(- \frac{d}{r}))

where d is the distance between the two locations and γ(d) is the value of the exponential variogram at distance d. The nugget effect, ρ, is a non-negative number, while the variance, σ^2, and the range, r, are positive numbers.

- spherical

The equation of the spherical variogram with parameters the nugget effect, ρ, the variance, σ^2, and the range, r, is given by:

γ(d) = ρ+σ^{2} \cdot (\frac{3}{2} \cdot\frac{d}{r}-\frac{1}{2} \cdot \frac{d^3}{r^3})

where d is the distance between the two locations and γ(d) is the value of the spherical variogram at distance d. The nugget effect, ρ, is a non-negative number, while the variance, σ^2, and the range, r, are positive numbers.

- gauss

The equation of the gaussian variogram with parameters the nugget effect, ρ, the variance, σ^2, and the range, r, is given by:

γ(d) = ρ+σ^{2} \cdot (1-exp(- \frac{d^2}{r^2} ))

where d is the distance between the two locations and γ(d) is the value of the gaussian variogram at distance d. The nugget effect, ρ, is a non-negative number, while the variance, σ^2, and the range, r, are positive numbers.

- gencauchy

The equation of the generalized Cauchy variogram with parameters the nugget effect, ρ, the variance, σ^2, the range, r, the smoothness parameter, a, and the long-range parameter, b, is given by:

γ(d) = ρ+σ^{2} \cdot (1-(1+\frac{d^a}{r^a})^{- \frac{b}{a}})

where d is the distance between the two locations and γ(d) is the value of the generalized Cauchy variogram at distance d. The nugget effect, ρ is a non-negative number, the variance, σ^2 and the range, r, are positive numbers, the smoothness parameter, a, is a number in (0,2], and the long-range parameter, b, is a positive number.

- matern

The equation of the Whittle-Matern variogram with parameters the nugget effect, ρ, the variance, σ^2, the range, r, and the smoothness parameter, a, is given by:

γ(d) = ρ+σ^{2} \cdot (1-(\frac{2^{1-a}}{Γ(a)}\cdot \frac{d^a}{r^a} \cdot K_{a}(\frac{d}{r}))

where d is the distance between the two locations, γ(d) is the value of the Whittle-Matern variogram at distance d, Γ is the gamma function and K_a is the Bessel function of the third kind with parameter a. The nugget effect, ρ, is a non-negative number, the variance, σ^2, the range, r, and the smoothness parameter, a, are positive numbers.

- Defaults-

If the vector with the cutpoints is not specified, the cutpoints are determined so that there are nbins bins with approximately the same number of pairs per bin. If both the vector with the cutpoints and the number of bins, nbins, are unspecified, the function by default determines the cutpoints so that there are 300 bins with approximately the same number of pairs per bin.

The default value for the maximum distance considered in the variogram computation is the 90-th percentile of the distances between all stations.

The default value for the maximum distance is \frac{1}{2 \cdot √{2}} times the maximum distance considered in the empirical variogram.

Default for the initial values of the parameters is NULL, in which case the initial values for the parameters are determined internally and depend on the empirical variogram values. By default, the nugget effect is not considered constant. Thus, the default value for the fix.nugget field is FALSE.

The default for out is SIM: by default, 99 weather random fields are simulated, and 4 of them are displayed. If no vector of percentiles, is provided, the 10th, 50th and 90th percentiles are determined and displayed.

## Value

The function returns both a numerical and a graphical summary. The graphical and numerical summary provided depend on the specification given in the field out.

The graphical summary consists of a plot of the empirical variogram, if out is equal to VARIOG.

It consists of a plot of the empirical variogram with superimposed the fitted parametric model, if out is equal to FIT.

If out is equal to SIM, the graphical summary consists of several plots: a plot of the empirical variogram with superimposed the fitted parametric model, plots of the simulated weather fields, and plots of specified percentiles of the weather random field. The plots are printed on multiple pages, and before displaying each page, the user will be asked for input.

The numerical summary include the following:

 bias.coeff additive and multiplicative bias obtained by linear regression of the forecasted weather quantity on the observed weather quantity. Always provided. se.bias.coeff standard errors of the bias coefficients. Always provided. res.var variance of the forecast errors. Always provided. bin.midpoints numeric vector with midpoints of the bins used in the empirical variogram computation. Always provided. number.pairs numeric vector with the number of pairs per bin. Always provided. empir.variog numeric vector with the empirical variogram values. Always provided. model name of the parametric model fitted to the empirical variogram. Provided only if out is equal to FIT or SIM. nugget Estimate of the nugget effect. Provided only if out is equal to FIT or SIM. variance Estimate of the variance. Provided only if out is equal to FIT or SIM. range Estimate of the range. Provided only if out is equal to FIT or SIM. additional.par Numeric vector with the estimates of the additional parameters required by the parametric variogram model (only if the model specified is either whittlematern or gencauchy). This is provided only if out is equal to FIT or SIM. sim.fields 3-dimensional array where each layer contains the values of the simulated weather field at the gridded locations. Provided only if out is equal to SIM. pct.fields 3-dimensional array where each layer contains the specified percentiles of the weather field. Provided only if out is equal to SIM.

## Note

Computation of the empirical variogram might require substantial computing time. Hence, it is advisable to run the function once with out equal to VARIOG, save part of the output and use it as input for the Variog.fit function to estimate a parametric variogram model. Finally, Field.sim will simulate realizations of the weather field from the specified geostatistical model.

## Author(s)

Berrocal, V. J. veroberrocal@gmail.com, Raftery, A. E., Gneiting, T., Gel, Y.

## References

Gel, Y., Raftery, A. E., Gneiting, T. (2004). Calibrated probabilistic mesoscale weather field forecasting: The Geostatistical Output Perturbation (GOP) method (with discussion). Journal of the American Statistical Association, Vol. 99 (467), 575–583.

Gel, Y., Raftery, A. E., Gneiting, T., Berrocal, V. J. (2004). Rejoinder. Journal of the American Statistical Association, Vol. 99 (467), 588–590.

Cressie, N. A. C. (1993). Statistics for Spatial Data (revised ed.). Wiley: New York.

Gneiting, T., Schlather, M. (2004). Stochastic models that separate fractal dimension and the Hurst effect. SIAM Review, 46, 269–282.

Stein, M. L. (1999). Interpolation of Spatial Data - Some Theory for Kriging. Springer-Verlag: New York.

Schlather, M. (2001). Simulation and analysis of Random Fields. R News 1(2), 18–20.

Nychka, D. (2004). The fields package. Available at: http:lib.stat.cmu.edu/R/CRAN/doc/package/fields.pdf.

Emp.variog for computation of empirical variogram of forecast errors averaged over time, Variog.fit for estimation of parameters in a geostatistical model using Weighted Least Square, Field.sim for realizations of weather fields with the specified covariance structure.
  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 ## Loading data library(fields) library(RandomFields) data(slp) day <-slp$date.obs id <- slp$id.stat coord1 <- slp$lon.stat coord2 <- slp$lat.stat obs <- slp$obs forecast <-slp$forecast data(gridlong) coord1.grid <- gridlong$gridded.lon data(gridlat) coord2.grid <- gridlat$gridded.lat data(forecast.grid) forecast.grid <- forecast.grid\$gridded.forecast ## Specified cutpoints, default values for all the other fields. ## Only empirical variogram computation empirical.variog <- ProbForecastGOP(day=day,obs=obs,forecast=forecast, id=id,coord1=coord1,coord2=coord2,cut.points=seq(0,1000,by=2), max.dist=NULL,nbins=NULL,variog.model="exponential",max.dist.fit=NULL, init.val=NULL,fix.nugget=FALSE,coord1.grid=coord1.grid, coord2.grid=coord2.grid,forecast.grid=forecast.grid,n.sim=10, out="VARIOG",n.displ=4,qt.displ=c(5,50,95)) ## Unspecified cutpoints. ## Fit of a parametric variogram to an empirical variogram. fitted.variog <- ProbForecastGOP(day=day,obs=obs,forecast=forecast,id=id, coord1=coord1,coord2=coord2,cut.points=NULL,max.dist=NULL,nbins=NULL, variog.model="exponential",max.dist.fit=NULL,init.val=NULL,fix.nugget=FALSE, coord1.grid=coord1.grid,coord2.grid=coord2.grid,forecast.grid=forecast.grid, n.sim=10,out="FIT",n.displ=4,qt.displ=c(5,50,95)) ## Unspecified cutpoints. ## Whole routine. simulation.fields <- ProbForecastGOP(day=day,obs=obs,forecast=forecast,id=id,coord1=coord1, coord2=coord2,cut.points=NULL,max.dist=NULL,nbins=NULL,variog.model=NULL, max.dist.fit=NULL,init.val=NULL,fix.nugget=FALSE,coord1.grid=coord1.grid, coord2.grid=coord2.grid,forecast.grid=forecast.grid,n.sim=4,out="SIM", n.displ=4,qt.displ=c(5,50,95))