# Emp.variog: Empirical variogram of forecast errors averaged over time

In ProbForecastGOP: Probabilistic weather forecast using the GOP method

## Description

Calculates the empirical variogram of forecast errors, averaged over time.

## Usage

    Emp.variog(day, obs, forecast, id, coord1, coord2, cut.points=NULL, max.dist=NULL, nbins=300)

## Arguments

| Argument | Description |
| --- | --- |
| `day` | Numeric vector containing the day of observation. |
| `obs` | Numeric vector containing the observed weather quantity. |
| `forecast` | Numeric vector containing the forecasted weather quantity. |
| `id` | Vector with the id of the meteorological stations. |
| `coord1` | Vector containing the longitudes of the meteorological stations. |
| `coord2` | Vector containing the latitudes of the meteorological stations. |
| `cut.points` | Numeric vector containing the cutpoints used for variogram binning. |
| `max.dist` | A numerical value giving the upper bound for the distance considered in the variogram computation. |
| `nbins` | A numerical value giving the number of bins for variogram binning. If both `cut.points` and `nbins` are entered, the entry for `nbins` is ignored and the vector with the cutpoints is used for variogram binning instead. |
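The argument descriptions suggest that the six data vectors are aligned record by record, one entry per station and day, as in the columns of the `slp` dataset used in the examples. The following is a minimal sketch of that layout with a basic consistency check. All values are made up, and the check is only an illustration, not something the function is documented to perform.

```r
# Hypothetical records: 3 stations, station "B" missing on day 2
day      <- c(1, 1, 1, 2, 2)
id       <- c("A", "B", "C", "A", "C")
coord1   <- c(-122.3, -121.5, -123.0, -122.3, -123.0)  # longitudes
coord2   <- c(47.6, 46.9, 48.1, 47.6, 48.1)            # latitudes
obs      <- c(1012.1, 1010.4, 1013.0, 1009.8, 1011.2)
forecast <- c(1011.5, 1011.0, 1012.2, 1010.5, 1010.9)

# All vectors should have one entry per record
inputs <- list(day, obs, forecast, id, coord1, coord2)
same.length <- length(unique(lengths(inputs))) == 1

# Each station id should map to a single coordinate pair
coords.per.id <- tapply(paste(coord1, coord2), id,
                        function(z) length(unique(z)))
```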

## Details

The function includes bias-correction; it regresses the forecasts on the observed weather quantity and computes the residuals. The empirical variogram of the residuals is then calculated in two steps. First, for each day, the function determines the distances between all pairs of stations that were observed on that day and computes, within each bin, the sum of the squared differences in the residuals. Second, these sums are averaged over time, with weights for each bin given by the sum over time of the number of pairs of stations within the bin.
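The accumulation described above can be sketched in base R on toy residuals. This is an illustration under stated assumptions, not the package's implementation: the coordinates are planar and distances are Euclidean, the residuals and bin edges are made up, and the weighting is read as dividing each bin's total sum of squares by twice its total number of pairs.

```r
# Hypothetical toy inputs: 4 stations on a plane, residuals for 3 days
coords <- cbind(x = c(0, 1, 0, 3), y = c(0, 0, 2, 4))
set.seed(42)
res <- matrix(rnorm(12), nrow = 3, ncol = 4)  # rows = days, columns = stations
res[2, 4] <- NA                               # station 4 not observed on day 2

cut.points <- c(0, 1.5, 3, 6)                 # illustrative bin edges
n.bins <- length(cut.points) - 1

# All station pairs (i < j), their distances, and the bin each pair falls in
pair.idx  <- t(combn(ncol(res), 2))
pair.dist <- sqrt(rowSums((coords[pair.idx[, 1], ] - coords[pair.idx[, 2], ])^2))
pair.bin  <- cut(pair.dist, breaks = cut.points, labels = FALSE,
                 include.lowest = TRUE)

sum.sq       <- numeric(n.bins)  # sum of squared differences per bin
number.pairs <- numeric(n.bins)  # number of pairs per bin, summed over days

for (d in seq_len(nrow(res))) {
  sq.diff <- (res[d, pair.idx[, 1]] - res[d, pair.idx[, 2]])^2
  ok <- !is.na(sq.diff) & !is.na(pair.bin)    # both stations observed that day
  for (k in seq_len(n.bins)) {
    in.bin <- ok & pair.bin == k
    sum.sq[k]       <- sum.sq[k] + sum(sq.diff[in.bin])
    number.pairs[k] <- number.pairs[k] + sum(in.bin)
  }
}

# Pair-weighted average over time of the per-bin sums
empir.variog <- sum.sq / (2 * number.pairs)
```

Pairs involving a station with a missing residual on a given day are skipped for that day only, which matches the requirement that both stations be observed on the same day.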

The formula used is:

$$\gamma(h) = \sum_{d} \frac{1}{2N_{(h,d)}} \left( \sum_{i} \left( Y(x_{i}+h,d) - Y(x_{i},d) \right)^2 \right)$$

where $\gamma(h)$ is the empirical variogram at distance $h$, $N_{(h,d)}$ is the number of pairs of stations that have been recorded at day $d$ and whose distance is equal to $h$, and $Y(x_{i}+h,d)$ and $Y(x_{i},d)$ are, respectively, the values of the residuals on day $d$ at stations located at $x_{i}+h$ and $x_{i}$. Variogram binning is ignored in this formula.

### Defaults

If the vector with the cutpoints is not specified, the cutpoints are determined so that there are nbins bins with approximately the same number of pairs per bin.

If both the vector with the cutpoints and the number of bins, nbins, are unspecified, the function by default determines the cutpoints so that there are 300 bins with approximately the same number of pairs per bin. If both the vector with the cutpoints and the number of bins are provided, the entry for the number of bins is ignored and the vector with the cutpoints is used for variogram binning.
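One way to obtain bins with approximately equal numbers of pairs is to place the cutpoints at evenly spaced quantiles of the pairwise distances. The sketch below illustrates that idea only; it assumes planar coordinates, Euclidean distances, and a single day of data, and it is not claimed to be how the package chooses its cutpoints.

```r
set.seed(7)
coords <- cbind(runif(30), runif(30))   # hypothetical planar coordinates
d <- as.vector(dist(coords))            # all pairwise distances (30 * 29 / 2)

nbins <- 5                              # small value for illustration
cut.points <- quantile(d, probs = seq(0, 1, length.out = nbins + 1),
                       names = FALSE)

# Count how many pairs fall into each bin
counts <- table(cut(d, breaks = cut.points, include.lowest = TRUE))
```

With continuous distances and no ties, the bin counts differ by at most a small amount, which is the sense of "approximately the same number of pairs per bin".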

The default value for the maximum distance considered in the variogram computation is the 90th percentile of the distances between the stations.
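As a rough illustration of this default, the 90th percentile of the pairwise station distances can be computed with `quantile`. The coordinates below are hypothetical and planar, and the distance is Euclidean, which is an assumption made for the sketch and may differ from the distance the package uses for longitudes and latitudes.

```r
# Hypothetical planar station coordinates
coords <- cbind(c(0, 1, 0, 3), c(0, 0, 2, 4))
d <- as.vector(dist(coords))                    # 6 pairwise distances

# Upper bound on the distances that enter the variogram
max.dist <- quantile(d, probs = 0.9, names = FALSE)

# Pairs farther apart than max.dist would be left out
keep <- d <= max.dist
```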

## Value

The function returns a list with components given by:

| Component | Description |
| --- | --- |
| `mar.var` | Marginal variance of the forecast errors. |
| `bin.midpoints` | Numeric vector with midpoints of the bins used in the empirical variogram computation. |
| `number.pairs` | Numeric vector with the number of pairs per bin. |
| `empir.variog` | Numeric vector with the empirical variogram values. |

## Note

Depending on the data, the function might require substantial computing time. As a consequence, if the interest is in producing probabilistic weather forecasts and generating ensemble members, it is advised to save the output in a file and then use the Variog.fit and Field.sim functions.
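A minimal sketch of saving and reloading the output with base R's `saveRDS` and `readRDS` follows. The list here is a stand-in with the documented component names and made-up numbers, and the file name is hypothetical; in practice the object would be the result of a call to `Emp.variog`.

```r
# Stand-in for the list returned by Emp.variog (values are made up)
variogram <- list(mar.var       = 1.2,
                  bin.midpoints = c(50, 150, 250),
                  number.pairs  = c(120, 118, 121),
                  empir.variog  = c(0.4, 0.7, 0.9))

# Hypothetical file location; a temporary directory is used for the sketch
out.file <- file.path(tempdir(), "emp_variog.rds")
saveRDS(variogram, out.file)

# Later session: reload instead of recomputing the variogram
restored <- readRDS(out.file)
```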

## Author(s)

Berrocal, V. J. (veroberrocal@gmail.com), Raftery, A. E., Gneiting, T., Gel, Y.

## References

Gel, Y., Raftery, A. E., Gneiting, T. (2004). Calibrated probabilistic mesoscale weather field forecasting: The Geostatistical Output Perturbation (GOP) method (with discussion). Journal of the American Statistical Association, Vol. 99 (467), 575–583.

Cressie, N. A. C. (1993). Statistics for Spatial Data (revised ed.). Wiley: New York.

## See Also

`EmpDir.variog` for the directional empirical variogram of forecast errors averaged over time, `avg.variog` and `avg.variog.dir` for, respectively, the empirical and directional empirical variogram of a random variable averaged over time, and `Variog.fit` for estimation of parameters in a parametric variogram model.

## Examples

    library(ProbForecastGOP)

    ## Loading data
    data(slp)
    day <- slp$date.obs
    id <- slp$id.stat
    coord1 <- slp$lon.stat
    coord2 <- slp$lat.stat
    obs <- slp$obs
    forecast <- slp$forecast

    ## Computing variogram
    ## No specified cutpoints, no specified maximum distance
    ## Default number of bins
    variogram <- Emp.variog(day=day, obs=obs, forecast=forecast, id=id,
      coord1=coord1, coord2=coord2, cut.points=NULL, max.dist=NULL, nbins=NULL)

    ## Plotting variogram
    plot(variogram$bin.midpoints, variogram$empir.variog, xlab="Distance",
      ylab="Semi-variance", main="Empirical variogram")

    ## Computing variogram
    ## Specified cutpoints, specified maximum distance
    ## Unspecified number of bins
    variogram <- Emp.variog(day=day, obs=obs, forecast=forecast, id=id, coord1=coord1,
      coord2=coord2, cut.points=seq(0,1000,by=5), max.dist=800, nbins=NULL)

    ## Plotting variogram
    plot(variogram$bin.midpoints, variogram$empir.variog, xlab="Distance",
      ylab="Semi-variance", main="Empirical variogram")