# avg.variog: Empirical variogram of a random variable averaged over time

In ProbForecastGOP: Probabilistic weather forecast using the GOP method

## Description

Calculates the empirical variogram of a random variable averaged over time.

## Usage

    avg.variog(day, coord1, coord2, id, variable, cut.points=NULL, max.dist=NULL, nbins=300)

## Arguments

| Argument | Description |
|---|---|
| day | numeric vector containing the day of observation. |
| coord1 | vector containing the longitudes of the meteorological stations. |
| coord2 | vector containing the latitudes of the meteorological stations. |
| id | vector with the id of the meteorological stations. |
| variable | numeric vector containing the variable for which the empirical variogram is to be computed. |
| cut.points | numeric vector containing the cutpoints used for variogram binning. |
| max.dist | a numerical value giving the upper bound for the distance considered in the variogram computation. |
| nbins | a numerical value giving the number of bins for variogram binning. If both cut.points and nbins are entered, the entry for nbins is ignored and the vector with the cutpoints is used for variogram binning instead. |

## Details

The empirical variogram of the given random variable is calculated by determining, for each day, the distance between all pairs of stations that have been observed on the same day, and by calculating, for each day, the sum of all the squared differences in the given random variable within each bin. These sums are then averaged over time, with weights for each bin given by the sum over time of the number of pairs of stations within the bin.

The formula used is:

$$\gamma(h) = \sum_{d} \frac{1}{2N_{(h,d)}} \left( \sum_{i} \left( Y(x_{i}+h,d) - Y(x_{i},d) \right)^2 \right)$$

where $\gamma(h)$ is the empirical variogram at distance $h$, $N_{(h,d)}$ is the number of pairs of stations that have been recorded at day $d$ and whose distance is equal to $h$, and $Y(x_{i}+h,d)$ and $Y(x_{i},d)$ are, respectively, the values of the given variable observed on day $d$ at stations located at $x_{i}+h$ and $x_{i}$. Variogram binning is ignored in this formula.
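The binned, time-averaged computation described in the first paragraph of this section can be sketched in base R. This is only an illustration, not the package's code: `pooled_variog` is a hypothetical helper, it assumes plain Euclidean distance on the two coordinates (the distance actually used by avg.variog is not stated here), and it follows the prose reading in which squared differences are summed over all days and divided by twice the total number of pairs in each bin.

```r
# Hypothetical helper (not part of ProbForecastGOP): binned empirical
# variogram pooled over days.
# Assumption: Euclidean distance on (coord1, coord2).
pooled_variog <- function(day, coord1, coord2, variable, cut.points) {
  n.bins  <- length(cut.points) - 1
  sum.sq  <- numeric(n.bins)  # per-bin sum of squared differences, all days
  n.pairs <- numeric(n.bins)  # per-bin number of station pairs, all days
  for (d in unique(day)) {
    idx <- which(day == d)
    if (length(idx) < 2) next                 # a single station forms no pair
    dists   <- as.matrix(dist(cbind(coord1[idx], coord2[idx])))
    sq.diff <- outer(variable[idx], variable[idx], "-")^2
    upper   <- upper.tri(dists)               # count each pair once
    bin <- cut(dists[upper], breaks = cut.points,
               include.lowest = TRUE, labels = FALSE)
    sq  <- sq.diff[upper]
    for (b in seq_len(n.bins)) {
      sel <- which(bin == b)                  # pairs outside all bins are NA and dropped
      sum.sq[b]  <- sum.sq[b] + sum(sq[sel])
      n.pairs[b] <- n.pairs[b] + length(sel)
    }
  }
  list(bin.midpoints = (head(cut.points, -1) + tail(cut.points, -1)) / 2,
       number.pairs  = n.pairs,
       empir.variog  = sum.sq / (2 * n.pairs))
}

# Toy data: three stations on a line, observed on two days.
toy <- pooled_variog(day      = c(1, 1, 1, 2, 2, 2),
                     coord1   = c(0, 1, 2, 0, 1, 2),
                     coord2   = rep(0, 6),
                     variable = c(1, 2, 4, 0, 0, 2),
                     cut.points = c(0, 1.5, 2.5))
toy$number.pairs  # 4 2
toy$empir.variog  # 1.125 3.25
```

In the toy data the first bin collects the four pairs at distance 1 (squared differences 1, 4, 0, 4) and the second bin the two pairs at distance 2 (squared differences 9 and 4), which gives 9/8 and 13/4.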

### Defaults

If the vector with the cutpoints is not specified, the cutpoints are determined so that there are nbins bins with approximately the same number of pairs per bin.

If both the vector with the cutpoints and the number of bins, nbins, are unspecified, the function by default determines the cutpoints so that there are 300 bins with approximately the same number of pairs per bin. If both the vector with the cutpoints and the number of bins are provided, the entry for the number of bins is ignored and the vector with the cutpoints is used for variogram binning.
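One way to obtain bins with roughly equal numbers of pairs is to place the cutpoints at evenly spaced quantiles of the pairwise distances. The sketch below illustrates that idea only; `equal_count_cuts` is a hypothetical helper and the simulated distances are made up, so it should not be read as the package's internal procedure.

```r
# Hypothetical helper (not part of ProbForecastGOP): cutpoints at evenly
# spaced quantiles, so each bin holds about the same number of pairs.
equal_count_cuts <- function(distances, nbins) {
  probs <- seq(0, 1, length.out = nbins + 1)
  unique(quantile(distances, probs = probs, names = FALSE))
}

set.seed(1)
pair.dist <- runif(1000, min = 0, max = 500)  # stand-in for pairwise station distances
cuts   <- equal_count_cuts(pair.dist, nbins = 10)
counts <- table(cut(pair.dist, breaks = cuts, include.lowest = TRUE))
counts  # roughly 100 pairs in each of the 10 bins
```

Equal-count bins are narrow where pairs are dense and wide where they are sparse, so every variogram value rests on a comparable amount of data.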

The default value for the maximum distance considered in the variogram computation is the 90th percentile of the distances between the stations.
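As a rough illustration of this default, the 90th percentile of the pairwise distances can be computed with `quantile()`. The station coordinates below are simulated, and Euclidean distance in degrees is an assumption made for brevity rather than the package's distance measure.

```r
set.seed(42)
lon <- runif(20, min = -125, max = -115)  # simulated station longitudes
lat <- runif(20, min = 42, max = 49)      # simulated station latitudes

# Assumption: Euclidean distance in degrees, used here only for illustration.
pair.dist <- as.vector(dist(cbind(lon, lat)))
max.dist  <- quantile(pair.dist, probs = 0.9, names = FALSE)

# About 90% of the station pairs fall within the cutoff.
share.kept <- mean(pair.dist <= max.dist)
```

Dropping the longest 10% of distances removes the bins that would otherwise be estimated from the few pairs at the edge of the domain.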

## Value

The function returns a list with components given by:

| Component | Description |
|---|---|
| mar.var | Marginal variance of the variable for which the empirical variogram is computed. |
| bin.midpoints | Numeric vector with midpoints of the bins used in the empirical variogram computation. |
| number.pairs | Numeric vector with the number of pairs per bin. |
| empir.variog | Numeric vector with the empirical variogram values. |
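A list with these components is easy to post-process. The sketch below uses a mock list with the documented component names and made-up values, and keeps only bins supported by a minimum number of pairs. The threshold `min.pairs` is an illustrative choice, not a package setting; such filtering is mainly relevant when cut.points is supplied by hand, since the default binning already balances the pair counts.

```r
# Mock result with the documented component names; the values are made up.
fit <- list(mar.var       = 4.2,
            bin.midpoints = c(10, 30, 50, 70),
            number.pairs  = c(120, 15, 98, 3),
            empir.variog  = c(0.8, 1.9, 2.6, 5.0))

min.pairs <- 30  # illustrative threshold, not a package default
ok <- fit$number.pairs >= min.pairs

reliable <- data.frame(distance     = fit$bin.midpoints[ok],
                       semivariance = fit$empir.variog[ok],
                       pairs        = fit$number.pairs[ok])
reliable  # keeps the bins at distances 10 and 50
```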

## Note

Depending on the data, the function might require substantial computing time.
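One reason is that the number of station pairs examined on each day grows quadratically with the number of stations, as this short calculation shows (the station counts are arbitrary examples):

```r
n.stations    <- c(50, 200, 1000)
pairs.per.day <- choose(n.stations, 2)  # n * (n - 1) / 2 pairs per day
pairs.per.day  # 1225 19900 499500
```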

## Author(s)

Berrocal, V. J. veroberrocal@gmail.com, Gel, Y., Raftery, A. E., Gneiting, T.

## References

Gel, Y., Raftery, A. E., Gneiting, T. (2004). Calibrated probabilistic mesoscale weather field forecasting: The Geostatistical Output Perturbation (GOP) method (with discussion). Journal of the American Statistical Association, Vol. 99 (467), 575–583.

Cressie, N. A. C. (1993). Statistics for Spatial Data (revised ed.). Wiley: New York.

## See Also

avg.variog.dir for the directional empirical variogram of a random variable averaged over time, Emp.variog and EmpDir.variog for, respectively, the empirical and directional empirical variograms of forecast errors averaged over time, and Variog.fit for estimating the parameters of a parametric variogram model.

## Examples

    library(ProbForecastGOP)

    ## Loading data
    data(slp)
    day <- slp$date.obs
    id <- slp$id.stat
    coord1 <- slp$lon.stat
    coord2 <- slp$lat.stat
    obs <- slp$obs
    forecast <- slp$forecast

    ## Computing variogram of the observed variable
    ## No specified cutpoints, no specified maximum distance
    ## Default number of bins
    variogram <- avg.variog(day=day, coord1=coord1, coord2=coord2, id=id, variable=obs,
                            cut.points=NULL, max.dist=NULL, nbins=NULL)

    ## Plotting variogram
    plot(variogram$bin.midpoints, variogram$empir.variog, xlab="Distance",
         ylab="Semi-variance", main="Empirical variogram")