As different terms of use and policies apply to the different datasets stored in the ECOMS User Data Gateway, a fine-grained user authorization scheme has been implemented using the THREDDS Administration Panel (TAP), which allows user registration and data-access authorization for the ECOMS partners (the EUPORIAS, SPECS and NACLIM projects), for which this document was initially conceived.

In all cases, dataset access is conditional on acceptance of the particular terms and conditions of use. Further instructions on TAP registration are provided here.

`ecomsUDG.Raccess` package

Package `ecomsUDG.Raccess` is available in a public GitHub repository. The recommended installation procedure is to use the `install_github` command from the `devtools` R package. More details on the package installation process, as well as some advice regarding the most common installation problems, can be found at the ECOMS-UDG wiki.

```
devtools::install_github(c("SantanderMetGroup/downscaleR.java",
                           "SantanderMetGroup/downscaleR",
                           "SantanderMetGroup/ecomsUDG.Raccess"))
```

Note that apart from the `ecomsUDG.Raccess` package, two more dependencies are installed. `downscaleR.java` is used internally, containing the netCDF Java API used by the other packages, but the user never explicitly calls any of its functions. Furthermore, the package `downscaleR` provides some utilities used internally by the `ecomsUDG.Raccess` package, plus other utilities that we will use explicitly in the following examples (plotting, interpolation, ...).

Once a valid username and password are obtained from the TAP, providing authorization for the required datasets, login can be performed from R using the `loginECOMS_UDG` function (see `help(loginECOMS_UDG)` for details in the case of proxy connections).

```
library(ecomsUDG.Raccess)
loginECOMS_UDG(username = "myUser", password = "999a")
```

Data download is straightforward using the `loadECOMS` function. Argument names are intuitive, and the function has been designed with the needs of the downscaling and impact research communities in mind, making it simple to obtain spatio-temporal slices for particular spatial domains, forecast times, seasons and years.

Suppose we are interested in obtaining data from the ECMWF's System4 seasonal forecasting model of 15 members[^1] (`dataset = "System4_seasonal_15"`). In particular, in this example we will retrieve maximum daily surface temperature (`var = "tasmax"`) for boreal summer (JJA, `season = 6:8`), for a rectangular domain centered on the Iberian Peninsula and France (`lonLim = c(-10,15)` and `latLim = c(35,50)`), for the period 1991-2000 (`years = 1991:2000`), considering the first 9 ensemble members (`members = 1:9`) and a lead-month 2 forecast[^2] (`leadMonth = 2`). We will illustrate the verification of these data against the observational gridded dataset WATCH Forcing Dataset-ERA-Interim (WFDEI, `dataset = "WFDEI"`), also available via the ECOMS-UDG.

[^1]: More specific worked examples for different cases are presented in the examples section of the ECOMS wiki.

[^2]: See this link for details on the configuration of this particular dataset.

```
tx.forecast <- loadECOMS(dataset = "System4_seasonal_15", var = "tasmax",
                         members = 1:9, lonLim = c(-10, 15), latLim = c(35, 50),
                         season = 6:8, years = 1991:2000, leadMonth = 2)
## [2015-01-19 14:54:01] Defining homogeneization parameters for variable "tasmax"
## [2015-01-19 14:54:01] Defining geo-location parameters
## [2015-01-19 14:54:01] Defining initialization time parameters
## [2015-01-19 14:54:05] Retrieving data subset ...
## [2015-01-19 14:55:09] Done
```

```
load("s4.Rdata")
```

The loaded data can be quickly inspected using the `plotMeanField` tool[^3]. We can plot a map of the temporal mean (mean maximum summer temperature for 1991-2000) either as a multi-member mean (argument `multi.member = FALSE`) or considering each member separately (`multi.member = TRUE`), as in this example:

[^3]: Loading times may differ significantly depending on various factors. More details here.

```
plotMeanField(tx.forecast, multi.member = TRUE)
```

Similarly, we next load the reference observation data from the gridded dataset WFDEI. Note that in this case the argument referring to the forecast time (`leadMonth`) is omitted, as the data loaded are not a forecast.

```
tx.obs <- loadECOMS(dataset = "WFDEI", var = "tasmax", lonLim = c(-10, 15),
                    latLim = c(35, 50), season = 6:8, years = 1991:2000)
## [2015-01-19 14:59:03] Defining homogeneization parameters for variable "tasmax"
## [2015-01-19 14:59:03] Defining geo-location parameters
## [2015-01-19 14:59:03] Defining time selection parameters
## [2015-01-19 14:59:04] Retrieving data subset ...
## [2015-01-19 14:59:05] Done
```

```
load("wfdei.Rdata")
```

This is the map (temporal mean) of the observed reference data we want to use for validation:

```
plotMeanField(tx.obs)
```

`easyVerification` package

Similar to installing `ecomsUDG.Raccess`, you can install the `easyVerification` package, which provides verification scores and the functionality to compute these scores on large datasets. Before installing `easyVerification`, please install the `SpecsVerification` package from CRAN, to make use of the validation metrics supplied with `SpecsVerification`.

```
install.packages("SpecsVerification")
devtools::install_github("MeteoSwiss/easyVerification", build_vignettes = TRUE)
```

To find out more about the functionality of the `easyVerification` package, please read the help pages of its functions, the vignette (`vignette("easyVerification")`), or the README on GitHub.

After installing and loading the `easyVerification` package, we compute a few forecast scores and skill scores to illustrate how `easyVerification` can be used with seasonal forecast data downloaded from the ECOMS-UDG. To be able to validate the forecasts, we first have to interpolate either the forecasts or the observations so that both are on a common grid. We interpolate the observations to the grid of the forecasts using bilinear interpolation.

```
tx.obsintp <- interpGridData(tx.obs, new.grid = getGrid(tx.forecast), method = "bilinear")
```
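For intuition, bilinear interpolation estimates the value at a target point as a distance-weighted average of the four surrounding source grid points. The following self-contained sketch (the `bilinear` helper and its toy inputs are illustrative, not part of any of the packages used here) shows the computation for a single point:

```r
bilinear <- function(x, y, x1, x2, y1, y2, f11, f21, f12, f22) {
  ## Weight each corner value by the area of the opposite sub-rectangle
  (f11 * (x2 - x) * (y2 - y) +
   f21 * (x - x1) * (y2 - y) +
   f12 * (x2 - x) * (y - y1) +
   f22 * (x - x1) * (y - y1)) / ((x2 - x1) * (y2 - y1))
}

## At the centre of a unit cell the result is the mean of the four corners
bilinear(0.5, 0.5, x1 = 0, x2 = 1, y1 = 0, y2 = 1,
         f11 = 10, f21 = 12, f12 = 14, f22 = 16)  # 13
```

`interpGridData` applies this kind of weighting for every cell of the target grid.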

Before we go on, we compute 3-monthly averages of the forecasts and observations for validation of seasonal average daily maximum temperature.

```
annualmean <- function(x) {
  xout <- x
  is.time <- which(attr(x$Data, 'dimensions') == 'time')
  isnot.time <- which(attr(x$Data, 'dimensions') != 'time')
  years <- getYearsAsINDEX(x)
  xout$Data <- aperm(apply(x$Data, isnot.time, tapply, years, mean),
                     seq(along = dim(x$Data))[c(is.time, isnot.time)])
  xout$Dates$start <- tapply(x$Dates$start, years, min)
  xout$Dates$end <- tapply(x$Dates$end, years, max)
  return(xout)
}
mn.tx.forecast <- annualmean(tx.forecast)
mn.tx.obsintp <- annualmean(tx.obsintp)
```

Now we are ready to compute validation scores on the 3-monthly mean daily maximum temperature forecasts. We start by analysing the mean bias of the forecasts.

```
suppressPackageStartupMessages({
  library(easyVerification)
  library(fields)
})
bias <- veriApply("EnsMe", fcst = mn.tx.forecast$Data, obs = mn.tx.obsintp$Data,
                  ensdim = 1, tdim = 2)
bias.breaks <- pretty(c(bias, -bias), 50)
ncols <- length(bias.breaks) - 1
bias.col <- c(hcl(240, l = seq(20, 99, length = ncols/2), c = seq(70, 30, length = ncols/2)),
              hcl(10, l = seq(99, 20, length = ncols/2), c = seq(30, 70, length = ncols/2)))
```

```
image.plot(mn.tx.obsintp$xyCoords$x, mn.tx.obsintp$xyCoords$y, t(bias),
           breaks = bias.breaks, col = bias.col,
           xlab = 'longitude', ylab = 'latitude', las = 1)
map(add = T, lwd = 2, col = 'darkgrey')
```
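Conceptually, the mean bias ("EnsMe") at a single grid point is simply the error of the ensemble-mean forecast averaged over the verification years. The following self-contained sketch uses toy numbers (illustrative only, not easyVerification internals):

```r
## Toy example: 3 years x 3 members at one grid point (values in deg C)
fcst <- matrix(c(15.0, 15.4, 14.8,   # year 1 ensemble
                 16.1, 15.9, 16.3,   # year 2 ensemble
                 14.5, 14.7, 14.3),  # year 3 ensemble
               nrow = 3, byrow = TRUE)
obs <- c(15.5, 16.0, 14.0)           # verifying observations
## Mean bias: ensemble-mean error averaged over the years
bias.pt <- mean(rowMeans(fcst) - obs)
bias.pt  # about 0.056 deg C
```

`veriApply` repeats this computation for every grid point of the spatial fields.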

Next, we compute the correlation between the ensemble mean and verifying observations.

```
corr <- veriApply("EnsCorr", fcst = mn.tx.forecast$Data, obs = mn.tx.obsintp$Data,
                  ensdim = 1, tdim = 2)
corr.breaks <- seq(-1, 1, 0.05)
ncols <- length(corr.breaks) - 1
corr.col <- c(hcl(240, l = seq(20, 99, length = ncols/2), c = seq(70, 30, length = ncols/2)),
              hcl(10, l = seq(99, 20, length = ncols/2), c = seq(30, 70, length = ncols/2)))
```

```
image.plot(mn.tx.obsintp$xyCoords$x, mn.tx.obsintp$xyCoords$y, t(corr),
           breaks = corr.breaks, col = corr.col,
           xlab = 'longitude', ylab = 'latitude', las = 1)
map(add = T, lwd = 2, col = 'darkgrey')
box()
```

We find that the ensemble mean summer forecasts for 1991-2000 correlate well with the verifying observations over most of the western Mediterranean, but the forecasts do not skillfully represent year-to-year variability over Switzerland, Austria and southern Germany.

We continue with a probabilistic forecast verification metric that measures how well the forecasts are able to discriminate between varying observations. This generalized discrimination score (also referred to as the two-alternatives-forced-choice score) was introduced by Mason and Weigel [-@Mason2009]. In `easyVerification`, the generalized discrimination score for continuous ensemble forecasts is included as described in Weigel and Mason [-@Weigel2011].

```
gds <- veriApply("Ens2AFC", fcst = mn.tx.forecast$Data, obs = mn.tx.obsintp$Data,
                 ensdim = 1, tdim = 2)
```

This score ranges from 0 to 1, where 1 denotes perfect resolution, and 0.5 is no resolution.

```
image.plot(mn.tx.obsintp$xyCoords$x, mn.tx.obsintp$xyCoords$y, t(gds),
           breaks = corr.breaks/2 + 0.5, col = corr.col,
           xlab = 'longitude', ylab = 'latitude', las = 1)
map(add = T, lwd = 2, col = 'darkgrey')
box()
```

Again, we find that the System4 forecasts for summer maximum temperature are more skillful in the south-western part of the domain, and that one is better off using climatological forecasts rather than the System4 forecasts in the north-eastern part of the domain.

The final skill score we analyse is the ranked probability skill score (RPSS). Here we use the RPSS for tercile forecasts, that is, probability forecasts for the three categories colder than average, average, and warmer than average. In order to convert observations and forecasts into probabilities for the three categories, we have to supply an additional argument `prob` to the `veriApply` function with the quantile boundaries of the categories, as shown below.

```
rpss <- veriApply("EnsRpss", fcst = mn.tx.forecast$Data, obs = mn.tx.obsintp$Data,
                  prob = c(1/3, 2/3), ensdim = 1, tdim = 2)
```
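To illustrate what `prob = c(1/3, 2/3)` implies, the following self-contained sketch (toy numbers; illustrative only, not the easyVerification internals) converts a single ensemble forecast into probabilities of the three tercile categories by counting the fraction of members falling into each category defined by the climatological terciles:

```r
ens <- c(14.2, 15.1, 13.8, 16.0, 15.5, 14.9, 15.8, 13.5, 16.3)  # toy 9-member ensemble
clim <- 12:17                                   # toy climatological record
breaks <- quantile(clim, probs = c(1/3, 2/3))   # tercile category boundaries
categ <- cut(ens, c(-Inf, breaks, Inf),
             labels = c("below", "normal", "above"))
prob <- as.vector(table(categ)) / length(ens)   # forecast category probabilities
prob  # 1/9, 4/9, 4/9
```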

Please note that along with the RPSS, the function `EnsRpss` (and thus `veriApply` as well) also returns the standard error of the RPSS. This allows us to mark where the forecasts have significant skill.

```
image.plot(mn.tx.obsintp$xyCoords$x, mn.tx.obsintp$xyCoords$y, t(rpss$rpss),
           breaks = corr.breaks, col = corr.col,
           xlab = 'longitude', ylab = 'latitude', las = 1)
map(add = T, lwd = 2, col = 'darkgrey')
sig.i <- rpss$rpss > rpss$rpss.sigma * qnorm(0.95)
lons <- rep(mn.tx.obsintp$xyCoords$x, each = length(mn.tx.obsintp$xyCoords$y))
lats <- rep(mn.tx.obsintp$xyCoords$y, length(mn.tx.obsintp$xyCoords$x))
points(lons[sig.i], lats[sig.i], pch = 16, cex = 0.5)
box()
```

This vignette illustrates how to download seasonal forecast data from the ECOMS User Data Gateway using the `ecomsUDG.Raccess` R package, and how to validate such forecasts using verification scores from the `easyVerification` R package. More information on both packages can be found in the respective GitHub repositories (`ecomsUDG.Raccess` and `easyVerification`) and on the ECOMS-UDG wiki.
