knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 3 )
The rSWHAP package contains a statistical model for Significant Wave Heights ($H_s$) using Seal Level Pressure and spatial information as covariates. This document describes how to install the package and how to fit the statistical model to a sample data set which is included. Several diagonostick tools, both numerical and visual, are implemented to state the goodness-of-fit of the statistical model.
The rSWHAP package depends on the following R-packages
These dependencies will be automatically installed when installing the rSWHAP package.
The most recent version of the rSWHAP package is hosted on a git repository at https://github.com/NorskRegnesentral/rSWHAP.git.
In order to install the rSWHAP package run the following command:
devtools::install_github("NorskRegnesentral/rSWHAP", build_vignettes = TRUE)
In order to load the package type:
library(rSWHAP)
A small example data set is included in the R-package. ERA-Interim reanalysis of significant wave height ($H_s$) and sea level pressure (SLP) over the area $53.25^{\circ}$ to $- 57.75{\circ}$ longitude and $54.00^{\circ}$ to $58.5^{\circ}$ latitude and for the period 2006 to 2015. The years 2006 - 2014 are used in this example to calibrate and evaluate different statistical models for $H_s$, and year 2015 is used for model evaluation. The data has spatial resolution of $0.75^{\circ} \times 0.75^{\circ}$ and temporal resolutions of six hours.
The table below lists the various variables in the data set.
Variable name Description ----------------- ---------------------------- SWH Significant Wave Height
SLP Sea Level Pressure
SLP.grad Gradient of Sea Level Pressure
latitudeSWH Latitude values of SWH
longitudeSWH Longitude values of SWH
latitudeSLP Latitude values of SLP
longitudeSWH Longitude values of SLP
time.all Time stamps of each ERA-Interim data point
When the R-package is loaded, the data can be loaded using
data("ERAInterim")
To see summary statistics of the SWH in the data:
summary(SWH)
For summary statistics of the SLP in the data:
summary(SLP)
Summary statistics for the gradient of the SLP:
summary(SLP.grad)
Create Fourier terms to use in the modeling. The Fourier terms are needed for modeling seasonality.
You need to specify the number of observations per year and the number of Fourier terms. Default values are 365.25*4 observations per year and 2 Fourier terms.
intercept.fourier = fourier(x = rep(1,length(time.all)))
Define the training set and the test set. In this example data from 2006 to 2014 is used for training and data from 2015 is used for testing (this is also default values).
training.test = split.data(years = years.all, trainingPeriod = 2006:2014, testPeriod = 2015)
Next we fit the model to the training data by estimating the model parameters and obtain predictive distributions. In the example below we have chosen to use latitude cell 4 and longitude cell 4 (the data is a 7x7 latitude longitude grid).
pred.dist = getPreddistr(SWH = SWH, SLP = SLP, SLP.grad = SLP.grad, latCell = 4, longCell = 4, neig = 2, na.thresh = 500, latSWH = latitudeSWH, lonSWH = longitudeSWH, latSLP = latitudeSLP, longSLP = longitudeSLP, intercept.fourier = intercept.fourier, maxlag = 10) # Extract the mean, standard deviation and the # estimated lambda parameter for the predictive # distribution. pred.mean = pred.dist$pred.mean pred.sd = pred.dist$pred.sd pred.lambda = pred.dist$pred.lambda print(pred.dist$fits)
There is a function to simulate from the predictive distribution.
predObs = rpred(n = 1000, mean = pred.mean, sd = pred.sd, lambda = pred.lambda) hist(predObs, col = "gray", xlab = "The predictive distribution", main = "Samples from the predictive distribution")
To explore the model fit we first create the vector of observed SWHs in the test period for the location of interest.
obs <- SWH[4, 4, training.test[[2]]]
One method to assess the goodness of fit of the predictive distribution fits the data is to calculate probability integral transform (PIT) values and plot a PIT histogram for the test set. PIT histograms graphically compare empirical probabilities from fitted models with a uniform distribution.
pit <- pBoxCox(obs, pred.mean, pred.sd, pred.lambda)
We can calculate the mean absolute error over the test period
mae = maeEst(obs = obs, mean = pred.mean, sd = pred.sd, lambda = pred.lambda)
We can calculate the root mean squared error over the test period
rmse = rmseEst(obs = obs, mean = pred.mean, sd = pred.sd, lambda = pred.lambda, Nsamples = 10000)
To calculate the log-score type:
logs = logsEst(obs = obs, mean = pred.mean, sd = pred.sd, lambda = pred.lambda, log = TRUE)
To calculate the reliability index (RIBX) use:
ribx = ribxEst(obs = obs, mean = pred.mean, sd = pred.sd, lambda = pred.lambda)
To calculate the Continuous Ranked Probability Score (CRPS) use:
crps = crpsEst(obs = obs, mean = pred.mean, sd = pred.sd, lambda = pred.lambda, Nsamples = 10000)
You can get a summary table containing all performance measures implemented:
perfM = perfMeasures(obs = obs, mean = pred.mean, sd = pred.sd, lambda = pred.lambda, Nsamples = 10000) #If you don't want to have the results of each printed to screen #but only interested in the numbers use print2screen = FALSE perfM = perfMeasures(obs = obs, mean = pred.mean, sd = pred.sd, lambda = pred.lambda, Nsamples = 10000, print2screen = FALSE) #Have a look at the values perfM
Plot the prediction and the observation in the first and last 100 time points of the test period
plotPred(obs = obs, t.period = c(1:100), mean = pred.mean, sd = pred.sd, lambda = pred.lambda) plotPred(obs = obs, t.period = c(1361:1460), mean = pred.mean, sd = pred.sd, lambda = pred.lambda)
Random predictive trajectories for 10 time points
rPlotPred(obs = obs, t.period = c(40:49), mean = pred.mean, sd = pred.sd, lambda = pred.lambda, n.random = 10)
Learn correlation from previous timepoints (last 100 time points in test period)
rCorr(obs = obs, t.period = c(40:49), mean = pred.mean, sd = pred.sd, lambda = pred.lambda, n.random = 10, training.test = training.test, SWHobs = SWH[4,4,])
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.