DataPreprocessing: Extremes Data Preprocessing
In fExtremes: Rmetrics - Modelling Extreme Events in Finance

Description

A collection and description of functions for data preprocessing of extreme values. This includes tools to separate data beyond a threshold value, to compute blockwise data like block maxima, and to decluster point process data.

The functions are:

`blockMaxima`	Block Maxima from a vector or a time series,
`findThreshold`	Upper threshold for a given number of extremes,
`pointProcess`	Peaks over Threshold from a vector or a time series,
`deCluster`	Declusters clustered point process data.

Usage

blockMaxima(x, block = c("monthly", "quarterly"), doplot = FALSE)
findThreshold(x, n = floor(0.05*length(as.vector(x))), doplot = FALSE)
pointProcess(x, u = quantile(x, 0.95), doplot = FALSE)
deCluster(x, run = 20, doplot = TRUE)

Arguments

`block`	the block size. A numeric value is interpreted as the number of data values in each successive block. All the data are used, so the last block may contain fewer than `block` observations. If `x` has a `times` attribute containing the times/dates of each observation (as an object of class `"POSIXct"`, or one that can be converted to that class; see `as.POSIXct`), then `block` may instead be a character value naming a calendar period; the usage signature lists `"monthly"` and `"quarterly"`. By default, monthly blocks from daily data are assumed.
`doplot`	a logical value. Should the results be plotted? The default is `FALSE` for `blockMaxima`, `findThreshold` and `pointProcess`, and `TRUE` for `deCluster`.
`n`	a numeric value or vector giving the number of extremes above the threshold. By default, `n` is set to the integer part of 5% of the number of observations in `x`.
`run`	parameter to be used in the runs method; any two consecutive threshold exceedances separated by more than this number of observations/days are considered to belong to different clusters.
`u`	a numeric value giving the threshold level above which data are retained. By default, the 95% quantile of the data, `u=quantile(x,0.95)`.
`x`	a numeric data vector or time series from which `findThreshold`, `blockMaxima` and `pointProcess` determine the threshold values, block maxima and peaks over threshold. For the function `deCluster` the argument `x` represents a numeric vector of threshold exceedances with a `times` attribute, which should be a numeric vector containing either the indices or the times/dates of each exceedance (if times/dates, the attribute should be an object of class `"POSIXct"` or an object that can be converted to that class; see `as.POSIXct`).
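
The default expressions for `n` and `u` shown in the usage can be evaluated directly in base R. The sketch below uses simulated data (an assumption for illustration, not a package data set) and the hypothetical variable names `n_default`, `u_default` and `exceedances`. It only illustrates the arithmetic of the defaults and does not call `findThreshold` or `pointProcess`.

```r
# Simulated data stand in for a real series (assumption for illustration).
set.seed(1)
x <- rnorm(1000)

# Default for `n` in findThreshold: 5% of the observations, rounded down.
n_default <- floor(0.05 * length(as.vector(x)))

# Default for `u` in pointProcess: the empirical 95% quantile.
u_default <- quantile(x, 0.95)

# Observations lying strictly above the default threshold.
exceedances <- x[x > u_default]
```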

Details

Computing Block Maxima:

The function blockMaxima calculates block maxima from a vector or a time series, whereas the function blocks is more general and allows for the calculation of an arbitrary function FUN on blocks.
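
For a numeric block size, the idea can be sketched in base R as follows. The function name `block_maxima_sketch` is hypothetical, and the code is an illustration of the technique rather than the package's implementation. It handles plain numeric vectors only and ignores calendar-based blocks.

```r
# Maximum of each successive block of `block` observations.
# The last block may be shorter if length(x) is not a multiple of `block`.
block_maxima_sketch <- function(x, block) {
  stopifnot(is.numeric(x), length(x) > 0, block >= 1)
  # Assign observations 1..block to block 1, the next `block` to block 2, etc.
  block_id <- ceiling(seq_along(x) / block)
  # Apply max() within each block and drop the array attributes.
  as.vector(tapply(x, block_id, max))
}

# Eight observations in blocks of three: the last block has two values.
bm <- block_maxima_sketch(c(3, 1, 4, 1, 5, 9, 2, 6), block = 3)
```

Replacing `max` with another summary function in the `tapply` call gives the more general blockwise calculation described above.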

Finding Thresholds:

The function findThreshold finds a threshold so that a given number of extremes lie above it. When the data contain ties, the threshold is chosen so that at least the specified number of extremes lie above it.
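
One way to obtain this behaviour is sketched below in base R. The function name `threshold_sketch` is hypothetical, and the code is not taken from the package. It is an assumption about how the tie handling could work: the threshold is set to the next distinct value below the n-th largest observation, so every observation tied with the n-th largest also counts as an exceedance.

```r
# Threshold such that at least `n` observations lie strictly above it.
# `n` may be a vector; one threshold is returned per element.
threshold_sketch <- function(x, n) {
  x <- sort(as.numeric(x), decreasing = TRUE)
  stopifnot(all(n >= 1), all(n <= length(x)))
  distinct <- unique(x)                   # distinct values, largest first
  pos <- match(x[n], distinct)            # position of the n-th largest value
  # Step down to the next distinct value. If the n-th largest value is
  # already the minimum, no lower value exists and the minimum is returned.
  pos <- pmin(pos + 1, length(distinct))
  distinct[pos]
}

claims <- c(1, 2, 3, 4, 5, 5, 6)          # made-up data with a tie at 5
u <- threshold_sketch(claims, n = 2)      # threshold for two exceedances
n_above <- sum(claims > u)                # exceedances actually obtained
```

Because of the tie at 5, asking for two exceedances yields three: the threshold drops to the next distinct value so that both tied observations lie above it.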

De-Clustering Point Processes:

The function deCluster declusters clustered point process data so that the Poisson assumption is more tenable over a high threshold.
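
The runs method described for the `run` argument can be sketched in base R as follows. The function name `decluster_sketch` and its separate `values`/`times` arguments are hypothetical, and the code is an illustration rather than the package's implementation. It assumes that each cluster is represented by its largest exceedance, and that `times` holds numeric indices.

```r
# Runs declustering: exceedances separated by more than `run` time units
# start a new cluster; each cluster is reduced to its largest exceedance.
decluster_sketch <- function(values, times, run) {
  stopifnot(length(values) == length(times), length(values) > 0)
  ord <- order(times)
  values <- values[ord]
  times <- times[ord]
  # A gap larger than `run` between consecutive exceedances opens a cluster.
  cluster_id <- cumsum(c(TRUE, diff(times) > run))
  # Index of the largest exceedance within each cluster.
  keep <- vapply(
    split(seq_along(values), cluster_id),
    function(idx) idx[which.max(values[idx])],
    integer(1)
  )
  keep <- unname(keep)
  list(values = values[keep], times = times[keep])
}

# Made-up exceedances at observation indices 1, 2, 3, 30, 31 and 80.
dc <- decluster_sketch(
  values = c(5, 7, 6, 4, 9, 3),
  times  = c(1, 2, 3, 30, 31, 80),
  run    = 20
)
```

With `run = 20`, the six exceedances fall into three clusters, and only the largest value of each is kept.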

Value

blockMaxima
returns a timeSeries object or a numeric vector of block maxima data.

findThreshold
returns a numeric value or vector of suitable thresholds.

pointProcess
returns a timeSeries object or a numeric vector of peaks over a threshold.

deCluster
returns a timeSeries object or a numeric vector for the declustered point process.

Author(s)

Some of the functions were adapted from Alec Stephenson's R package evir, a port of Alexander McNeil's S library EVIS (Extreme Values in S); some from Alec Stephenson's R package ismev, which is based on Stuart Coles' code from his book An Introduction to Statistical Modeling of Extreme Values; and some were written by Diethelm Wuertz.

References

Coles, S. (2001); An Introduction to Statistical Modeling of Extreme Values, Springer.

Embrechts, P., Klueppelberg, C., Mikosch, T. (1997); Modelling Extremal Events for Insurance and Finance, Springer.

Examples

 
## findThreshold -
# Thresholds giving (at least) 10, 50 and 100 exceedances for the Danish data:
library(fExtremes)
library(timeSeries)
x <- as.timeSeries(data(danishClaims))
findThreshold(x, n = c(10, 50, 100))

## blockMaxima -
# Block Maxima (Minima) for left tail of BMW log returns:
BMW <- as.timeSeries(data(bmwRet))
colnames(BMW) <- "BMW.RET"
head(BMW)
x <- blockMaxima(BMW, block = 65)
head(x)
## Not run: 
y <- blockMaxima(-BMW, block = 65)    
head(y) 
y <- blockMaxima(-BMW, block = "monthly")    
head(y)
## End(Not run)

## pointProcess -
# Return Values above threshold in negative BMW log-return data:
PP <- pointProcess(x = -BMW, u = quantile(as.vector(x), 0.75))
PP
nrow(PP)

## deCluster -
# Decluster the 200 exceedances of a particular  
DC <- deCluster(x = PP, run = 15, doplot = TRUE)
DC
nrow(DC)

fExtremes documentation built on Dec. 22, 2023, 3:01 a.m.