DataPreprocessing | R Documentation |
A collection and description of functions for data
preprocessing of extreme values. This includes tools
to separate data beyond a threshold value, to compute
blockwise data like block maxima, and to decluster
point process data.
The functions are:
blockMaxima | Block Maxima from a vector or a time series, |
findThreshold | Upper threshold for a given number of extremes, |
pointProcess | Peaks over Threshold from a vector or a time series, |
deCluster | Declusters clustered point process data. |
blockMaxima(x, block = c("monthly", "quarterly"), doplot = FALSE)
findThreshold(x, n = floor(0.05*length(as.vector(x))), doplot = FALSE)
pointProcess(x, u = quantile(x, 0.95), doplot = FALSE)
deCluster(x, run = 20, doplot = TRUE)
block |
the block size. A numeric value is interpreted as the number
of data values in each successive block. All the data are used,
so the last block may contain fewer values than the block size. |
doplot |
a logical value. Should the results be plotted? By
default FALSE, except for deCluster, where the default is TRUE. |
n |
a numeric value or vector giving the number of extremes above
the threshold. By default, floor(0.05*length(as.vector(x))), i.e. 5% of the observations, rounded down. |
run |
parameter to be used in the runs method; any two consecutive threshold exceedances separated by more than this number of observations/days are considered to belong to different clusters. |
u |
a numeric value giving the threshold level at which the data are to be truncated. By
default, the 95% quantile of the data, quantile(x, 0.95). |
x |
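a numeric data vector or time series from which the block maxima, thresholds, or threshold exceedances are determined. For deCluster, x is the point process data to be declustered. |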
Computing Block Maxima:
The function blockMaxima calculates block maxima from a vector
or a time series, whereas the function blocks is more general and
allows for the calculation of an arbitrary function FUN on blocks.
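As an illustration of the idea only, the following base-R sketch splits a vector into successive blocks of a fixed size and applies a function to each block. The helper name block_apply is hypothetical and is not part of the package; the package functions additionally handle time series and calendar blocks.

```r
# Base-R sketch of blockwise computation for a numeric block size.
# block_apply() is a hypothetical helper, not the package implementation.
block_apply <- function(x, block, FUN = max) {
  x <- as.vector(x)
  # Assign each observation to a successive block of length `block`;
  # the last block is shorter when length(x) is not a multiple of `block`.
  grp <- ceiling(seq_along(x) / block)
  as.vector(tapply(x, grp, FUN))
}

x <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
bmax <- block_apply(x, block = 4)             # maxima of values 1-4, 5-8, 9-10
bmin <- block_apply(x, block = 4, FUN = min)  # block minima instead
print(bmax)
print(bmin)
```

Here the third block holds only two values, which shows how a final incomplete block is still used.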
Finding Thresholds:
The function findThreshold finds a threshold so that a given
number of extremes lie above it. When the data are tied, a threshold is
found so that at least the specified number of extremes lie above it.
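The effect of ties can be seen in a small base-R sketch. It assumes that "lie above" means strictly greater than the threshold, and the helper name threshold_for is hypothetical; it is one plausible way to obtain such a threshold, not necessarily how findThreshold is implemented.

```r
# Base-R sketch: threshold with at least n values strictly above it.
# threshold_for() is a hypothetical helper, not the package implementation.
threshold_for <- function(x, n) {
  x_desc <- sort(as.vector(x), decreasing = TRUE)
  distinct <- unique(x_desc)               # distinct values, largest first
  idx <- match(x_desc[n], distinct)        # rank of the n-th largest value
  idx <- pmin(idx + 1, length(distinct))   # step down to the next smaller value
  distinct[idx]
}

x <- c(10, 8, 8, 8, 5, 3, 1)
u <- threshold_for(x, n = 2)
print(u)             # the next distinct value below the 2nd largest
print(sum(x > u))    # ties at 8 give more than the 2 requested exceedances
```

If n reaches the smallest observation, no smaller value exists to step down to, so this sketch returns the minimum itself and fewer than n values lie strictly above it.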
De-Clustering Point Processes:
The function deCluster declusters clustered point process
data so that the Poisson assumption is more tenable over a high threshold.
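The runs method described for the run argument can be sketched in base R as follows. The sketch assumes that each cluster is represented by its largest exceedance, a common convention that is not stated in this page, and the helper name decluster_runs and the toy data are hypothetical.

```r
# Base-R sketch of runs declustering.
# decluster_runs() is a hypothetical helper, not the package implementation.
decluster_runs <- function(pos, value, run) {
  # Start a new cluster whenever the gap to the previous exceedance
  # is larger than `run` observations.
  cluster <- cumsum(c(TRUE, diff(pos) > run))
  # Assumption: keep the largest exceedance of each cluster.
  keep <- unname(unlist(lapply(split(seq_along(pos), cluster),
                               function(i) i[which.max(value[i])])))
  data.frame(pos = pos[keep], value = value[keep])
}

pos   <- c(2, 3, 5, 40, 42, 90)            # positions of the exceedances
value <- c(1.2, 3.4, 2.0, 5.1, 4.8, 2.7)   # exceedance values
dc <- decluster_runs(pos, value, run = 20)
print(dc)
```

With run = 20 the six exceedances fall into three clusters, because only the gaps from 5 to 40 and from 42 to 90 exceed 20.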
blockMaxima returns a timeSeries object or a numeric vector of block maxima data.
findThreshold returns a numeric value or vector of suitable thresholds.
pointProcess returns a timeSeries object or a numeric vector of peaks over a threshold.
deCluster returns a timeSeries object or a numeric vector for the declustered point process.
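For a plain numeric vector, the peaks-over-threshold result can be pictured as the subset of observations above u. The base-R sketch below uses the default threshold from the usage, quantile(x, 0.95), on simulated data and assumes that only values strictly greater than u are kept; it does not call the package.

```r
# Base-R sketch of peaks over a threshold on simulated data.
# Assumption: exceedances are the values strictly greater than u.
set.seed(1)
x <- rexp(200)              # simulated positive data, for illustration only
u <- quantile(x, 0.95)      # default threshold shown in the usage
idx <- which(x > u)         # positions of the exceedances
exceed <- x[idx]            # the exceedance values themselves
print(length(exceed))
```

The positions in idx are what a declustering step would work on, since the runs method looks at the gaps between successive exceedances.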
Some of the functions were implemented from Alec Stephenson's
R-package evir, ported from Alexander McNeil's S library EVIS
(Extreme Values in S); some from Alec Stephenson's R-package ismev,
based on Stuart Coles' code from his book, An Introduction to
Statistical Modeling of Extreme Values; and some were written by
Diethelm Wuertz.
Coles, S. (2001); An Introduction to Statistical Modeling of Extreme Values, Springer.
Embrechts, P., Klueppelberg, C., Mikosch, T. (1997); Modelling Extremal Events, Springer.
## findThreshold -
# Thresholds giving (at least) 10, 50 and 100 exceedances for Danish data:
library(timeSeries)
library(fExtremes)  # provides the functions and data sets used below
x <- as.timeSeries(data(danishClaims))
findThreshold(x, n = c(10, 50, 100))
## blockMaxima -
# Block Maxima (Minima) for left tail of BMW log returns:
BMW <- as.timeSeries(data(bmwRet))
colnames(BMW) <- "BMW.RET"
head(BMW)
x <- blockMaxima( BMW, block = 65)
head(x)
## Not run:
y <- blockMaxima(-BMW, block = 65)
head(y)
y <- blockMaxima(-BMW, block = "monthly")
head(y)
## End(Not run)
## pointProcess -
# Return Values above threshold in negative BMW log-return data:
PP <- pointProcess(x = -BMW, u = quantile(as.vector(x), 0.75))
PP
nrow(PP)
## deCluster -
# Decluster the 200 exceedances of a particular
DC <- deCluster(x = PP, run = 15, doplot = TRUE)
DC
nrow(DC)