Density, cumulative distribution function, quantile function and random number generation for the extreme value mixture model with kernel density estimate for the bulk distribution between the thresholds and conditional GPD beyond the thresholds, with continuity at both of them. The parameters are the kernel bandwidth lambda, lower tail (threshold ul, GPD shape xil and tail fraction phiul) and upper tail (threshold ur, GPD shape xir and tail fraction phiur).
dgkgcon(x, kerncentres, lambda = NULL,
  ul = as.vector(quantile(kerncentres, 0.1)), xil = 0, phiul = TRUE,
  ur = as.vector(quantile(kerncentres, 0.9)), xir = 0, phiur = TRUE,
  bw = NULL, kernel = "gaussian", log = FALSE)

pgkgcon(q, kerncentres, lambda = NULL,
  ul = as.vector(quantile(kerncentres, 0.1)), xil = 0, phiul = TRUE,
  ur = as.vector(quantile(kerncentres, 0.9)), xir = 0, phiur = TRUE,
  bw = NULL, kernel = "gaussian", lower.tail = TRUE)

qgkgcon(p, kerncentres, lambda = NULL,
  ul = as.vector(quantile(kerncentres, 0.1)), xil = 0, phiul = TRUE,
  ur = as.vector(quantile(kerncentres, 0.9)), xir = 0, phiur = TRUE,
  bw = NULL, kernel = "gaussian", lower.tail = TRUE)

rgkgcon(n = 1, kerncentres, lambda = NULL,
  ul = as.vector(quantile(kerncentres, 0.1)), xil = 0, phiul = TRUE,
  ur = as.vector(quantile(kerncentres, 0.9)), xir = 0, phiur = TRUE,
  bw = NULL, kernel = "gaussian")
x | quantiles
kerncentres | kernel centres (typically sample data vector or scalar)
lambda | bandwidth for kernel (as half-width of kernel) or NULL
ul | lower tail threshold
xil | lower tail GPD shape parameter
phiul | probability of being below lower threshold [0, 1] or TRUE
ur | upper tail threshold
xir | upper tail GPD shape parameter
phiur | probability of being above upper threshold [0, 1] or TRUE
bw | bandwidth for kernel (as standard deviations of kernel) or NULL
kernel | kernel name (default "gaussian")
log | logical, if TRUE then log density
q | quantiles
lower.tail | logical, if FALSE then upper tail probabilities
p | cumulative probabilities
n | sample size (positive integer)
Extreme value mixture model combining a kernel density estimate (KDE) for the bulk between the thresholds with a conditional GPD beyond each threshold, constrained to be continuous at both thresholds.
The user can pre-specify phiul and phiur, permitting a parameterised value for the tail fractions φ_{ul} and φ_{ur}. Alternatively, when phiul=TRUE and phiur=TRUE the tail fractions are estimated as the tail fractions from the KDE bulk model.
The alternate bandwidth definitions are discussed in kernels, with lambda as the default. The bw specification is the same as used in the density function.
The possible kernels are also defined in kernels, with "gaussian" as the default choice.
Notice that the tail fractions cannot be 0 or 1, and the sum of the upper and lower tail fractions must satisfy phiul + phiur < 1, so the lower threshold must be less than the upper, ul < ur.
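As a rough illustration of why ul < ur is needed when the tail fractions come from the bulk model, the base R sketch below evaluates a Gaussian KDE distribution function at both default thresholds. The helper name kde_cdf is hypothetical, and this is only a sketch of the constraint, not the package's own input checking.

```r
set.seed(1)
kerncentres <- rnorm(200)
bw <- bw.nrd0(kerncentres)  # normal reference rule bandwidth

# Gaussian KDE distribution function H(x) at a single point (hypothetical helper)
kde_cdf <- function(x) mean(pnorm(x, kerncentres, bw))

# default thresholds from the usage section
ul <- as.vector(quantile(kerncentres, 0.1))
ur <- as.vector(quantile(kerncentres, 0.9))

# bulk-based tail fractions
phiul <- kde_cdf(ul)
phiur <- 1 - kde_cdf(ur)

# H is strictly increasing, so ul < ur implies phiul + phiur < 1
tail_sum <- phiul + phiur
valid <- phiul > 0 && phiur > 0 && tail_sum < 1
```

If the thresholds were swapped, H(ul) would exceed H(ur) and the two tail fractions would sum to more than 1, leaving no probability for the bulk.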
The cumulative distribution function has three components. When the tail fractions are defined by the KDE bulk model (phiul=TRUE and phiur=TRUE), the lower tail with tail fraction φ_{ul} = H(u_l) up to the lower threshold, x < u_l, is:
F(x) = H(u_l) [1 - G_l(x)]
where H(x) is the kernel density estimator cumulative distribution function (i.e. mean(pnorm(x, kerncentres, bw))) and G_l(x) is the conditional GPD cumulative distribution function with negated x value and threshold, i.e. pgpd(-x, -ul, sigmaul, xil, phiul). The KDE bulk model between the thresholds, u_l ≤ x ≤ u_r, is given by:
F(x) = H(x).
Above the upper threshold, x > u_r, the usual conditional GPD applies:
F(x) = H(u_r) + [1 - H(u_r)] G_r(x)
where G_r(x) is the GPD cumulative distribution function, i.e. pgpd(x, ur, sigmaur, xir, phiur).
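The three components can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package implementation: the helper names kde_cdf, gpd_cdf and gkg_cdf_sketch are hypothetical, a Gaussian kernel is assumed, and the GPD scales are passed in directly with arbitrary values because the distribution function joins up at the thresholds for any positive scale.

```r
# KDE distribution function H(x) with Gaussian kernels (hypothetical helper)
kde_cdf <- function(x, kerncentres, bw) {
  vapply(x, function(x0) mean(pnorm(x0, kerncentres, bw)), numeric(1))
}

# conditional GPD distribution function for values above threshold u
gpd_cdf <- function(x, u, sigma, xi) {
  z <- pmax(x - u, 0) / sigma
  if (abs(xi) < 1e-8) 1 - exp(-z) else 1 - pmax(1 + xi * z, 0)^(-1 / xi)
}

# three-component CDF with bulk-based tail fractions
gkg_cdf_sketch <- function(x, kerncentres, bw, ul, ur, sigmaul, sigmaur,
                           xil = 0, xir = 0) {
  Hul <- kde_cdf(ul, kerncentres, bw)
  Hur <- kde_cdf(ur, kerncentres, bw)
  Fx <- kde_cdf(x, kerncentres, bw)  # bulk: F(x) = H(x)
  lower <- x < ul
  upper <- x > ur
  # lower tail: GPD applied to the negated x value and threshold
  Fx[lower] <- Hul * (1 - gpd_cdf(-x[lower], -ul, sigmaul, xil))
  # upper tail: usual conditional GPD
  Fx[upper] <- Hur + (1 - Hur) * gpd_cdf(x[upper], ur, sigmaur, xir)
  Fx
}

set.seed(1)
kerncentres <- rnorm(200)
bw <- bw.nrd0(kerncentres)
ul <- as.vector(quantile(kerncentres, 0.1))
ur <- as.vector(quantile(kerncentres, 0.9))
xx <- seq(-6, 6, 0.01)
Fx <- gkg_cdf_sketch(xx, kerncentres, bw, ul, ur,
                     sigmaul = 1, sigmaur = 1, xil = 0.3, xir = 0.3)
```

Plotting Fx against xx should give a non-decreasing curve from near 0 to near 1 with no jump at either threshold.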
The cumulative distribution function for the pre-specified tail fractions φ_{ul} and φ_{ur} is more complicated. The unconditional GPD is used for the lower tail, x < u_l:
F(x) = φ_{ul} [1 - G_l(x)].
The KDE bulk model between the thresholds, u_l ≤ x ≤ u_r, is given by:
F(x) = φ_{ul} + (1 - φ_{ul} - φ_{ur}) (H(x) - H(u_l)) / (H(u_r) - H(u_l)).
Above the upper threshold, x > u_r, the usual conditional GPD applies:
F(x) = (1 - φ_{ur}) + φ_{ur} G_r(x)
Notice that these definitions are equivalent when φ_{ul} = H(u_l) and φ_{ur} = 1 - H(u_r).
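The rescaled bulk component and the stated equivalence can be checked numerically with a short base R sketch. The function names H and bulk_cdf are hypothetical and a Gaussian kernel is assumed; this is an illustration of the formula above rather than the package's code.

```r
set.seed(1)
kerncentres <- rnorm(200)
bw <- bw.nrd0(kerncentres)

# Gaussian KDE distribution function H(x) (hypothetical helper)
H <- function(x) {
  vapply(x, function(x0) mean(pnorm(x0, kerncentres, bw)), numeric(1))
}

ul <- as.vector(quantile(kerncentres, 0.1))
ur <- as.vector(quantile(kerncentres, 0.9))

# bulk CDF rescaled to carry probability 1 - phiul - phiur between the thresholds
bulk_cdf <- function(x, phiul, phiur) {
  phiul + (1 - phiul - phiur) * (H(x) - H(ul)) / (H(ur) - H(ul))
}

xb <- seq(ul, ur, length.out = 50)

# pre-specified tail fractions: F(ul) = phiul and F(ur) = 1 - phiur
F_fixed <- bulk_cdf(xb, phiul = 0.15, phiur = 0.15)

# bulk-based tail fractions: the rescaling drops out and F(x) = H(x)
F_bulk <- bulk_cdf(xb, phiul = H(ul), phiur = 1 - H(ur))
```

Here F_fixed runs from phiul at the lower threshold to 1 - phiur at the upper threshold, while F_bulk agrees with H(xb) up to floating point error.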
The continuity constraint at ur means that:
φ_{ur} g_r(u_r) = (1 - φ_{ul} - φ_{ur}) h(u_r) / (H(u_r) - H(u_l)).
Since the GPD density at its threshold is g_r(u_r) = 1/σ_{ur}, rearranging gives the GPD scale parameter sigmaur:
σ_{ur} = φ_{ur} (H(u_r) - H(u_l)) / [h(u_r) (1 - φ_{ul} - φ_{ur})]
where h(x), g_l(x) and g_r(x) are the KDE and conditional GPD density functions for the lower and upper tail respectively. In the special case where the tail fraction is defined by the bulk model this reduces to
σ_{ur} = [1 - H(u_r)] / h(u_r).
The continuity constraint at ul means that:
φ_{ul} g_l(u_l) = (1 - φ_{ul} - φ_{ur}) h(u_l) / (H(u_r) - H(u_l)).
Since g_l(u_l) = 1/σ_{ul}, the GPD scale parameter sigmaul is replaced by:
σ_{ul} = φ_{ul} (H(u_r) - H(u_l)) / [h(u_l) (1 - φ_{ul} - φ_{ur})].
In the special case where the tail fraction is defined by the bulk model this reduces to
σ_{ul} = H(u_l) / h(u_l).
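The scale formulas at both thresholds can be evaluated directly. The base R sketch below assumes a Gaussian kernel and uses hypothetical helper names (H, h, bulk_scale); it computes both GPD scales for pre-specified tail fractions and then confirms that the general expressions collapse to the bulk-based special cases.

```r
set.seed(1)
kerncentres <- rnorm(500)
bw <- bw.nrd0(kerncentres)

# Gaussian KDE distribution and density functions at a single point
H <- function(x) mean(pnorm(x, kerncentres, bw))
h <- function(x) mean(dnorm(x, kerncentres, bw))

ul <- as.vector(quantile(kerncentres, 0.1))
ur <- as.vector(quantile(kerncentres, 0.9))

# pre-specified tail fractions
phiul <- 0.15
phiur <- 0.15

# factor rescaling the KDE density between the thresholds
bulk_scale <- (1 - phiul - phiur) / (H(ur) - H(ul))

# GPD scales implied by continuity of the density at each threshold
sigmaur <- phiur / (h(ur) * bulk_scale)
sigmaul <- phiul / (h(ul) * bulk_scale)

# bulk-based special case: phiul = H(ul) and phiur = 1 - H(ur)
sigmaur_bulk <- (1 - H(ur)) / h(ur)
sigmaul_bulk <- H(ul) / h(ul)

# the general formulas evaluated at the bulk-based tail fractions
bulk_scale_b <- (1 - H(ul) - (1 - H(ur))) / (H(ur) - H(ul))  # equals 1
sigmaur_check <- (1 - H(ur)) / (h(ur) * bulk_scale_b)
sigmaul_check <- H(ul) / (h(ul) * bulk_scale_b)
```

Because the GPD density equals 1/σ at its threshold for any shape, phiur / sigmaur and phiul / sigmaul match the rescaled KDE density at ur and ul respectively.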
If no bandwidth is provided (lambda=NULL and bw=NULL) then the normal reference rule is used, via the bw.nrd0 function, which is consistent with the density function. At least two kernel centres must be provided as the variance needs to be estimated.
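The consistency with density, and the need for at least two kernel centres, can be seen with base R alone. The variable names below are illustrative only.

```r
set.seed(1)
kerncentres <- rnorm(100)

# normal reference rule bandwidth
bw_default <- bw.nrd0(kerncentres)

# density() uses the same rule when bw is left at its default
bw_density <- density(kerncentres)$bw
same_bw <- isTRUE(all.equal(bw_default, bw_density))

# a single kernel centre is not enough to estimate the variance
single_centre <- tryCatch(bw.nrd0(1.5), error = function(e) "error")
```

With a single kernel centre bw.nrd0 stops with an error, so in that case a bandwidth has to be supplied explicitly through lambda or bw.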
See gpd for details of the GPD upper tail component and dkden for details of the KDE bulk component.
dgkgcon gives the density, pgkgcon gives the cumulative distribution function, qgkgcon gives the quantile function and rgkgcon gives a random sample.
Based on code by Anna MacDonald produced for MATLAB.
Unlike most of the other extreme value mixture model functions, the gkgcon functions have not been vectorised as this is not appropriate. The main inputs (x, p or q) must be either a scalar or a vector, which also defines the output length. The kerncentres can also be a scalar or vector.
The kernel centres kerncentres can either be a single datapoint or a vector of data. The kernel centres (kerncentres) and the locations at which to evaluate the density (x) and cumulative distribution function (q) would usually be different.
Default values are provided for all inputs, except for the fundamentals kerncentres, x, q and p. The default sample size for rgkgcon is 1.
Missing (NA) and Not-a-Number (NaN) values in x, p and q are passed through as is and infinite values are set to NA. None of these are permitted for the parameters or kernel centres.
Due to symmetry, the lower tail can be described by GPD by negating the quantiles.
Error checking of the inputs (e.g. invalid probabilities) is carried out and will either stop or give warning message as appropriate.
Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz.
http://en.wikipedia.org/wiki/Kernel_density_estimation
http://en.wikipedia.org/wiki/Generalized_Pareto_distribution
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.
Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.
MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.
Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.
kernels, kfun, density, bw.nrd0 and dkde in ks package.
Other kdengpdcon: bckdengpdcon, fbckdengpdcon, fgkgcon, fkdengpdcon, fkdengpd, kdengpdcon, kdengpd
Other gkg: fgkgcon, fgkg, fkdengpd, gkg, kdengpd, kden
Other gkgcon: fgkgcon, fgkg, fkdengpdcon, gkg, kdengpdcon
Other bckdengpdcon: bckdengpdcon, bckdengpd, bckden, fbckdengpdcon, fbckdengpd, fbckden, fkdengpdcon, kdengpdcon
Other fgkgcon: fgkgcon
## Not run: 
set.seed(1)
par(mfrow = c(2, 2))

kerncentres = rnorm(1000, 0, 1)
x = rgkgcon(1000, kerncentres, phiul = 0.15, phiur = 0.15)
xx = seq(-6, 6, 0.01)
hist(x, breaks = 100, freq = FALSE, xlim = c(-6, 6))
lines(xx, dgkgcon(xx, kerncentres, phiul = 0.15, phiur = 0.15))

# three tail behaviours
plot(xx, pgkgcon(xx, kerncentres), type = "l")
lines(xx, pgkgcon(xx, kerncentres, xil = 0.3, xir = 0.3), col = "red")
lines(xx, pgkgcon(xx, kerncentres, xil = -0.3, xir = -0.3), col = "blue")
legend("topleft", paste("Symmetric xil=xir=", c(0, 0.3, -0.3)),
  col = c("black", "red", "blue"), lty = 1)

# asymmetric tail behaviours
x = rgkgcon(1000, kerncentres, xil = -0.3, phiul = 0.1, xir = 0.3, phiur = 0.1)
xx = seq(-6, 6, 0.01)
hist(x, breaks = 100, freq = FALSE, xlim = c(-6, 6))
lines(xx, dgkgcon(xx, kerncentres, xil = -0.3, phiul = 0.1, xir = 0.3, phiur = 0.1))

plot(xx, dgkgcon(xx, kerncentres, xil = -0.3, phiul = 0.2, xir = 0.3, phiur = 0.2),
  type = "l", ylim = c(0, 0.4))
lines(xx, dgkgcon(xx, kerncentres, xil = -0.3, phiul = 0.3, xir = 0.3, phiur = 0.3),
  col = "red")
lines(xx, dgkgcon(xx, kerncentres, xil = -0.3, phiul = TRUE, xir = 0.3, phiur = TRUE),
  col = "blue")
legend("topleft", c("phiul = phiur = 0.2", "phiul = phiur = 0.3", "Bulk Tail Fraction"),
  col = c("black", "red", "blue"), lty = 1)

## End(Not run)