Description Usage Arguments Details Value Note Author(s) References See Also Examples
Maximum likelihood estimation for fitting the hybrid Pareto extreme value mixture model
1 2 3 4 5 6 |
x |
vector of sample data |
pvector |
vector of initial values of parameters
( |
std.err |
logical, should standard errors be calculated |
method |
optimisation method (see |
control |
optimisation control list (see |
finitelik |
logical, should log-likelihood return finite value for invalid parameters |
... |
optional inputs passed to |
nmean |
scalar normal mean |
nsd |
scalar normal standard deviation (positive) |
xi |
scalar shape parameter |
log |
logical, if |
The hybrid Pareto model is fitted to the entire dataset using maximum likelihood estimation. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.
The log-likelihood and negative log-likelihood are also provided for wider
usage, e.g. constructing profile likelihood functions. The parameter vector
pvector
must be specified in the negative log-likelihood
nlhpd
.
Log-likelihood calculations are carried out in
lhpd
, which takes parameters as inputs in
the same form as distribution functions. The negative log-likelihood is a
wrapper for lhpd
, designed towards making
it useable for optimisation (e.g. parameters are given a vector as first
input).
Missing values (NA
and NaN
) are assumed to be invalid data so are ignored,
which is inconsistent with the evd
library which assumes the
missing values are below the threshold.
The function lhpd
carries out the calculations
for the log-likelihood directly, which can be exponentiated to give actual
likelihood using (log=FALSE
).
The default optimisation algorithm is "BFGS", which requires a finite negative
log-likelihood function evaluation finitelik=TRUE
. For invalid
parameters, a zero likelihood is replaced with exp(-1e6)
. The "BFGS"
optimisation algorithms require finite values for likelihood, so any user
input for finitelik
will be overridden and set to finitelik=TRUE
if either of these optimisation methods is chosen.
It will display a warning for non-zero convergence result comes from
optim
function call.
If the hessian is of reduced rank then the variance covariance (from inverse hessian)
and standard error of parameters cannot be calculated, then by default
std.err=TRUE
and the function will stop. If you want the parameter estimates
even if the hessian is of reduced rank (e.g. in a simulation study) then
set std.err=FALSE
.
lhpd
gives (log-)likelihood and
nlhpd
gives the negative log-likelihood.
fhpd
returns a simple list with the following elements
call : | optim call |
x : | data vector x |
init : | pvector |
optim : | complete optim output |
mle : | vector of MLE of parameters |
cov : | variance-covariance matrix of MLE of parameters |
se : | vector of standard errors of MLE of parameters |
rate : | phiu to be consistent with evd |
nllh : | minimum negative log-likelihood |
n : | total sample size |
nmean : | MLE of normal mean |
nsd : | MLE of normal standard deviation |
u : | threshold (implicit from other parameters) |
sigmau : | MLE of GPD scale |
xi : | MLE of GPD shape |
phiu : | MLE of tail fraction (implied by 1/(1+pnorm(u,nmean,nsd)) ) |
The output list has some duplicate entries and repeats some of the inputs to both
provide similar items to those from fpot
and to make it
as useable as possible.
Unlike most of the distribution functions for the extreme value mixture models, the MLE fitting only permits single scalar values for each parameter. Only the data is a vector.
When pvector=NULL
then the initial values are calculated, type
fhpd
to see the default formulae used. The mixture model fitting can be
***extremely*** sensitive to the initial values, so you if you get a poor fit then
try some alternatives. Avoid setting the starting value for the shape parameter to
xi=0
as depending on the optimisation method it may be get stuck.
A default value for the tail fraction phiu=TRUE
is given.
The lhpd
also has the usual defaults for
the other parameters, but nlhpd
has no defaults.
Invalid parameter ranges will give 0
for likelihood, log(0)=-Inf
for
log-likelihood and -log(0)=Inf
for negative log-likelihood.
Infinite and missing sample values are dropped.
Error checking of the inputs is carried out and will either stop or give warning message as appropriate.
Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz
http://en.wikipedia.org/wiki/Normal_distribution
http://en.wikipedia.org/wiki/Generalized_Pareto_distribution
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Carreau, J. and Y. Bengio (2008). A hybrid Pareto model for asymmetric fat-tailed data: the univariate case. Extremes 12 (1), 53-76.
The condmixt package written by one of the original authors of the hybrid Pareto model (Carreau and Bengio, 2008) also has similar functions for the likelihood of the hybrid Pareto (hpareto.negloglike) and fitting (hpareto.fit).
Other hpd: fhpdcon
, hpdcon
,
hpd
Other hpdcon: fhpdcon
, hpdcon
,
hpd
Other normgpd: fgng
,
fitmnormgpd
, flognormgpd
,
fnormgpdcon
, fnormgpd
,
gngcon
, gng
,
hpdcon
, hpd
,
itmnormgpd
, lognormgpdcon
,
lognormgpd
, normgpdcon
,
normgpd
Other fhpd: hpd
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | ## Not run:
set.seed(1)
par(mfrow = c(1, 1))
x = rnorm(1000)
xx = seq(-4, 4, 0.01)
y = dnorm(xx)
# Hybrid Pareto provides reasonable fit for some asymmetric heavy upper tailed distributions
# but not for cases such as the normal distribution
fit = fhpd(x, std.err = FALSE)
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dhpd(xx, nmean, nsd, xi), col="red"))
abline(v = fit$u)
# Notice that if tail fraction is included a better fit is obtained
fit2 = fnormgpdcon(x, std.err = FALSE)
with(fit2, lines(xx, dnormgpdcon(xx, nmean, nsd, u, xi), col="blue"))
abline(v = fit2$u)
legend("topright", c("Standard Normal", "Hybrid Pareto", "Normal+GPD Continuous"),
col=c("black", "red", "blue"), lty = 1)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.