# Parameter estimation for the Generalized Pareto Distribution (GPD)

### Description

Fits exceedances above a chosen threshold to the Generalized Pareto model. Various estimation procedures can be used, including maximum likelihood, probability weighted moments, and maximum product spacing. It also allows generalized linear modeling of the parameters.

### Usage

1 2 3 4 5 |

### Arguments

`data` |
Data should be a numeric vector from the GPD. |

`threshold` |
A threshold value or vector of the same length as the data. |

`nextremes` |
Number of upper extremes to be used (either this or the threshold must be given, but not both). |

`npp` |
Length of each period (typically year). Is used in return level estimation. Defaults to 365. |

`method` |
Method of estimation - maximum likelihood (mle), maximum product spacing (mps), and probability weighted moments (pwm). Uses mle by default. For pwm, only the stationary model can be fit. |

`information` |
Whether standard errors should be calculated via observed or expected (default) information. For probability weighted moments, only expected information will be used if possible. For non-stationary models, only observed information is used. |

`scalevars, shapevars` |
A dataframe of covariates to use for modeling of the each parameter. Parameter intercepts are automatically handled by the function. Defaults to NULL for the stationary model. |

`scaleform, shapeform` |
An object of class ‘formula’ (or one that can be coerced into that class), specifying the model of each parameter. By default, assumes stationary (intercept only) model. See details. |

`scalelink, shapelink` |
A link function specifying the relationship between the covariates and each parameter. Defaults to the identity function. For the stationary model, only the identity link should be used. |

`start` |
Option to provide a set of starting parameters to optim; a vector of scale and shape, in that order. Otherwise, the routine attempts to find good starting parameters. See details. |

`opt` |
Optimization method to use with optim. |

`maxit` |
Number of iterations to use in optimization, passed to optim. Defaults to 10,000. |

`...` |
Additional arguments to pass to optim. |

### Details

The base code for finding probability weighted moments is taken from the R package evir. See citation.
In the stationary case (no covariates), starting parameters for mle and mps estimation are the probability weighted moment estimates.
In the case where covariates are used, the starting intercept parameters are the probability weighted moment estimates from the
stationary case and the parameters based on covariates are initially set to zero. For non-stationary parameters, the
first reported estimate refers to the intercept term. Covariates are centered and scaled automatically to speed up optimization,
and then transformed back to original scale.

Formulas for generalized linear modeling of the parameters should be given in the form '~ var1 + var2 + *\cdots*'. Essentially,
specification here is the same as would be if using function ‘lm’ for only the right hand side of the equation. Interactions,
polynomials, etc. can be handled as in the ‘formula’ class.

Intercept terms are automatically handled by the function. By default, the link functions are the identity function and
the covariate dependent scale parameter estimates are forced to be positive. For some link function *f(\cdot)* and for
example, scale parameter *σ*, the link is written as *σ = f(σ_1 x_1 + σ_2 x_2 + \cdots + σ_k x_k)*.

Maximum likelihood estimation and maximum product spacing estimation can be used in all cases. Probability weighted moments
can only be used for stationary models.

### Value

A class object ‘gpdFit’ describing the fit, including parameter estimates and standard errors.

### References

Pfaff, Bernhard, Alexander McNeil, and A. Stephenson. "evir: Extreme Values in R." R package version (2012): 1-7.

### Examples

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | ```
## Fit data using the three different estimation procedures
set.seed(7)
x <- rgpd(2000, loc = 0, scale = 2, shape = 0.2)
## Set threshold at 4
mle_fit <- gpdFit(x, threshold = 4, method = "mle")
pwm_fit <- gpdFit(x, threshold = 4, method = "pwm")
mps_fit <- gpdFit(x, threshold = 4, method = "mps")
## Look at the difference in parameter estimates and errors
mle_fit$par.ests
pwm_fit$par.ests
mps_fit$par.ests
mle_fit$par.ses
pwm_fit$par.ses
mps_fit$par.ses
## A linear trend in the scale parameter
set.seed(7)
n <- 300
x2 <- rgpd(n, loc = 0, scale = 1 + 1:n / 200, shape = 0)
covs <- as.data.frame(seq(1, n, 1))
names(covs) <- c("Trend1")
result1 <- gpdFit(x2, threshold = 0, scalevars = covs, scaleform = ~ Trend1)
## Show summary of estimates
result1
``` |