pareto_diags: Pareto smoothing diagnostics

View source: R/pareto_smooth.R

pareto_diagsR Documentation

Pareto smoothing diagnostics

Description

Compute diagnostics for Pareto smoothing the tail draws of x by replacing tail draws by order statistics of a generalized Pareto distribution fit to the tail(s).

Usage

pareto_diags(x, ...)

## Default S3 method:
pareto_diags(
  x,
  tail = c("both", "right", "left"),
  r_eff = NULL,
  ndraws_tail = NULL,
  verbose = FALSE,
  ...
)

## S3 method for class 'rvar'
pareto_diags(x, ...)

Arguments

x

(multiple options) One of:

  • A matrix of draws for a single variable (iterations x chains). See extract_variable_matrix().

  • An rvar.

...

Arguments passed to individual methods (if applicable).

tail

(string) The tail to diagnose/smooth:

  • "right": diagnose/smooth only the right (upper) tail

  • "left": diagnose/smooth only the left (lower) tail

  • "both": diagnose/smooth both tails and return the maximum k-hat value

The default is "both".

r_eff

(numeric) relative effective sample size estimate. If r_eff is omitted, it will be calculated assuming the draws are from MCMC.

ndraws_tail

(numeric) number of draws for the tail. If ndraws_tail is not specified, it will be calculated as ceiling(3 * sqrt(length(x) / r_eff)) if length(x) > 225 and length(x) / 5 otherwise (see Appendix H in Vehtari et al. (2022)).

verbose

(logical) Should diagnostic messages be printed? If TRUE, messages related to Pareto diagnostics will be printed. Default is FALSE.

Details

When the fitted Generalized Pareto Distribution is used to smooth the tail values and these smoothed values are used to compute expectations, the following diagnostics can give further information about the reliability of these estimates.

  • min_ss: Minimum sample size for reliable Pareto smoothed estimate. If the actual sample size is greater than min_ss, then Pareto smoothed estimates can be considered reliable. If the actual sample size is lower than min_ss, increasing the sample size might result in more reliable estimates. For further details, see Section 3.2.3, Equation 11 in Vehtari et al. (2022).

  • khat_threshold: Threshold below which k-hat values result in reliable Pareto smoothed estimates. The threshold is lower for smaller effective sample sizes. If k-hat is larger than the threshold, increasing the total sample size may improve reliability of estimates. For further details, see Section 3.2.4, Equation 13 in Vehtari et al. (2022).

  • convergence_rate: Relative convergence rate compared to the central limit theorem. Applicable only if the actual sample size is sufficiently large (greater than min_ss). The convergence rate tells the rate at which the variance of an estimate reduces when the sample size is increased, compared to the central limit theorem convergence rate. See Appendix B in Vehtari et al. (2022).

Value

List of Pareto smoothing diagnostics:

  • khat: estimated Pareto k shape parameter,

  • min_ss: minimum sample size for reliable Pareto smoothed estimate,

  • khat_threshold: khat-threshold for reliable Pareto smoothed estimate,

  • convergence_rate: Pareto smoothed estimate RMSE convergence rate.

References

Aki Vehtari, Daniel Simpson, Andrew Gelman, Yuling Yao and Jonah Gabry (2022). Pareto Smoothed Importance Sampling. arxiv:arXiv:1507.02646

Examples

mu <- extract_variable_matrix(example_draws(), "mu")
pareto_diags(mu)

d <- as_draws_rvars(example_draws("multi_normal"))
pareto_diags(d$Sigma)

posterior documentation built on Nov. 2, 2023, 5:56 p.m.