Implementation of Pareto smoothed importance sampling, a method for stabilizing importance weights. For full details about the algorithm see Vehtari, Gelman and Gabry (2016a, 2016b). For diagnostics see the pareto-k-diagnostic page.

1 2 |

`lw` |
A matrix or vector of log weights. For computing LOO, |

`wcp` |
The proportion of importance weights to use for the generalized
Pareto fit. The |

`wtrunc` |
For truncating very large weights to |

`cores` |
The number of cores to use for parallelization. This can be set
for an entire R session by |

`llfun, llargs` |
See |

`...` |
Ignored when |

A named list with components `lw_smooth`

(modified log weights)
and `pareto_k`

(estimated generalized
Pareto shape parameter(s) k).

The distribution of the importance weights used in LOO may
have a long right tail. We use the empirical Bayes estimate of Zhang and
Stephens (2009) to fit a generalized Pareto distribution (gPd) to the tail
(20% largest importance ratios). By examining the shape parameter *k*
of the fitted gPd, we are able to obtain sample based estimates of the
existance of the moments (Koopman et al, 2009). This extends the diagnostic
approach of Peruggia (1997) and Epifani et al. (2008) to be used routinely
with importance-sampling LOO for any model with a factorizing likelihood.

Epifani et al. (2008) show that when estimating the leave-one-out
predictive density, the central limit theorem holds if the variance of the
weight distribution is finite. These results can be extended using the
generalized central limit theorem for stable distributions. Thus, even if
the variance of the importance weight distribution is infinite, if the mean
exists the estimate's accuracy improves when additional draws are obtained.
When the tail of the weight distribution is long, a direct use of
importance sampling is sensitive to one or few largest values. By fitting a
gPd to the upper tail of the importance weights we smooth these values. The
procedure (implemented in the `psislw`

function) goes as
follows:

Fit the gPd to the 20% largest importance ratios

*r_s*. The computation is done separately for each held-out data point*i*. In simulation experiments with thousands and tens of thousands of draws, we have found that the fit is not sensitive to the specific cutoff value (for a consistent estimation the proportion of the samples above the cutoff should get smaller when the number of draws increases).Stabilize the importance ratios by replacing the

*M*largest ratios by the expected values of the order statistics of the fitted gPd*G((z - 0.5)/M), z = 1,...,M,*where

*M*is the number of simulation draws used to fit the Pareto (in this case,*M = 0.2*S*) and*G*is the inverse-CDF of the gPd.To guarantee finite variance of the estimate, truncate the smoothed ratios with

*S^{3/4}\bar{w},*where

*\bar{w}*is the average of the smoothed weights.

The above steps must be performed for each data point *i*. The result is
a vector of weights *w_{i}^{s}, s = 1,...,S*, for each *i*, which in
general should be better behaved than the raw importance ratios
*r_{i}^{s}* from which they were constructed.

The results can be then combined to compute the desired LOO estimates.

Vehtari, A., Gelman, A., and Gabry, J. (2016a). Practical
Bayesian model evaluation using leave-one-out cross-validation and WAIC.
*Statistics and Computing*. Advance online publication.
doi:10.1007/s11222-016-9696-4.
(published
version, arXiv preprint).

Vehtari, A., Gelman, A., and Gabry, J. (2016b). Pareto smoothed importance sampling. arXiv preprint: http://arxiv.org/abs/1507.02646/

`pareto-k-diagnostic`

for PSIS diagnostics.

Questions? Problems? Suggestions? Tweet to @rdrrHQ or email at ian@mutexlabs.com.

Please suggest features or report bugs with the GitHub issue tracker.

All documentation is copyright its authors; we didn't write any of that.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.