ebnm_scale_npmle: Set scale parameter for NPMLE and deconvolveR prior family

View source: R/grid_selection.R

ebnm_scale_npmle    R Documentation

Set scale parameter for NPMLE and deconvolveR prior family

Description

The default method for setting the scale parameter for functions ebnm_npmle and ebnm_deconvolver.

Usage

ebnm_scale_npmle(
  x,
  s,
  min_K = 3,
  max_K = 300,
  KLdiv_target = 1/length(x),
  pointmass = TRUE
)

Arguments

x

A vector of observations. Missing observations (NAs) are not allowed.

s

A vector of standard errors (or a scalar if all are equal). Standard errors may not be exactly zero, and missing standard errors are not allowed.

min_K

The minimum number of components K to include in the mixture of point masses used to approximate the nonparametric family of all distributions.

max_K

The maximum number of components K to include in the approximating mixture of point masses.

KLdiv_target

The desired bound \kappa on the KL-divergence from the solution obtained using the approximating mixture to the exact solution. More precisely, the scale parameter is set such that, given the exact MLE

\hat{g} := \mathrm{argmax}_{g \in G} L(g),

where G is the full nonparametric family, and given the MLE for the approximating family \tilde{G}

\tilde{g} := \mathrm{argmax}_{g \in \tilde{G}} L(g),

we have that

\mathrm{KL}(\hat{g} \ast N(0, s^2) \mid \tilde{g} \ast N(0, s^2)) \le \kappa,

where \ast N(0, s^2) denotes convolution with the normal error distribution (the derivation of the bound assumes homoskedastic observations). For details, see References below.
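To make the bounded quantity concrete, the following base R sketch evaluates the KL-divergence for a toy case. The toy case is an assumption of this example and is not taken from the reference: the exact MLE is a single point mass at a, the approximating MLE is a single point mass at a nearby grid point b, and all standard errors equal s. After convolution both are normal densities, so the divergence has the closed form (a - b)^2 / (2 s^2). The helper kl_convolved is hypothetical and is not part of ebnm; it does not reproduce how ebnm_scale_npmle chooses the scale.

```r
# Hypothetical helper (not part of ebnm): numerically evaluates
# KL( delta_a * N(0, s^2) | delta_b * N(0, s^2) ) for two point masses.
kl_convolved <- function(a, b, s) {
  integrand <- function(y) {
    log_p <- dnorm(y, mean = a, sd = s, log = TRUE)
    log_q <- dnorm(y, mean = b, sd = s, log = TRUE)
    exp(log_p) * (log_p - log_q)
  }
  # Integrate over a finite window that holds essentially all of the mass.
  lower <- min(a, b) - 10 * s
  upper <- max(a, b) + 10 * s
  integrate(integrand, lower, upper, rel.tol = 1e-8)$value
}

# Toy values (assumptions for illustration only).
a <- 0.30
b <- 0.27
s <- 1

kl_numeric <- kl_convolved(a, b, s)
kl_closed  <- (a - b)^2 / (2 * s^2)

# Compare against the default target 1 / length(x) for n observations.
n <- 1000
kappa <- 1 / n
meets_target <- kl_numeric <= kappa
```

In this toy setting the divergence grows with the squared distance between a and b relative to s, which is why a finer grid of point masses gives a tighter bound.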

pointmass

When the range of the data is so large that max_K point masses cannot provide a good approximation to the family of all distributions, then ebnm will instead use a mixture of normal distributions, with the standard deviation of each component equal to scale / 2. Setting pointmass = FALSE gives the default scale for this mixture of normal distributions.

To be exact, ebnm uses a mixture of normal distributions rather than a mixture of point masses when

\frac{\max(x) - \min(x)}{\min(s)} > 3 \cdot \mathrm{max\_K};

for a rationale, see References below. Note, however, that ebnm only uses a mixture of normal distributions when scale = "estimate"; if the scale parameter is set manually, then a mixture of point masses is used in all cases. To use a mixture of normal distributions with a manually set scale, pass an object created by the constructor function normalmix as the g_init argument of ebnm_npmle or ebnm_deconvolver.
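The criterion above can be checked directly in base R. The helper uses_normal_mixture below is hypothetical (it is not exported by ebnm) and simply evaluates the documented inequality. The final calls to ebnm_scale_npmle use only the arguments listed under Usage and run only if the ebnm package is installed; the simulated data are assumptions made for illustration.

```r
# Hypothetical helper (not part of ebnm): TRUE when the documented criterion
# says a mixture of normals would be used under scale = "estimate".
uses_normal_mixture <- function(x, s, max_K = 300) {
  (max(x) - min(x)) / min(s) > 3 * max_K
}

set.seed(1)
x_narrow <- rnorm(100)         # range of only a few standard errors
x_wide   <- c(x_narrow, 2000)  # one extreme observation widens the range

narrow_flag <- uses_normal_mixture(x_narrow, s = 1)
wide_flag   <- uses_normal_mixture(x_wide, s = 1)

# Default scales for the two kinds of mixture (requires the ebnm package).
if (requireNamespace("ebnm", quietly = TRUE)) {
  scale_pm   <- ebnm::ebnm_scale_npmle(x_wide, s = 1, pointmass = TRUE)
  scale_norm <- ebnm::ebnm_scale_npmle(x_wide, s = 1, pointmass = FALSE)
  print(c(pointmass = scale_pm, normal = scale_norm))
}
```

With the default max_K = 300 and s = 1, the threshold on the range of x is 900, so only x_wide satisfies the inequality.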

References

Jason Willwerscheid (2021). Empirical Bayes Matrix Factorization: Methods and Applications. University of Chicago, PhD dissertation.


ebnm documentation built on Oct. 13, 2023, 1:16 a.m.