Resolve equal values (a.k.a., tied values) of observations by randomizing locally according to a Gaussian distribution with a rather small variance.
x | a numeric vector containing the data.
sw | a positive number specifying the spread width of the randomization procedure; its default value is derived from the minimal gap between two distinct observed values.
The essential histogram (Li et al., 2016) is designed under the assumption that the underlying distribution function is continuous. Such an assumption is natural, as it is needed for a density with respect to the Lebesgue measure to exist. However, in practice, one also faces discrete distributions, whose distribution functions are piecewise constant and thus discontinuous. The function smData implements a simple idea for adapting the essential histogram to discrete data: the Dirac delta density is approximated by a thin Gaussian density, and the resulting approximation has a continuous distribution.
The function smData is called automatically whenever essHistogram is called. Note that smData only sorts the observations x if there are no tied values.
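The idea described above can be illustrated with a minimal base R sketch. It is not the implementation of smData: the helper name jitter_ties is hypothetical, and the default spread (one tenth of the minimal gap between distinct values) and the fallback used when all values are equal are assumptions made only for illustration.

```r
# Hypothetical helper illustrating Gaussian randomization of tied values.
# This is a sketch, NOT the code used by smData.
jitter_ties <- function(x, sw = NULL) {
  stopifnot(is.numeric(x), length(x) > 0)
  ux <- sort(unique(x))
  # No ties: only sort the observations.
  if (length(ux) == length(x)) {
    return(sort(x))
  }
  if (is.null(sw)) {
    # Assumed default: a fraction of the minimal gap between distinct values;
    # the fallback 1e-3 (all values equal) is also an assumption.
    sw <- if (length(ux) > 1) min(diff(ux)) / 10 else 1e-3
  }
  # Mark every observation that shares its value with another one.
  tied <- duplicated(x) | duplicated(x, fromLast = TRUE)
  y <- x
  # Replace each tied value by a draw from a thin Gaussian centred at it.
  y[tied] <- x[tied] + rnorm(sum(tied), mean = 0, sd = sw)
  sort(y)
}

set.seed(1)
x <- rpois(50, 5)      # discrete data with many ties
y <- jitter_ties(x)    # same length, sorted, ties resolved
```

Only the tied observations are perturbed, so untied values are left unchanged apart from sorting.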
A vector of length length(x) is returned, containing the modified observations with no tied values, sorted in increasing order.
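The properties of the returned value stated above can be checked with base R alone. The following sketch uses a hypothetical helper, check_smoothed, which is not part of the package; it tests that a candidate vector has the same length as the input and is strictly increasing (which implies both sortedness and the absence of ties).

```r
# Hypothetical checker for the documented properties of the return value.
check_smoothed <- function(original, smoothed) {
  length(smoothed) == length(original) &&
    !is.unsorted(smoothed, strictly = TRUE)  # strictly increasing => no ties
}

ok  <- check_smoothed(c(2, 2, 5), c(1.99, 2.01, 5))  # valid result
bad <- check_smoothed(c(2, 2, 5), c(2, 2, 5))        # ties remain
```

In practice, the second argument would be the output of smData applied to the first.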
Li, H., Munk, A., Sieling, H., and Walther, G. (2016). The essential histogram. arXiv:1612.07216.
# load the package providing smData and essHistogram (essHist on CRAN)
library(essHist)
# generate Poisson data (discrete)
set.seed(123)
n = 100 # number of observations
lambda = 5
x.dis = rpois(n, lambda)
# smooth discrete data
x.sm = smData(x.dis)
# compute the essential histogram
eh = essHistogram(x.sm, xname = "Poisson")