EL0 | R Documentation |
Empirical likelihood with counts to solve one-dimensional problems efficiently with Brent's root search algorithm.
Conducts an empirical likelihood ratio test of the hypothesis that the mean of z
is mu
.
The names of the elements in the returned list are consistent with the original R code in \insertCiteowen2017weightedsmoothemplik.
EL0(
z,
mu = NULL,
ct = NULL,
shift = NULL,
return.weights = FALSE,
SEL = FALSE,
weight.tolerance = NULL,
boundary.tolerance = 1e-09,
trunc.to = 0,
chull.fail = c("taylor", "wald", "adjusted", "balanced", "none"),
uniroot.control = list(),
verbose = FALSE
)
z |
Numeric data vector. |
mu |
Hypothesized mean of |
ct |
Numeric count variable with non-negative values that indicates the multiplicity of observations.
Can be fractional. Very small counts below the threshold |
shift |
The value to add in the denominator (useful in case there are extra Lagrange multipliers): 1 + lambda'Z + shift. |
return.weights |
Logical: if TRUE, individual EL weights are computed and returned. Setting this to FALSE gives huge memory savings in large data sets, especially when smoothing is used. |
SEL |
If |
weight.tolerance |
Weight tolerance for counts to improve numerical stability (similar to the ones in Art B. Owen's 2017 code, but adapting to the sample size). |
boundary.tolerance |
Relative tolerance for determining when the lambda is not an interior solution because it is too close to the boundary. Corresponds to a fraction of the interval range length. |
trunc.to |
Counts under |
chull.fail |
A character: what to do if the convex hull of |
uniroot.control |
A list passed to the |
verbose |
Logical: if |
This function provides the core functionality for univariate empirical likelihood. The technical details is given in \insertCitecosma2019inferencesmoothemplik, although the algorithm used in that paper is slower than the one provided by this function.
Since we know that the EL probabilities belong to (0, 1), the interval (bracket) for \lambda
search
can be determined in the spirit of formula (2.9) from \insertCiteowen2001empiricalsmoothemplik. Let
z^*_i := z_i - \mu
be the recentred observations.
p_i = c_i/N \cdot (1 + \lambda z^*_i + s)^{-1}
The probabilities are bounded from above: p_i<1
for all i, therefore,
c_i/N \cdot (1 + \lambda z^*_i + s)^{-1} < 1
c_i/N - 1 - s < \lambda z^*_i
Two cases: either z^*_i<0
, or z^*_i>0
(cases with z^*_i=0
are trivially excluded because they do not affect the EL). Then,
(c_i/N - 1 - s)/z^*_i > \lambda,\ \forall i: z^*_i<0
(c_i/N - 1 - s)/z^*_i < \lambda,\ \forall i: z^*_i>0
which defines the search bracket:
\lambda_{\min} := \max_{i: z^*_i>0} (c_i/N - 1 - s)/z^*_i
\lambda_{\max} := \min_{i: z^*_i<0} (c_i/N - 1 - s)/z^*_i
\lambda_{\min} < \lambda < \lambda_{\max}
(This derivation contains s, which is the extra shift that extends the
function to allow mixed conditional and unconditional estimation;
Owen's textbook formula corresponds to s = 0
.)
The actual tolerance of the lambda search in brentZero
is
2 |\lambda_{\max}| \epsilon_m + \mathrm{tol}/2
,
where tol
can be set in uniroot.control
and
\epsilon_m
is .Machine$double.eps
.
The sum of log-weights is maximised without Taylor expansion, forcing mu
to be inside
the convex hull of z
. If a violation is happening, consider using the chull.fail
argument
or switching to Euclidean likelihood via [EuL()].
A list with the following elements:
Logarithm of the empirical likelihood ratio.
The Lagrange multiplier.
Observation weights/probabilities (of the same length as z
).
TRUE
if the algorithm converged, FALSE
otherwise (usually means that mu
is not within the range of z
, i.e. the one-dimensional convex hull of z
).
The number of iterations used (from brentZero
).
The admissible interval for lambda (that is, yielding weights between 0 and 1).
The approximate estimated precision of lambda (from brentZero
).
The value of the derivative of the objective function w.r.t. lambda at the root (from brentZero
). Values > sqrt(.Machine$double.eps)
indicate convergence problems.
An integer indicating the reason of termination.
Character string describing the optimisation termination status.
[EL()]
# Figure 2.4 from Owen (2001) -- with a slightly different data point
earth <- c(
5.5, 5.61, 4.88, 5.07, 5.26, 5.55, 5.36, 5.29, 5.58, 5.65, 5.57, 5.53, 5.62, 5.29,
5.44, 5.34, 5.79, 5.1, 5.27, 5.39, 5.42, 5.47, 5.63, 5.34, 5.46, 5.3, 5.75, 5.68, 5.85
)
set.seed(1)
system.time(r1 <- replicate(40, EL(sample(earth, replace = TRUE), mu = 5.517)))
set.seed(1)
system.time(r2 <- replicate(40, EL0(sample(earth, replace = TRUE), mu = 5.517)))
plot(apply(r1, 2, "[[", "logelr"), apply(r1, 2, "[[", "logelr") - apply(r2, 2, "[[", "logelr"),
bty = "n", xlab = "log(ELR) computed via dampened Newthon method",
main = "Discrepancy between EL and EL0", ylab = "")
abline(h = 0, lty = 2)
# Handling the convex hull violation differently
EL0(1:9, chull.fail = "none")
EL0(1:9, chull.fail = "taylor")
EL0(1:9, chull.fail = "wald")
# Interpolation to well-defined branches outside the convex hull
mu.seq <- seq(-1, 7, 0.1)
wEL1 <- -2*sapply(mu.seq, function(m) EL0(1:9, mu = m, chull.fail = "none")$logelr)
wEL2 <- -2*sapply(mu.seq, function(m) EL0(1:9, mu = m, chull.fail = "taylor")$logelr)
wEL3 <- -2*sapply(mu.seq, function(m) EL0(1:9, mu = m, chull.fail = "wald")$logelr)
plot(mu.seq, wEL1)
lines(mu.seq, wEL2, col = 2)
lines(mu.seq, wEL3, col = 4)
# Warning: depending on the compiler, the discrepancy between EL and EL0
# can be one million (1) times larger than the machine epsilon despite both of them
# being written in pure R
# The results from Apple clang-1400.0.29.202 and Fortran GCC 12.2.0 are different from
# those obtained under Ubuntu 22.04.4 + GCC 11.4.0-1ubuntu1~22.04,
# Arch Linux 6.6.21 + GCC 14.1.1, and Windows Server 2022 + GCC 13.2.0
out1 <- EL(earth, mu = 5.517)[1:4]
out2 <- EL0(earth, mu = 5.517, return.weights = TRUE)[1:4]
print(c(out1$lam, out2$lam), 16)
# Value of lambda EL EL0
# aarch64-apple-darwin20 -1.5631313955?????? -1.5631313957?????
# Windows, Ubuntu, Arch -1.563131395492627 -1.563131395492627
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.