EL0: Uni-variate empirical likelihood via direct lambda search

View source: R/elfunctions.R


Description

Empirical likelihood with counts to solve one-dimensional problems efficiently with Brent's root-finding algorithm. Conducts an empirical likelihood ratio test of the hypothesis that the mean of z is mu. The names of the elements in the returned list are consistent with the original R code in Owen (2017).

Usage

EL0(
  z,
  mu = NULL,
  ct = NULL,
  shift = NULL,
  return.weights = FALSE,
  SEL = FALSE,
  weight.tolerance = NULL,
  boundary.tolerance = 1e-09,
  trunc.to = 0,
  chull.fail = c("taylor", "wald", "adjusted", "balanced", "none"),
  uniroot.control = list(),
  verbose = FALSE
)

Arguments

z

Numeric data vector.

mu

Hypothesized mean of z in the moment condition.

ct

Numeric count variable with non-negative values that indicates the multiplicity of observations. Can be fractional. Very small counts below the threshold weight.tolerance are zeroed.

shift

The value to add in the denominator (useful in case there are extra Lagrange multipliers): 1 + lambda'Z + shift.

return.weights

Logical: if TRUE, individual EL weights are computed and returned. Setting this to FALSE gives huge memory savings in large data sets, especially when smoothing is used.

SEL

If FALSE, the boundaries for the lambda search are based on the total sum of counts, as in vanilla empirical likelihood, following formula (2.9) in Owen (2001); otherwise, they follow Cosma et al. (2019, p. 170, the topmost formula).

weight.tolerance

Weight tolerance for counts to improve numerical stability (similar to the one in Art B. Owen's 2017 code, but adapted to the sample size).

boundary.tolerance

Relative tolerance for determining when lambda is not an interior solution because it is too close to the boundary. Corresponds to a fraction of the length of the search interval.

trunc.to

Counts below weight.tolerance will be set to this value. In most cases, setting this to 0 or to weight.tolerance is a viable solution to the zero-denominator problem.

chull.fail

A character string: what to do if the convex hull of z does not contain mu (i.e. the spanning condition does not hold). "taylor" creates a Taylor approximation of the log-ELR function near the ends of the sample. "wald" smoothly transitions from the log-ELR function into -0.5 * the Wald statistic for the weighted mean of z. "adjusted" invokes the method of Chen et al. (2008), where an extra observation is added to ensure that the convex hull contains the mean, and "balanced" calls the method of Emerson and Owen (2009) and Liu and Chen (2010) with two extra points.

uniroot.control

A list of control parameters passed to brentZero.

verbose

Logical: if TRUE, prints warnings.

Details

This function provides the core functionality for univariate empirical likelihood. The technical details are given in Cosma et al. (2019), although the algorithm used in that paper is slower than the one provided by this function.

Since we know that the EL probabilities belong to (0, 1), the interval (bracket) for the \lambda search can be determined in the spirit of formula (2.9) from Owen (2001). Let z^*_i := z_i - \mu be the recentred observations. Then,

p_i = c_i/N \cdot (1 + \lambda z^*_i + s)^{-1}

The probabilities are bounded from above: p_i<1 for all i, therefore,

c_i/N \cdot (1 + \lambda z^*_i + s)^{-1} < 1

c_i/N - 1 - s < \lambda z^*_i

Two cases: either z^*_i<0, or z^*_i>0 (cases with z^*_i=0 are trivially excluded because they do not affect the EL). Then,

(c_i/N - 1 - s)/z^*_i > \lambda,\ \forall i: z^*_i<0

(c_i/N - 1 - s)/z^*_i < \lambda,\ \forall i: z^*_i>0

which defines the search bracket:

\lambda_{\min} := \max_{i: z^*_i>0} (c_i/N - 1 - s)/z^*_i

\lambda_{\max} := \min_{i: z^*_i<0} (c_i/N - 1 - s)/z^*_i

\lambda_{\min} < \lambda < \lambda_{\max}

(This derivation contains s, which is the extra shift that extends the function to allow mixed conditional and unconditional estimation; Owen's textbook formula corresponds to s = 0.)
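As a sanity check, this bracket can be reproduced directly in base R. The sketch below is purely illustrative: the function lambda.bracket and its arguments are hypothetical and not part of the package.

lambda.bracket <- function(z, ct, mu, s = 0) {
  zstar <- z - mu                   # recentred observations z*_i
  N <- sum(ct)                      # total count
  b <- (ct / N - 1 - s) / zstar     # per-observation bounds on lambda
  c(lower = max(b[zstar > 0]), upper = min(b[zstar < 0]))
}
lambda.bracket(z = c(4.9, 5.3, 5.6, 5.8), ct = rep(1, 4), mu = 5.4)
# lower: maximum over i with z*_i > 0; upper: minimum over i with z*_i < 0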

The actual tolerance of the lambda search in brentZero is 2 |\lambda_{\max}| \epsilon_m + \mathrm{tol}/2, where tol can be set in uniroot.control and \epsilon_m is .Machine$double.eps.
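For instance, a tighter tolerance can be requested through the tol element of uniroot.control (the only element documented here); the call below is a sketch with arbitrary data.

EL0(z = c(4.9, 5.3, 5.6, 5.8), mu = 5.4, uniroot.control = list(tol = 1e-12))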

The sum of log-weights is maximised without Taylor expansion, which requires mu to lie inside the convex hull of z. If this condition is violated, consider using the chull.fail argument or switching to Euclidean likelihood via [EuL()].

Value

A list with the following elements:

logelr

Logarithm of the empirical likelihood ratio.

lam

The Lagrange multiplier.

wts

Observation weights/probabilities (of the same length as z).

converged

TRUE if the algorithm converged, FALSE otherwise (usually means that mu is not within the range of z, i.e. the one-dimensional convex hull of z).

iter

The number of iterations used (from brentZero).

bracket

The admissible interval for lambda (that is, yielding weights between 0 and 1).

estim.prec

The approximate estimated precision of lambda (from brentZero).

f.root

The value of the derivative of the objective function w.r.t. lambda at the root (from brentZero). Values > sqrt(.Machine$double.eps) indicate convergence problems.

exitcode

An integer indicating the reason for termination.

message

Character string describing the optimisation termination status.
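A small worked call illustrating how these elements fit together (the data are arbitrary; the chi-squared calibration of -2 * logelr is the standard Wilks-type approximation for a scalar mean):

res <- EL0(z = c(4.9, 5.3, 5.6, 5.8), mu = 5.4, return.weights = TRUE)
res$logelr      # log empirical likelihood ratio (non-positive)
res$lam         # Lagrange multiplier at the root
res$converged   # TRUE if the root search succeeded
sum(res$wts)    # weights/probabilities; should sum to approximately 1 at an interior solution
pchisq(-2 * res$logelr, df = 1, lower.tail = FALSE)  # approximate p-value for H0: mean = 5.4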

References

Chen, J., Variyath, A. M., and Abraham, B. (2008). Adjusted empirical likelihood and its properties. Journal of Computational and Graphical Statistics.

Cosma, A., Kostyrka, A. V., and Tripathi, G. (2019). Inference in conditional moment-restriction models when there is selection due to stratification. In The Econometrics of Complex Survey Data (Advances in Econometrics, vol. 39).

Emerson, S. C., and Owen, A. B. (2009). Calibration of the empirical likelihood method for a vector mean. Electronic Journal of Statistics.

Liu, Y., and Chen, J. (2010). Adjusted empirical likelihood with high-order precision. The Annals of Statistics.

Owen, A. B. (2001). Empirical Likelihood. Chapman & Hall/CRC.

Owen, A. B. (2017). R code for weighted empirical likelihood computation.

See Also

[EL()]

Examples

# Figure 2.4 from Owen (2001) -- with a slightly different data point
earth <- c(
  5.5, 5.61, 4.88, 5.07, 5.26, 5.55, 5.36, 5.29, 5.58, 5.65, 5.57, 5.53, 5.62, 5.29,
  5.44, 5.34, 5.79, 5.1, 5.27, 5.39, 5.42, 5.47, 5.63, 5.34, 5.46, 5.3, 5.75, 5.68, 5.85
)
set.seed(1)
system.time(r1 <- replicate(40, EL(sample(earth, replace = TRUE), mu = 5.517)))
set.seed(1)
system.time(r2 <- replicate(40, EL0(sample(earth, replace = TRUE), mu = 5.517)))
plot(apply(r1, 2, "[[", "logelr"), apply(r1, 2, "[[", "logelr") - apply(r2, 2, "[[", "logelr"),
     bty = "n", xlab = "log(ELR) computed via dampened Newthon method",
     main = "Discrepancy between EL and EL0", ylab = "")
abline(h = 0, lty = 2)

# Handling the convex hull violation differently
EL0(1:9, chull.fail = "none")
EL0(1:9, chull.fail = "taylor")
EL0(1:9, chull.fail = "wald")

# Interpolation to well-defined branches outside the convex hull
mu.seq <- seq(-1, 7, 0.1)
wEL1 <- -2*sapply(mu.seq, function(m) EL0(1:9, mu = m, chull.fail = "none")$logelr)
wEL2 <- -2*sapply(mu.seq, function(m) EL0(1:9, mu = m, chull.fail = "taylor")$logelr)
wEL3 <- -2*sapply(mu.seq, function(m) EL0(1:9, mu = m, chull.fail = "wald")$logelr)
plot(mu.seq, wEL1)
lines(mu.seq, wEL2, col = 2)
lines(mu.seq, wEL3, col = 4)

# Warning: depending on the compiler, the discrepancy between EL and EL0
# can be one million times larger than the machine epsilon despite both of them
# being written in pure R
# The results from Apple clang-1400.0.29.202 and Fortran GCC 12.2.0 are different from
# those obtained under Ubuntu 22.04.4 + GCC 11.4.0-1ubuntu1~22.04,
# Arch Linux 6.6.21 + GCC 14.1.1, and Windows Server 2022 + GCC 13.2.0
out1 <- EL(earth, mu = 5.517)[1:4]
out2 <- EL0(earth, mu = 5.517, return.weights = TRUE)[1:4]
print(c(out1$lam, out2$lam), 16)

# Value of lambda                                 EL                  EL0
# aarch64-apple-darwin20         -1.5631313955??????   -1.5631313957?????
# Windows, Ubuntu, Arch           -1.563131395492627   -1.563131395492627
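
# A hypothetical way (not part of the package examples) to express the EL/EL0
# gap on the current platform in units of the machine epsilon:
abs(out1$lam - out2$lam) / .Machine$double.eps
abs(out1$logelr - out2$logelr) / .Machine$double.eps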
