smoothEmplik: Smoothed Empirical Likelihood function value

View source: R/selfunctions.R

Smoothed Empirical Likelihood function value

Description

Evaluates the smoothed empirical likelihood (SEL) function for a given moment function at a specified parameter value.

Usage

smoothEmplik(
  rho,
  theta,
  data,
  sel.weights = NULL,
  type = c("EL", "EuL", "EL0"),
  kernel.args = list(bw = NULL, kernel = "epanechnikov", order = 2, PIT = TRUE, sparse =
    TRUE),
  EL.args = list(chull.fail = "taylor", weight.tolerance = NULL),
  minus = FALSE,
  parallel = FALSE,
  cores = 1,
  chunks = NULL,
  sparse = FALSE,
  verbose = FALSE,
  bad.value = -Inf,
  attach.attributes = c("none", "all", "ELRs", "residuals", "lam", "nabla", "converged",
    "exitcode", "probabilities"),
  ...
)

Arguments

rho

The moment function that depends on the parameters and the data (and potentially other arguments passed via ...). Must return a numeric vector.

theta

The parameter (usually a vector) at which the moment function is evaluated.

data

A data object on which the moment function is computed.

sel.weights

Either a matrix of valid kernel smoothing weights with rows summing to one, or a function that computes the kernel weights based on the data argument and anything passed via .... A sketch of both forms is given after this list.

type

Character: "EL" for empirical likelihood, "EuL" for Euclidean likelihood, "EL0" for one-dimensional empirical likelihood. "EL0" is *strongly* recommended for one-dimensional moment functions because it is faster and more robust: it searches for the Lagrange multiplier directly and has fail-safe options for convex-hull failures.

kernel.args

A list of arguments passed to kernelWeights() if sel.weights is a function.

EL.args

A list of arguments passed to EL(), EL0(), or EuL().

minus

If TRUE, returns SEL multiplied by -1 (for optimisers that only minimise; see the minimisation example in the Examples section).

parallel

If TRUE, uses parallel::mclapply to speed up the computation.

cores

The number of cores used by parallel::mclapply.

chunks

The number of chunks into which the weight matrix is split for memory saving. One chunk is good for sample sizes of 2000 and below. If the number of chunks equals the number of observations, the smoothed likelihoods are computed in series, which saves memory but re-computes the kernel weights at every step of the loop, increasing CPU time. If parallel is TRUE, parallelisation occurs within each chunk. A memory-saving illustration appears at the end of the Examples section.

sparse

Logical: convert the weight matrix to a sparse one?

verbose

If TRUE, a progress bar is displayed to track the evaluation when partial or full memory saving is in place.

bad.value

Replace non-finite individual SEL values with this value. Useful when the optimiser cannot handle non-finite values (e.g. L-BFGS-B); illustrated in the Examples.

attach.attributes

If "none", returns just the sum of expected likelihoods; otherwise, attaches certain attributes for diagnostics: "ELRs" for expected likelihoods, "residuals" for the residuals (moment function values), "lam" for the Lagrange multipliers lambda in the EL problems, "nabla" for d/d(lambda)EL (should be close to zero because this must be true for any theta), "converged" for the convergence of #' individual EL problems, "exitcode" for the EL exit codes (0 for success), "probabilities" for the matrix of weights (very large, not recommended for sample sizes larger than 2000).

...

Passed to rho.
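
A minimal sketch of the two accepted forms of sel.weights, assuming a numeric regressor x (the signature of the function form is an assumption based on the description above; consult the package source for the exact convention):

# Matrix form: pre-computed kernel weights with rows summing to one
w <- kernelWeights(x, PIT = TRUE, bw = 0.25, kernel = "epanechnikov")
w <- w / rowSums(w)
# Function form (hypothetical signature): weights computed from the data argument
sel.w.fun <- function(data, ...) {
  w0 <- kernelWeights(data$x, PIT = TRUE, bw = 0.25, kernel = "epanechnikov")
  w0 / rowSums(w0)
}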

Value

A scalar with the SEL value and, if requested, attributes containing the diagnostic information attached to it.

Examples

set.seed(1)
x <- sort(rlnorm(50))
# Heteroskedastic DGP
y <- abs(1 + 1*x + rnorm(50) * (1 + x + sin(x)))
mod.OLS <- lm(y ~ x)
rho <- function(theta, ...) y - theta[1] - theta[2]*x  # Moment fn
w <- kernelWeights(x, PIT = TRUE, bw = 0.25, kernel = "epanechnikov")
w <- w / rowSums(w)  # Normalise the rows to sum to one
image(x, x, w, log = "xy")  # Visualise the weight matrix
theta.vals <- list(c(1, 1), coef(mod.OLS))
SEL <- function(b, ...) smoothEmplik(rho = rho, theta = b, sel.weights = w, ...)
sapply(theta.vals, SEL) # Smoothed empirical likelihood
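
# attach.attributes keeps diagnostic information with the returned value
# (a sketch; the attribute name is assumed to match the option name)
sel.diag <- smoothEmplik(rho = rho, theta = coef(mod.OLS), sel.weights = w,
                         attach.attributes = "ELRs")
head(attr(sel.diag, "ELRs"))  # Individual smoothed expected likelihoods
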
# SEL maximisation
ctl <- list(fnscale = -1, reltol = 1e-6, ndeps = rep(1e-5, 2),
            trace = 1, REPORT = 5)
b.init <- coef(mod.OLS)
b.init <- c(1.790207, 1.007491)  # Only to speed up estimation
b.SEL <- optim(b.init, SEL, method = "BFGS", control = ctl)
print(b.SEL$par) # Closer to the true value (1, 1) than OLS
plot(x, y)
abline(1, 1, lty = 2)
abline(mod.OLS, col = 2)
abline(b.SEL$par, col = 4)

# Euclidean likelihood
SEuL <- function(b, ...) smoothEmplik(rho = rho, theta = b,
                                      type = "EuL", sel.weights = w, ...)
b.SEuL <- optim(coef(mod.OLS), SEuL, method = "BFGS", control = ctl)
abline(b.SEuL$par, col = 3)
cbind(SEL = b.SEL$par, SEuL = b.SEuL$par)
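
# minus = TRUE flips the sign of the SEL so that a plain minimiser can be
# used; a sketch reproducing the BFGS maximisation above via minimisation
b.min <- optim(b.init, SEL, minus = TRUE, method = "BFGS",
               control = list(reltol = 1e-6))
sqrt(sum((b.min$par - b.SEL$par)^2))  # Should be close to zero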

# Now we start from (0, 0), for which the Taylor expansion is necessary
# because all residuals at this starting value are positive and the
# unmodified EL ratio for the test of equality to 0 is -Inf
smoothEmplik(rho = rho, theta = c(0, 0), sel.weights = w,
             EL.args = list(chull.fail = "none"))  # -Inf: no fail-safe
smoothEmplik(rho = rho, theta = c(0, 0), sel.weights = w)  # Finite: Taylor fail-safe
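
# bad.value supplies a finite fallback for optimisers that cannot handle
# non-finite values, e.g. L-BFGS-B (a sketch: every non-finite individual
# SEL value is replaced by the fallback before summation)
smoothEmplik(rho = rho, theta = c(0, 0), sel.weights = w,
             EL.args = list(chull.fail = "none"), bad.value = -1e10)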

# The next example is very slow; approx. 1 minute

# Experiment: a small bandwidth so that the spanning condition should fail often
# It yields an appalling estimator
w <- kernelWeights(x, PIT = TRUE, bw = 0.15, kernel = "epanechnikov")
w <- w / rowSums(w)
# The first option is faster, but it may sometimes fail
b.SELt <- optim(c(0, 0), SEL, EL.args = list(chull.fail = "taylor"),
                method = "BFGS", control = ctl)
b.SELw <- optim(c(0, 0), SEL, EL.args = list(chull.fail = "wald"),
                method = "BFGS", control = ctl)
# In this sense, Euclidean likelihood is robust to convex hull violations
b.SELu <- optim(c(0, 0), SEuL, method = "BFGS", control = ctl)
b0grid <- seq(-1.5, 7, length.out = 51)
b1grid <- seq(-1.5, 4.5, length.out = 51)
bgrid <- as.matrix(expand.grid(b0grid, b1grid))
fi <- function(i) smoothEmplik(rho, bgrid[i, ], sel.weights = w, type = "EL0",
                  EL.args = list(chull.fail = "taylor"))
ncores <- max(floor(parallel::detectCores()/2 - 1), 1)
chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")  # Limit to 2 cores for CRAN checks
if (nzchar(chk) && chk == "TRUE") ncores <- min(ncores, 2L)
selgrid <- unlist(parallel::mclapply(1:nrow(bgrid), fi, mc.cores = ncores))
selgrid <- matrix(selgrid, nrow = length(b0grid))
probs <- c(0.25, 0.5, 0.75, 0.8, 0.9, 0.95, 0.99, 1-10^seq(-4, -16, -2))
levs <- qchisq(probs, df = 2)
# levs <- c(1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000)
labs <- round(levs, 1)
cols <- rainbow(length(levs), end = 0.7, v = 0.7)
oldpar <- par(mar = c(4, 4, 2, 0) + .1)
selgrid2 <- -2*(selgrid - max(selgrid, na.rm = TRUE))
contour(b0grid, b1grid, selgrid2, levels = levs,
        labels = labs, col = cols, lwd = 1.5, bty = "n",
        main = "'Safe' likelihood contours", asp = 1)
image(b0grid, b1grid, log1p(selgrid2))
# The narrow lines are caused by the fact that if two observations are close together
# at the edge, the curvature at that point is extreme

# The same with Euclidean likelihood
seulgrid <- unlist(parallel::mclapply(1:nrow(bgrid), function(i)
  smoothEmplik(rho, bgrid[i, ], sel.weights = w, type = "EuL"),
    mc.cores = ncores))
seulgrid <- matrix(seulgrid, nrow = length(b0grid))
seulgrid2 <- -50*(seulgrid - max(seulgrid, na.rm = TRUE))
par(mar = c(4, 4, 2, 0) + .1)
contour(b0grid, b1grid, seulgrid2, levels = levs,
        labels = labs, col = cols, lwd = 1.5, bty = "n",
        main = "'Safe' likelihood contours", asp = 1)
image(b0grid, b1grid, log1p(seulgrid2))
par(oldpar)
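
# Memory-saving and parallel options (a sketch; the gains are system-dependent)
smoothEmplik(rho = rho, theta = c(1, 1), sel.weights = w,
             chunks = 10, sparse = TRUE, verbose = TRUE)  # Chunked + sparse
smoothEmplik(rho = rho, theta = c(1, 1), sel.weights = w,  # Forked parallelism
             parallel = TRUE, cores = ncores)              # (POSIX systems only)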

