Empirical likelihood ratio for discrete hazard with right censored, left truncated data


Description

Use empirical likelihood ratio and Wilks theorem to test the null hypothesis that

\sum_i f(x_i, \theta) \log(1 - dH(x_i)) = K

where H(t) is the (unknown) discrete cumulative hazard function; f(t,θ) can be any predictable function of t. θ is the parameter of the function and K is a given constant. The data can be right censored and left truncated.

When the given constants θ and/or K are too far from the NPMLE, no hazard function satisfies this constraint and the -2 log empirical likelihood ratio is infinite. In this case the computation will stop.
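As an illustration of what the constraint can express, take f to be the indicator f(t, θ) = I[t ≤ θ], as in the `fun4` weight function of the Examples. The symbol S, the survival function corresponding to H, is introduced here only for this sketch. The constraint then fixes the log survival probability at θ:

```latex
\sum_i I[x_i \le \theta] \log(1 - dH(x_i))
  = \log \prod_{x_i \le \theta} (1 - dH(x_i))
  = \log S(\theta) = K
```

Because \log(1 - dH) is approximately -dH when the jumps are small, this is close to the statement -H(θ) = K.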

Usage

emplikH.disc(x, d, y= -Inf, K, fun, tola=.Machine$double.eps^.25, theta)

Arguments

x

a vector, the observed survival times.

d

a vector of censoring indicators: 1 = uncensored, 0 = censored.

y

optional vector, the left truncation times. The default, -Inf, means no truncation.

K

a real number used in the constraint: the weighted sum must equal this value.

fun

a left continuous (weight) function used to calculate the weighted discrete hazard in H_0. fun(x, theta) must be able to take a vector input x, and a parameter theta.

tola

an optional positive real number specifying the error tolerance of the iteration that solves the non-linear equation needed in the constrained maximization.

theta

a given real number used as the parameter of the function f.

Details

The log likelihood being maximized is the ‘binomial’ empirical likelihood:

\sum_i \{ D_i \log w_i + (R_i - D_i) \log(1 - w_i) \}

where w_i = Δ H(t_i) is the jump of the cumulative hazard function, D_i is the number of failures observed at t_i, R_i is the number of subjects at risk at time t_i.
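As a rough illustration of these quantities, the following base R sketch tabulates D_i and R_i for right censored data without truncation and evaluates the binomial log likelihood at w_i = D_i / R_i. The helper name `binom_loglik` and the risk set convention (a subject is at risk at t_i when x >= t_i) are assumptions made for this sketch; it is not the package's internal code.

```r
# Example data (same as in the Examples section); no left truncation.
x <- c(1, 2, 3, 4, 5, 6, 5, 4, 3, 4, 1, 2.4, 4.5)
d <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1)

# Distinct uncensored times t_i: the locations of the hazard jumps.
tt <- sort(unique(x[d == 1]))

# D_i: failures observed at t_i; R_i: subjects at risk at t_i
# (assumed here to mean x >= t_i).
D <- sapply(tt, function(t) sum(x == t & d == 1))
R <- sapply(tt, function(t) sum(x >= t))

# Hypothetical helper: binomial empirical log likelihood for jumps w.
binom_loglik <- function(w) sum(D * log(w) + (R - D) * log(1 - w))

# Unconstrained maximizer: w_i = D_i / R_i.
w_hat <- D / R
binom_loglik(w_hat)
```

Each term of the sum is a binomial log likelihood in w_i, so without a constraint it is maximized separately at D_i / R_i; the constrained problem moves the w_i away from these values until the constraint holds.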

For discrete distributions, the jump size of the cumulative hazard at the last jump is always 1. We have to exclude this jump from the summation, since \log(1 - dH(\cdot)) is not defined there (it would be \log 0).

The constants theta and K must be inside the so-called feasible region for the computation to continue. This is similar to the requirement that, in testing the value of the mean, the value must be inside the convex hull of the observations. The NPMLE values are always feasible, so when the computation stops, try moving theta and K closer to the NPMLE. When the computation stops, the -2LLR should be taken as infinite.
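One way to find a feasible value is to evaluate the constraint at the unconstrained estimate and choose K near it. The sketch below does this in base R for data without truncation. The name `K_npmle`, the risk set convention x >= t_i, and the rule of dropping any jump equal to 1 are assumptions of this illustration rather than the package's code.

```r
# Indicator weight function and example data from the Examples section.
fun4 <- function(x, theta) as.numeric(x <= theta)
x <- c(1, 2, 3, 4, 5, 6, 5, 4, 3, 4, 1, 2.4, 4.5)
d <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1)
theta <- 4

# Jump locations and unconstrained jump sizes w_i = D_i / R_i.
tt <- sort(unique(x[d == 1]))
D <- sapply(tt, function(t) sum(x == t & d == 1))
R <- sapply(tt, function(t) sum(x >= t))
w_hat <- D / R

# Drop any jump of size 1, where log(1 - w) is undefined (assumed rule).
keep <- w_hat < 1

# Value of the constraint sum at the unconstrained estimate.
K_npmle <- sum(fun4(tt[keep], theta) * log(1 - w_hat[keep]))
K_npmle
```

For this data the value is log(72/130), about -0.59, so the K = -0.7 used in the Examples is nearby.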

In case you do not need the theta in the definition of the function f, you still need to formally define your fun function with a theta input, just to match the arguments.
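For instance, a weight function that ignores theta might be written as below. The name `fun_fixed` and the cutoff 4 are hypothetical choices for this sketch.

```r
# theta is accepted only to match the expected signature; it is not used.
fun_fixed <- function(x, theta) as.numeric(x <= 4)

# Must accept a vector x and return one weight per element.
fun_fixed(c(1, 4, 5), theta = 0)
```

Any value can then be passed as theta in the call, since the function never reads it.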

Value

A list with the following components:

times

the location of the hazard jumps.

wts

the jump size of hazard function at those locations.

lambda

the final value of the Lagrange multiplier.

"-2LLR"

the discrete -2 log likelihood ratio.

Pval

P-value

niters

number of iterations used
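By Wilks theorem the statistic is compared with a chi-square distribution. Assuming one degree of freedom for the single constraint (an assumption of this sketch, not something stated on this page), a P-value can be recomputed from a reported statistic in base R:

```r
# "-2LLR" value quoted in the Examples section.
llr <- 0.1446316

# Upper tail probability of a chi-square with 1 degree of freedom (assumed).
pval <- pchisq(llr, df = 1, lower.tail = FALSE)
pval
```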

Author(s)

Mai Zhou

References

Fang, H. (2000). Binomial Empirical Likelihood Ratio Method in Survival Analysis. Ph.D. Thesis, Univ. of Kentucky, Dept of Statistics.

Zhou, M. and Fang, H. (2001). Empirical likelihood ratio for 2 sample problem for censored data. Tech Report, Univ. of Kentucky, Dept of Statistics.

Zhou, M. and Fang, H. (2006). A comparison of Poisson and binomial empirical likelihood. Tech Report, Univ. of Kentucky, Dept of Statistics.

Examples

library(emplik)

# indicator weight function: f(x, theta) = I[x <= theta]
fun4 <- function(x, theta) { as.numeric(x <= theta) }
x <- c(1, 2, 3, 4, 5, 6, 5, 4, 3, 4, 1, 2.4, 4.5)
d <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1)
# test if -H(4) = -0.7
emplikH.disc(x=x, d=d, K=-0.7, fun=fun4, theta=4)
# we should get "-2LLR" 0.1446316 etc.
# same test, now with left truncation times y
y <- c(-2,-2, -2, 1.5, -1)
emplikH.disc(x=x, d=d, y=y, K=-0.7, fun=fun4, theta=4)
