Two sample empirical likelihood ratio for discrete hazards with right censored, left truncated data. Many constraints.

Share:

Description

Use empirical likelihood ratio and Wilks theorem to test the null hypothesis that

\int{f_1(t) I_{[dH_1 <1]} \log(1-dH_1(t))} - \int{f_2(t) I_{[dH_2 <1]} \log(1-dH_2(t))} = θ

where H_*(t) are the (unknown) discrete cumulative hazard functions; f_*(t) can be any predictable functions of t. θ is a vector of parameters (dim=q >= 1). The given value of θ in these computation are the value to be tested. The data can be right censored and left truncated.

When the given constants θ is too far away from the NPMLE, there will be no hazard function satisfy this constraint and the -2 Log empirical likelihood ratio will be infinite. In this case the computation will stop.

Usage

1
2
emplikHs.disc2(x1, d1, y1= -Inf, x2, d2, y2 = -Inf,
          theta, fun1, fun2, maxit=25,tola = 1e-6, itertrace =FALSE)

Arguments

x1

a vector, the observed survival times, sample 1.

d1

a vector, the censoring indicators, 1-uncensor; 0-censor.

y1

optional vector, the left truncation times.

x2

a vector, the observed survival times, sample 2.

d2

a vector, the censoring indicators, 1-uncensor; 0-censor.

y2

optional vector, the left truncation times.

fun1

a predictable function used to calculate the weighted discrete hazard in H_0. fun1(x) must be able to take a vector input (length n) x, and output a matrix of n x q.

fun2

Ditto.

tola

an optional positive real number, the tolerance of iteration error in solve the non-linear equation needed in constrained maximization.

theta

a given vector of length q. for Ho constraint.

maxit

integer, maximum number of iteration.

itertrace

Logocal, lower bound for lambda

Details

The log empirical likelihood been maximized is the ‘binomial empirical likelihood’:

∑ D_{1i} \log w_i + (R_{1i}-D_{1i}) \log [1-w_i] + ∑ D_{2j} \log v_j + (R_{2j}-D_{2j}) \log [1-v_j]

where w_i = Δ H_1(t_i) is the jump of the cumulative hazard function at t_i, D_{1i} is the number of failures observed at t_i, and R_{1i} is the number of subjects at risk at time t_i (for sample one). Similar for sample two.

For discrete distributions, the jump size of the cumulative hazard at the last jump is always 1. We have to exclude this jump from the summation in the constraint calculation since \log( 1- dH(\cdot)) do not make sense. In the likelihood, this term contribute a zero (0*Inf).

This function can handle multiple constraints. So dim( theta) = q. The constants theta must be inside the so called feasible region for the computation to continue. This is similar to the requirement that in testing the value of the mean, the value must be inside the convex hull of the observations. It is always true that the NPMLE values are feasible. So when the computation stops, try move the theta closer to the NPMLE. When the computation stops, the -2LLR should have value infinite.

This code can also be used to compute one sample problems. You need to artificially supply data for sample two (with minimal sample size (2q+2)), and supply a function fun2 that ALWAYS returns zero (zero vector or zero matrix). In the output, read the -2LLR(sample1).

Value

A list with the following components:

times1

the location of the hazard jumps in sample 1.

times2

the location of the hazard jumps in sample 2.

lambda

the final value of the Lagrange multiplier.

"-2LLR"

The -2Log Likelihood ratio.

"-2LLR(sample1)"

The -2Log Likelihood ratio for sample 1 only.

niters

number of iterations used

Author(s)

Mai Zhou

References

Zhou and Fang (2001). “Empirical likelihood ratio for 2 sample problems for censored data”. Tech Report, Univ. of Kentucky, Dept of Statistics

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
if(require("boot", quietly = TRUE)) {
####library(boot)
data(channing)
ymale <- channing[1:97,2]
dmale <- channing[1:97,5]
xmale <- channing[1:97,3]
yfemale <- channing[98:462,2]
dfemale <- channing[98:462,5]
xfemale <- channing[98:462,3]
fun1 <- function(x) { as.numeric(x <= 960) }
########################################################
emplikHs.disc2(x1=xfemale, d1=dfemale, y1=yfemale, 
 x2=xmale, d2=dmale, y2=ymale, theta=0.25, fun1=fun1, fun2=fun1)
########################################################
### This time you get "-2LLR" = 1.150098 etc. etc.
##############################################################
fun2 <- function(x){ cbind(as.numeric(x <= 960), as.numeric(x <= 860))}
############ fun2 has matrix output ###############
emplikHs.disc2(x1=xfemale, d1=dfemale, y1=yfemale, 
 x2=xmale, d2=dmale, y2=ymale, theta=c(0.25,0), fun1=fun2, fun2=fun2)
################# you get "-2LLR" = 1.554386, etc ###########
}

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.