kullCOP: Kullback-Leibler Divergence, Jeffrey Divergence, and...
In wasquith/copBasic: General Bivariate Copula Theory and Many Utility Functions

kullCOP

R Documentation

Kullback–Leibler Divergence, Jeffrey Divergence, and Kullback–Leibler Sample Size

Description

Compute the Kullback–Leibler Divergence, Jeffrey Divergence, and Kullback–Leibler sample size following Joe (2014, pp. 234–237). Consider two densities f = c_1(u,v; \Theta_f) and g = c_2(u,v; \Theta_g) for two different bivariate copulas \mathbf{C}_1(\Theta_1) and \mathbf{C}_2(\Theta_2) having respective parameters \Theta, then the Kullback–Leibler Divergence of f relative to g is

\mathrm{KL}(f {\mid} g) = \int\!\!\int_{\mathcal{I}^2} g\, \log(g/f)\,\mathrm{d}u\mathrm{d}v\mbox{,}

and Kullback–Leibler Divergence of g relative to f is

\mathrm{KL}(g {\mid} f) = \int\!\!\int_{\mathcal{I}^2} f\, \log(f/g)\,\mathrm{d}u\mathrm{d}v\mbox{,}

where the limits of integration \mathcal{I}^2 theoretically are closed on [0,1]^2 but an open interval (0,1)^2 might be needed for numerical integration. Note, in general \mathrm{KL}(f {\mid} g) \ne \mathrm{KL}(g {\mid} f). The \mathrm{KL}(f {\mid} g) is the expected log-likelihood ratios of g to f when g is the true density (Joe, 2014, p. 234), whereas \mathrm{KL}(g {\mid} f) is the opposite.

This asymmetry leads to Jeffrey Divergence, which is defined as a symmetrized version of the two Kullback–Leibler Divergences, and is

J(f,g) = \mathrm{KL}(f {\mid} g) + \mathrm{KL}(g {\mid} f) = \int\!\!\int_{\mathcal{I}^2} (g-f)\, \log(g/f)\,\mathrm{d}u\mathrm{d}v\mbox{.}

The variances of the Kullback–Leibler Divergences are defined as

\sigma^2_{\mathrm{KL}(f {\mid} g)} = \int\!\!\int_{\mathcal{I}^2} g\,[\log(g/f)]^2\,\mathrm{d}u\mathrm{d}v - [\mathrm{KL}(f|g)]^2\mbox{,}

and

\sigma^2_{\mathrm{KL}(g {\mid} f)} = \int\!\!\int_{\mathcal{I}^2} f\,[\log(f/g)]^2\,\mathrm{d}u\mathrm{d}v - [\mathrm{KL}(g|f)]^2\mbox{.}

For comparison of copula families f and g and taking an \alpha = 0.05, the Kullback–Leibler sample size is defined as

n_{f\!g} = \bigl[\Phi^{(-1)}(1-\alpha) \times \eta_\mathrm{KL}\bigr]^2\mbox{,}

where \Phi^{(-1)}(t) is the quantile function for the standard normal distribution \sim N(0,1) for nonexceedance probability t, and \eta_\mathrm{KL} is the maximum of

\eta_\mathrm{KL} = \mathrm{max}\bigl[\sigma_{\mathrm{KL}(f {\mid} g)}/\mathrm{KL}(f {\mid} g),\, \sigma_{\mathrm{KL}(g {\mid} f)}/\mathrm{KL}(g {\mid} f)\bigr]\mbox{.}

The n_{f\!g} gives an indication of the sample size needed to distinguish f and g with a probability of at least 1 - \alpha = 1 - 0.05 = 0.95 or 95 percent.

The copBasic features a naïve Monte Carlo integration scheme in the primary interface kullCOP, although the function kullCOPint provides for nested numerical integration. This later function is generally fast but suffers too much for general application from integral divergencies issued from the integrate() function in R—this must be judged in the light that the copBasic package focuses only on implementation of the function of the copula itself and numerical estimation of copula density (densityCOP) and not analytical copula densities or hybrid representations thereof. Sufficient “bread crumbs” are left among the code and documentation for users to re-implement if speed is paramount. Numerical comparison to the results of Joe (2014) (see Examples) suggests that the default Monte Carlo sample size should be more than sufficient for general inference with the expense of considerable CPU time; however, a couple of repeated calls of kullCOP would be advised and compute say the mean of the resulting sample sizes.

Usage

kullCOP(cop1=NULL, cop2=NULL, para1=NULL, para2=NULL, alpha=0.05,
           del=0, n=1E5, verbose=TRUE, sobol=FALSE, scrambling=0, ...)

kullCOPint(cop1=NULL, cop2=NULL, para1=NULL, para2=NULL, alpha=0.05,
           del=.Machine$double.eps^0.25, verbose=TRUE, ...)

Arguments

`cop1`	A copula function corresponding to copula `f` in Joe (2014);
`para1`	Vector of parameters or other data structure, if needed, to pass to the copula `f`;
`cop2`	A copula function corresponding to copula `g` in Joe (2014);
`para2`	Vector of parameters or other data structure, if needed, to pass to the copula `g`;
`alpha`	The `\alpha` in the Kullback–Leibler sample size equation;
`del`	A small value used to denote the `lo` and `hi` values of the numerical integration: `lo = del` and `hi = 1 - del`. If `del == 0`, then `lo = 0` and `hi = 1`, which corresponds to the theoretical limits `\mathcal{I}^2 = [0,1]^2` and are defaulted here to `[0,1]^2` because the Monte Carlo algorithm is preferred for general application. The end point control, however, is maintained just in case pathological situations should arise;
`n`	`kullCOP` (Monte Carlo integration) only—the Monte Carlo integration simulation size;
`verbose`	A logical trigging a couple of status lines of output through the `message()` function in R;
`sobol`	A logical trigging Sobol sequences for the Monte Carlo integration instead of the bivariate uniform distribution. The Sobol sequences are dependent on the randtoolbox package and the `sobol()` function of the randtoolbox package, and the Sobol sequences canvas the `\mathcal{I}^2` domain for smaller `n` values than required if statistical independence is used for the Monte Carlo integration. Note, the randtoolbox at least at version 2.0.+ has “scrambling” of Sobol sequences temporarily disabled, and hence `scrambling=0` as default for `kullCOP`;
`scrambling`	The argument of the same name for `randtoolbox::sobol`; and
`...`	Additional arguments to pass to the `densityCOP` function.

Value

An R list is returned having the following components:

`MonteCarlo.sim.size`	`kullCOP` (Monte Carlo integration) only—The simulation size for numerical integration;
`divergences`	A vector of the Kullback–Leibler Divergences and their standard deviations: `\mathrm{KL}(f {\mid} g)`, `\sigma_{\mathrm{KL}(f {\mid} g)}`, `\mathrm{KL}(g {\mid} f)`, and `\sigma_{\mathrm{KL}(g {\mid} f)}`, respectively;
`stdev.divergences`	`kullCOP` (Monte Carlo integration) only—The standard deviation of the divergences and the variances;
`Jeffrey.divergence`	Jeffrey Divergence `J(f,g)`;
`KL.sample.size`	Kullback–Leibler sample size `n_{f\!g}`; and
`integrations`	`kullCOPint` (numerical integration) only—An R `list` of the outer call of the `integrate()` function for the respective numerical integrals shown in this documentation.

Author(s)

W.H. Asquith

References

Joe, H., 2014, Dependence modeling with copulas: Boca Raton, CRC Press, 462 p.

Examples

# See another demonstration under the Note section of statTn().
## Not run: 
# Joe (2014, p. 237, table 5.2)
# Gumbel-Hougaard and Plackett copulas below each have a Kendall Tau of about 0.5, and
# Joe (2014) lists in the table that Jeffrey Divergence is about 0.110 and Kullback-Leibler
# sample size is 133. Joe (2014) does not list the copula parameters just says Tau = 0.5.
# Joe (2014) likely did the numerical integrations using analytical solutions to probability
# densities and not rectangular approximations as in copBasic::densityCOP().
set.seed(1)
KL <- kullCOP(cop1=GHcop,       para1=2,
              cop2=PLACKETTcop, para2=11.40484, sobol=FALSE)
message("Jeffery Divergence is ",          round(KL$Jeffrey.divergence, digits=4),
        " and Kullback-Leibler sample size is ", KL$KL.sample.size, ".")
# Jeffery Divergence is 0.1106 and Kullback-Leibler sample size is 137.
set.seed(1)
KL <- kullCOP(cop1=GHcop,       para1=2,
              cop2=PLACKETTcop, para2=11.40484, sobol=TRUE )
message("Jeffery Divergence is ",          round(KL$Jeffrey.divergence, digits=4),
        " and Kullback-Leibler sample size is ", KL$KL.sample.size, ".")
# Jeffery Divergence is 0.3062 and Kullback-Leibler sample size is 136.


set.seed(1)
S <- replicate(20, kullCOP(cop1=GHcop, para1=2, cop2=PLACKETTcop, sobol=FALSE,
                           para2=11.40484, verbose=FALSE)$KL.sample.size)
print(as.integer(c(mean(S), sd(S)))) # 132 plus/minus 5
S <- replicate(2 , kullCOP(cop1=GHcop, para1=2, cop2=PLACKETTcop,  sobol=TRUE,
                           para2=11.40484, verbose=FALSE)$KL.sample.size)
# The two S in the later replication are both the same (136) for a sobol=TRUE
# does not produce variation and this is thought (June 2023) as a result
# of the disabled scrambling in the randtoolbox::sobol() function. 
## End(Not run)

## Not run: 
# Joe (2014, p. 237, table 5.3)
# Gumbel-Hougaard and Plackett copulas below each have a Spearman Rho of about 0.5, and
# Joe (2014) lists in the table that Jeffrey Divergence is about 0.063 and Kullback-Leibler
# sample size is 210. Joe (2014) does not list the parameters and just says that Rho = 0.5.
# Joe (2014) likely did the numerical integrations using analytical solutions to probability
# densities and not rectangular approximations as in copBasic::densityCOP().
set.seed(1)
KL <- kullCOP(cop1=GHcop,       para1=1.541071,
              cop2=PLACKETTcop, para2=5.115658, sobol=FALSE)
message("Jeffery Divergence is ",          round(KL$Jeffrey.divergence, digits=4),
        " and Kullback-Leibler sample size is ", KL$KL.sample.size, ".")
# Jeffery Divergence is 0.0642 and Kullback-Leibler sample size is 213.
set.seed(1)
KL <- kullCOP(cop1=GHcop,       para1=1.541071,
              cop2=PLACKETTcop, para2=5.115658, sobol=TRUE )
message("Jeffery Divergence is ",          round(KL$Jeffrey.divergence, digits=4),
        " and Kullback-Leibler sample size is ", KL$KL.sample.size, ".")
# Jeffery Divergence is 0.2001 and Kullback-Leibler sample size is 206.


set.seed(1)
S <- replicate(20, kullCOP(cop1=GHcop, para1=1.541071, cop2=PLACKETTcop,
                           para2=5.115658, verbose=FALSE)$KL.sample.size)
print(as.integer(c(mean(S), sd(S))))  # 220 plus/minus 19 
## End(Not run)

## Not run: 
# Compare Jeffery Divergence estimates as functions of sample size when computed
# using Sobol sequences or not for Gumbel-Hougaard and Pareto copulas.
GHpar <- PApar <- 2 # Spearman Rho = 0.6822339
Ns <- as.integer(10^c(seq(2.0, 3.5, by=0.01), seq(3.6, 5, by=0.05)))
JDuni <- sapply(1:length(Ns), function(i) {
                  kullCOP(cop1=GHcop, para1=GHpar, verbose=FALSE,
                          cop2=PAcop, para2=PApar, n=Ns[i],
                          sobol=FALSE)$Jeffrey.divergence })
JDsob <- sapply(1:length(Ns), function(i) {
                  kullCOP(cop1=GHcop, para1=GHpar, verbose=FALSE,
                          cop2=PAcop, para2=PApar, n=Ns[i],
                          sobol=TRUE )$Jeffrey.divergence })
plot(Ns, JDuni, type="l", log="x", # black line, notice likely outliers too
     xlab="Simulation Sample Size", ylab="Jeffery Divergence")
lines(Ns, JDsob, col="red") # red line
legend("topright", c("Monte Carlo", "Sobol sequence"),
                   lwd=c(1,1), col=c("black", "red"), bty="n")
print( c( mean(JDuni), sd(JDuni) ) ) # [1] 0.05915608 0.01284682
print( c( mean(JDsob), sd(JDsob) ) ) # [1] 0.07274190 0.01838939

# The developer notes that plotting KL.sample.size for sobol=TRUE shows
# what appears to be numerical blow up but the Jeffery Divergence does not.
KLuni <- sapply(1:length(Ns), function(i) {
                  kullCOP(cop1=GHcop, para1=GHpar, verbose=FALSE,
                          cop2=PAcop, para2=PApar, n=Ns[i],
                          sobol=FALSE)$KL.sample.size })
KLsob <- sapply(1:length(Ns), function(i) {
                  kullCOP(cop1=GHcop, para1=GHpar, verbose=FALSE,
                          cop2=PAcop, para2=PApar, n=Ns[i],
                          sobol=TRUE )$KL.sample.size })
plot(Ns, KLuni, type="l", log="xy", # black line, notice likely outliers too
     xlab="Simulation Sample Size", ylab="Kullback-Leibler Sample Size")
lines(Ns, KLsob, col="red") # red line
nideal <- kullCOPint(cop1=GHcop, para1=GHpar, cop2=PAcop, para2=PApar)$KL.sample.size
abline(h=nideal, col="green", lwd=3) # nideal sample size is about 210
legend("topright", c("Monte Carlo", "Sobol sequence", "Judged ideal sample size"),
                   lwd=c(1,1,3), col=c("black", "red", "green"), bty="n")

# Let us attempt a visualization to highlight the differences in the two copula by
# simulation. First, using this n = nideal, being the apparent sample size to distinguish
# generally between the two copula having the same Spearman Rho. Do the segments help
# to visually highlight the differences? Next, ask would one judge the parents in the
# simulation being different knowing same Spearman Rho? (Note, the segments are all
# vertical because the U axis is the simulation and the V axis is the conditional
# probability given the U.)
set.seed(1); UVgh <- simCOP(nideal, GHcop, para=GHpar, graphics=FALSE)
set.seed(1); UVpa <- simCOP(nideal, PAcop, para=PApar, graphics=FALSE)
plot(c(0,1), c(0,1), type="n", xlab="U, nonexceedance probability",
                               ylab="V, nonexceedance probability")
segments(UVgh[,1], UVgh[,2], x1=UVpa[,1], y1=UVpa[,2])
points(UVgh, col="lightgreen", pch=16, cex=0.8) # dots
points(UVpa, col="darkgreen",  pch= 1, lwd=0.8) # circles
# Repeat the above n = nideal visualizations but with a change to n = nideal*10, and see
# then that there are visually striking shifts systematically in both both tails but also
# in the U in the interval (0.3, 0.7) belt but to a smaller degree than seen in the tails. 
## End(Not run)

wasquith/copBasic documentation built on Feb. 17, 2025, 11:39 a.m.