CKT.hCV.l1out: Choose the bandwidth for kernel estimation of conditional...

CKT.hCV.l1out    R Documentation

Choose the bandwidth for kernel estimation of conditional Kendall's tau using cross-validation

Description

Let X_1 and X_2 be two random variables. The goal here is to estimate the conditional Kendall's tau (a dependence measure) between X_1 and X_2 given Z=z for a conditioning variable Z. Conditional Kendall's tau between X_1 and X_2 given Z=z is defined as:

P( (X_{1,1} - X_{2,1})(X_{1,2} - X_{2,2}) > 0 | Z_1 = Z_2 = z)

- P( (X_{1,1} - X_{2,1})(X_{1,2} - X_{2,2}) < 0 | Z_1 = Z_2 = z),

where (X_{1,1}, X_{1,2}, Z_1) and (X_{2,1}, X_{2,2}, Z_2) are two independent and identically distributed copies of (X_1, X_2, Z). To estimate it, a kernel-based estimator is used, as described in Derumigny & Fermanian (2019). These functions aim to find the best bandwidth h among the candidate values in range_h by cross-validation. They use either:

  • leave-one-out cross-validation: function CKT.hCV.l1out

  • or K-folds cross-validation: function CKT.hCV.Kfolds
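
As a rough illustration of what a kernel-based estimator of conditional Kendall's tau computes, the base-R sketch below weights each pair of observations by how close both of their Z values are to z, then averages the concordance signs. The helper name ckt_sketch is hypothetical, and the Epanechnikov weighting and the normalization are assumptions made for illustration; this is not the implementation used by the package (see CKT.kernel for that).

```r
# Hypothetical helper for illustration only; not part of CondCopulas.
ckt_sketch <- function(X1, X2, Z, z, h) {
  # Epanechnikov kernel weights centred at z, normalized to sum to 1
  u <- (Z - z) / h
  w <- pmax(0, 0.75 * (1 - u^2))
  if (sum(w) == 0) stop("no observation within distance h of z")
  w <- w / sum(w)

  # Sign of (X_{i,1} - X_{j,1}) * (X_{i,2} - X_{j,2}) for every pair (i, j)
  signs <- sign(outer(X1, X1, "-") * outer(X2, X2, "-"))

  # Pair weights w_i * w_j, excluding the pairs with i == j
  W <- outer(w, w)
  diag(W) <- 0

  # Weighted proportion of concordant pairs minus discordant pairs
  sum(W * signs) / sum(W)
}

set.seed(1)
Z  <- runif(100)
X1 <- rnorm(100)
tau_same     <- ckt_sketch(X1, X1, Z, z = 0.5, h = 0.3)   # perfectly concordant
tau_opposite <- ckt_sketch(X1, -X1, Z, z = 0.5, h = 0.3)  # perfectly discordant
```

A small h keeps only observations with Z very close to z (low bias, high variance), while a large h averages over almost the whole sample; this trade-off is what the cross-validation functions are meant to balance.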

Usage

CKT.hCV.l1out(
  X1 = NULL,
  X2 = NULL,
  Z = NULL,
  range_h,
  matrixSignsPairs = NULL,
  nPairs = 10 * length(X1),
  typeEstCKT = "wdm",
  kernel.name = "Epa",
  progressBar = TRUE,
  verbose = FALSE,
  observedX1 = NULL,
  observedX2 = NULL,
  observedZ = NULL
)

CKT.hCV.Kfolds(
  X1,
  X2,
  Z,
  ZToEstimate,
  range_h,
  matrixSignsPairs = NULL,
  typeEstCKT = "wdm",
  kernel.name = "Epa",
  Kfolds = 5,
  progressBar = TRUE,
  verbose = FALSE,
  observedX1 = NULL,
  observedX2 = NULL,
  observedZ = NULL
)

Arguments

X1

a vector of n observations of the first variable

X2

a vector of n observations of the second variable

Z

vector of observed values of Z. If Z is multivariate, then this is a matrix whose rows correspond to the observations of Z

range_h

vector containing possible values for the bandwidth.
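
This page does not prescribe how to build range_h. One possible starting point, sketched below under that caveat, is a geometric grid centred on the rule-of-thumb scale sd(Z) * n^(-1/5). The multipliers 0.25 and 4 and the grid size 10 are arbitrary choices for illustration, not recommendations from the package.

```r
set.seed(1)
Z <- rnorm(200, mean = 5, sd = 2)
n <- length(Z)

# Rule-of-thumb scale (an assumption, not taken from the package)
h_ref <- sd(Z) * n^(-1/5)

# Geometric grid from a quarter of h_ref to four times h_ref
range_h <- h_ref * exp(seq(log(0.25), log(4), length.out = 10))
```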

matrixSignsPairs

square matrix of signs of all pairs, produced by computeMatrixSignPairs(observedX1, observedX2). Only needed if typeEstCKT is not the default 'wdm'.

nPairs

number of pairs used in the cross-validation criterion.

typeEstCKT

type of estimation of the conditional Kendall's tau.

kernel.name

name of the kernel used for smoothing. Possible choices are "Gaussian" (Gaussian kernel) and "Epa" (Epanechnikov kernel).

progressBar

if TRUE, a progress bar is displayed for each h to show the progress of the computation.

verbose

if TRUE, print the score of each h during the procedure.

observedX1, observedX2, observedZ

old parameter names for X1, X2, Z. Support for these will be removed in a later version.

ZToEstimate

vector of fixed conditioning values at which the difference between the two conditional Kendall's tau estimates is computed. Can also be a matrix whose rows are these conditioning vectors.

Kfolds

number of subsamples used.
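
To make the role of Kfolds concrete, here is one common way to split n observations into Kfolds subsamples of nearly equal size in base R. The variable names are hypothetical, and the package may assign folds differently.

```r
set.seed(1)
n <- 200
Kfolds <- 5

# Repeat the labels 1..Kfolds up to length n, then shuffle them
foldid <- sample(rep(seq_len(Kfolds), length.out = n))

# Indices of the observations in fold 1 and in its complement
in_fold  <- which(foldid == 1)
out_fold <- which(foldid != 1)
```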

Value

Both functions return a list with two components:

  • hCV: the chosen bandwidth

  • scores: vector of the same length as range_h giving the value of the CV criterion for each h tested. A lower score indicates a better fit.
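
Since a lower score indicates a better fit, one would expect hCV to be the element of range_h with the smallest score. The sketch below shows that selection on made-up scores; the numbers are invented for illustration only, and the tie-breaking rule (first minimum) is an assumption.

```r
range_h <- 3:10
# Invented scores, one per candidate bandwidth (illustration only)
scores <- c(0.42, 0.31, 0.27, 0.29, 0.35, 0.40, 0.47, 0.55)

# Candidate bandwidth with the lowest cross-validation score
h_best <- range_h[which.min(scores)]
```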

References

Derumigny, A., & Fermanian, J. D. (2019). On kernel-based estimation of conditional Kendall’s tau: finite-distance bounds and asymptotic behavior. Dependence Modeling, 7(1), 292-321. Page 296, Equation (4). doi:10.1515/demo-2019-0016

See Also

CKT.kernel for the corresponding estimator of conditional Kendall's tau by kernel smoothing.

Examples

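# We simulate from a conditional copula: a Gaussian copula (family = 1)
# whose Kendall's tau depends on Z
set.seed(1)
N = 200
Z = rnorm(n = N, mean = 5, sd = 2)
conditionalTau = -0.9 + 1.8 * pnorm(Z, mean = 5, sd = 2)
simCopula = VineCopula::BiCopSim(N = N, family = 1,
    par = VineCopula::BiCopTau2Par(1, conditionalTau))
X1 = qnorm(simCopula[,1])
X2 = qnorm(simCopula[,2])

# Conditioning values for the K-folds criterion and candidate bandwidths
newZ = seq(2, 10, by = 0.1)
range_h = 3:10

# Bandwidth choice by leave-one-out cross-validation
resultCV <- CKT.hCV.l1out(X1 = X1, X2 = X2, Z = Z,
                          range_h = range_h, nPairs = 100)

# Bandwidth choice by K-folds cross-validation (overwrites resultCV)
resultCV <- CKT.hCV.Kfolds(X1 = X1, X2 = X2, Z = Z,
                           range_h = range_h, ZToEstimate = newZ)

# Plot the K-folds scores against the candidate bandwidths
plot(range_h, resultCV$scores, type = "b")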


CondCopulas documentation built on Sept. 11, 2024, 9:10 p.m.