kendall.tau: Kendall's Tau Statistic


View source: R/family.bivariate.R

Description

Computes Kendall's Tau, which is a rank-based correlation measure, between two vectors.

Usage

kendall.tau(x, y, exact = FALSE, max.n = 3000)

Arguments

x, y

Numeric vectors of equal length. Ideally their values are continuous, with few ties. In what follows, N denotes length(x).

exact

Logical. If TRUE then the exact value is computed.

max.n

Numeric. If exact = FALSE and length(x) exceeds max.n then a random sample of max.n pairs is chosen.

Details

Kendall's tau is a measure of dependency in a bivariate distribution. Loosely, two random variables are concordant if large values of one are associated with large values of the other. Similarly, two random variables are discordant if large values of one are associated with small values of the other.

More formally, for i != j, a comparison is concordant if (x[i] - x[j])*(y[i] - y[j]) > 0 and discordant if (x[i] - x[j])*(y[i] - y[j]) < 0. Out of choose(N, 2) comparisons, let c and d be the number of concordant and discordant pairs, respectively. Then Kendall's tau can be estimated by (c-d)/(c+d).

If there are ties then half the ties are deemed concordant and half discordant, so that (c-d)/(c+d+t) is used, where t is the number of tied comparisons.
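As a rough illustration of the counting described above, the following base-R sketch tallies concordant, discordant, and tied comparisons over all choose(N, 2) pairs. The helper name tau_by_counting is hypothetical and is not part of VGAM; it is a brute-force O(N^2) illustration of the formula, not the package's implementation.

```r
# Hypothetical helper (not part of VGAM): brute-force Kendall's tau.
tau_by_counting <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  idx <- combn(length(x), 2)   # 2 x choose(N, 2) matrix of index pairs i < j
  # Sign of (x[i] - x[j]) * (y[i] - y[j]) for every comparison
  s <- sign(x[idx[1, ]] - x[idx[2, ]]) * sign(y[idx[1, ]] - y[idx[2, ]])
  n_conc <- sum(s > 0)         # concordant comparisons (c)
  n_disc <- sum(s < 0)         # discordant comparisons (d)
  n_tie  <- sum(s == 0)        # tied comparisons (t)
  (n_conc - n_disc) / (n_conc + n_disc + n_tie)
}

x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
tau_by_counting(x, y)          # 8 concordant, 2 discordant: (8 - 2) / 10 = 0.6
```

When there are no ties, this value agrees with cor(x, y, method = "kendall") from base R.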

Value

Kendall's tau, which lies between -1 and 1.

Warning

The exact computation costs O(N^2), which is expensive when length(x) is large! Under these circumstances it is not advisable to set exact = TRUE or to set max.n to a very large number.
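To give a sense of scale, the number of pairwise comparisons grows as choose(N, 2). The sketch below uses only base R. The subsampling step is an assumption about how one might approximate tau from a subset of observations; it is not taken from the VGAM source, and the simulated data are purely illustrative.

```r
choose(3000, 2)   # 4498500 comparisons at the default max.n
choose(5000, 2)   # 12497500 comparisons for N = 5000
choose(1e5, 2)    # roughly 5e9 comparisons

# Assumed approach: estimate tau from a random subset of observations.
set.seed(1)
N <- 20000
x <- rnorm(N)
y <- x + rnorm(N)                      # correlated with x
keep <- sample.int(N, size = 2000)     # subset of observations, without replacement
tau_sub <- cor(x[keep], y[keep], method = "kendall")
tau_sub
```

The subset estimate varies from sample to sample, which is the price paid for avoiding the full set of comparisons.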

See Also

binormalcop, cor.

Examples

library(VGAM)  # Provides kendall.tau() and rbinorm()
N <- 5000; x <- 1:N; y <- runif(N)  # x and y are overwritten below
true.rho <- -0.8
ymat <- rbinorm(N, cov12 =  true.rho)  # Bivariate normal, aka N_2
x <- ymat[, 1]
y <- ymat[, 2]

## Not run: plot(x, y, col = "blue")

kendall.tau(x, y)  # A random sample is taken here
kendall.tau(x, y)  # A random sample is taken here

kendall.tau(x, y, exact = TRUE)  # Costly if length(x) is large
kendall.tau(x, y, max.n = N)     # Same as exact = TRUE

(rhohat <- sin(kendall.tau(x, y) * pi / 2))  # This formula holds for N_2 actually
true.rho  # rhohat should be near this value

Example output

Loading required package: stats4
Loading required package: splines
[1] -0.5927389
[1] -0.5857677
[1] -0.5864032
[1] -0.5864032
[1] -0.7931184
[1] -0.8
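The last two lines of the example rely on the relationship rho = sin(pi * tau / 2) for the bivariate normal distribution. Inverting it gives the population tau implied by true.rho, which the sampled and exact estimates above can be compared against. This is a small base-R check; the name tau.implied is illustrative only.

```r
true.rho <- -0.8
tau.implied <- (2 / pi) * asin(true.rho)   # inverse of rho = sin(pi * tau / 2)
tau.implied                                # approximately -0.5903
sin(tau.implied * pi / 2)                  # recovers true.rho
```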

VGAM documentation built on Jan. 16, 2021, 5:21 p.m.