cdfDist: Distance measure for cumulative distribution functions

Description

This distance measure is useful for assessing the dissimilarity between two cumulative distribution functions (CDFs) when differences in the right tail are of particular interest.

Usage

cdfDist(x1, F1, x2, F2)

Arguments

x1

A vector of numerical values at which the first CDF, F1, is given.

F1

A vector of numerical values, where the i-th element of F1 is the CDF at value x1[i].

x2

A vector of numerical values at which the second CDF, F2, is given.

F2

A vector of numerical values, where the i-th element of F2 is the CDF at value x2[i].

Details

This function first computes a pointwise distance at each value x as

D(x) = (F1(x) - F2(x))^2 / (1 - min(F1(x), F2(x)))

The measure is the integral of this pointwise distance over the intersection of the ranges of quantiles provided for the two CDFs, a region (m1, m2). Finally, the measure is standardized by the length of this range:

μ(F1, F2) = Int_m1^m2 D(x) dx / (m2-m1)
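
The two steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name cdf_dist_sketch, the linear interpolation with approx, the evenly spaced grid of n points, and the trapezoidal rule are all choices made here for illustration.

```r
# Hypothetical helper: approximates the measure described above.
# Assumptions (not taken from the package source): linear interpolation
# of each CDF, an evenly spaced grid, and trapezoidal integration.
cdf_dist_sketch <- function(x1, F1, x2, F2, n = 1000) {
  # Intersection (m1, m2) of the two ranges of supplied values
  m1 <- max(min(x1), min(x2))
  m2 <- min(max(x1), max(x2))
  stopifnot(m1 < m2)

  # Evaluate both CDFs on a common grid over (m1, m2)
  x  <- seq(m1, m2, length.out = n)
  G1 <- approx(x1, F1, xout = x)$y
  G2 <- approx(x2, F2, xout = x)$y

  # Pointwise distance D(x)
  D <- (G1 - G2)^2 / (1 - pmin(G1, G2))

  # Trapezoidal integral of D, standardized by the length of the range
  integral <- sum(diff(x) * (head(D, -1) + tail(D, -1)) / 2)
  integral / (m2 - m1)
}

p        <- seq(0.001, 0.999, 0.001)
d_same   <- cdf_dist_sketch(qnorm(p), p, qnorm(p), p)   # identical CDFs
d_tnorm  <- cdf_dist_sketch(qt(p, 3), p, qnorm(p), p)   # t (3 df) vs. normal
```

With identical inputs the pointwise distance is zero everywhere, so the sketch returns zero; any disagreement between the two CDFs on (m1, m2) gives a positive value.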

This measure was designed to penalize dissimilarity in the right tails of the distributions heavily. A poor match in the lower tail increases the measure only slightly.
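
The asymmetry comes from the denominator of D(x), which shrinks as the smaller of the two CDFs approaches 1. A small sketch of this effect, using made-up CDF values chosen only for illustration:

```r
# Pointwise distance D as defined in the Details section
D <- function(F1, F2) (F1 - F2)^2 / (1 - pmin(F1, F2))

# The same gap of 0.04 between the CDFs, at two different locations
lower <- D(0.05, 0.09)   # lower tail: denominator is 1 - 0.05 = 0.95
upper <- D(0.95, 0.99)   # right tail: denominator is 1 - 0.95 = 0.05

# Ratio of the two pointwise distances (about 0.95 / 0.05 = 19)
upper / lower
```

The same disagreement between the CDFs is weighted roughly 19 times more heavily near the 95th percentile than near the 5th.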

The functions print, plot, and summary may be applied to the output of cdfDist.

Value

The output is a list of class "cdfDist":

x

The values at which the pointwise distance was computed and then integrated over.

F1

The first CDF for each value of x.

F2

The second CDF for each value of x.

meas

A vector representing the integral of the pointwise distance from x[1] up to each value of x. Plotting this measure with x makes it easy to see where the distance grew the fastest between the CDFs.

cdfDist

The distance between the CDFs.
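
The meas component is described as a running integral of the pointwise distance. The sketch below shows one way such a vector could be built and plotted in base R; the grid, the two example CDFs, and the trapezoidal rule are assumptions made for illustration, and the package may compute meas differently.

```r
# Common grid and two CDFs evaluated on it (illustrative choices)
x  <- seq(-3, 3, length.out = 601)
F1 <- pt(x, df = 3)
F2 <- pnorm(x)

# Pointwise distance and its running trapezoidal integral from x[1]
D    <- (F1 - F2)^2 / (1 - pmin(F1, F2))
meas <- c(0, cumsum(diff(x) * (head(D, -1) + tail(D, -1)) / 2))

# The running integral is non-decreasing; steep stretches mark the
# regions that contribute most to the final distance.
plot(x, meas, type = "l", xlab = "x", ylab = "running integral of D(x)")

# Standardized distance, following the formula in the Details section
dist_sketch <- meas[length(meas)] / (x[length(x)] - x[1])
```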

Author(s)

David M Diez

See Also

cor2icc, apple, peach, pear, pepper

Examples

# Arrange the four example plots in a 2-by-2 grid
par(mfrow=c(2,2))

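#=====> Example 1 <=====#
# Sample quantiles of a t distribution (15 df) vs. standard normal quantiles.
# Each CDF is given on a random subset of 699 of the 999 probabilities.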
F1   <- seq(0.001, 0.999, 0.001)[-sample(999, 300)]
x1   <- quantile(rt(10000, 15), F1)
F2   <- seq(0.001, 0.999, 0.001)[-sample(999, 300)]
x2   <- qnorm(F2)
hold <- cdfDist(x1, F1, x2, F2)
plot(hold)
summary(hold)

#=====> Example 2 <=====#
# Log-normal sample quantiles (meanlog = 1, sdlog = 1) vs. chi-square with df = mean(x1)
F1   <- seq(0.001, 0.999, 0.001)[-sample(999, 300)]
x1   <- exp(quantile(rnorm(10000, 1, sd=1), F1))
F2   <- seq(0.001, 0.999, 0.001)[-sample(999, 300)]
x2   <- qchisq(F2, mean(x1))
hold <- cdfDist(x1, F1, x2, F2)
plot(hold)
summary(hold)

#=====> Example 3 <=====#
# Log-normal sample quantiles (meanlog = 0.5, sdlog = 0.5) vs. chi-square with df = mean(x1)
F1   <- seq(0.001, 0.999, 0.001)[-sample(999, 300)]
x1   <- exp(quantile(rnorm(10000, 0.5, sd=0.5), F1))
F2   <- seq(0.001, 0.999, 0.001)[-sample(999, 300)]
x2   <- qchisq(F2, mean(x1))
hold <- cdfDist(x1, F1, x2, F2)
plot(hold)
summary(hold)

#=====> Example 4 <=====#
# Same log-normal setup as Example 3, but chi-square with df = mean(x1) + 1
F1   <- seq(0.001, 0.999, 0.001)[-sample(999, 300)]
x1   <- exp(quantile(rnorm(10000, 0.5, sd=0.5), F1))
F2   <- seq(0.001, 0.999, 0.001)[-sample(999, 300)]
x2   <- qchisq(F2, mean(x1)+1)
hold <- cdfDist(x1, F1, x2, F2)
plot(hold)
summary(hold)

pesticides documentation built on May 30, 2017, 7:19 a.m.