mded: Measuring the difference between two empirical distributions

Description

The function measures the difference between two independent or non-independent empirical distributions and returns a significance level of the difference.

Usage

mded(distr1, distr2, detail = FALSE, independent = TRUE)

## S3 method for class 'mded'
print(x, digits = max(3, getOption("digits") - 3), ...)

Arguments

distr1

A vector representing an empirical distribution. Under the alternative hypothesis, distr1 is greater than distr2.

distr2

A vector representing an empirical distribution.

detail

If TRUE, a vector of the differences between distr1 and distr2 is returned.

independent

Set to FALSE when distr1 and distr2 are not independent of each other.

x

An object of S3 class 'mded'.

digits

The number of significant digits to print.

...

Arguments passed to the function print.

Details

The function measures the difference between two independent or non-independent empirical distributions and returns a significance level of the difference on the basis of the methods proposed by Poe et al. (1997, 2005). Such calculations are frequently needed in empirical econometric studies wherein (marginal) willingness-to-pay distributions that are estimated using contingent valuation methods or discrete choice experiments have to be compared to each other.

Let us assume that X and Y are empirical distributions, which are represented by the vectors x = (x1, x2, ..., xm) and y = (y1, y2, ..., yn). The null hypothesis (H0) is X - Y = 0, while the alternative hypothesis (H1) is X - Y > 0. When X and Y are independent of each other, the complete combinatorial method (Poe et al. 2005) provides the one-sided significance level for H0, calculated as #{xi - yj <= 0} / (m * n), where #{cond} is the number of times that cond is true. When X and Y are not independent of each other, the paired difference method (Poe et al. 1997) provides the one-sided significance level for H0, calculated as #{xi - yi <= 0} / m, where m is equal to n.
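
As a rough illustration of the two formulas above, the following base R sketch computes both significance levels directly. The helper names cc_level and pd_level are hypothetical and are not part of the mded package, whose own implementation may differ.

```r
# Hypothetical helpers (not part of mded) that mirror the formulas in the text.

# Complete combinatorial method: share of all m * n pairwise
# differences xi - yj that are less than or equal to zero.
cc_level <- function(x, y) {
  d <- outer(x, y, "-")  # m-by-n matrix of xi - yj
  sum(d <= 0) / (length(x) * length(y))
}

# Paired difference method: share of the m elementwise
# differences xi - yi that are less than or equal to zero.
pd_level <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(x - y <= 0) / length(x)
}

x <- c(3, 5, 4)
y <- c(1, 2, 6)
cc_level(x, y)  # 3 of the 9 pairs satisfy xi - yj <= 0
pd_level(x, y)  # 1 of the 3 pairs satisfies xi - yi <= 0
```

In this toy input only the differences involving y3 = 6 are non-positive, which makes the counts easy to verify by hand.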

Note that when the argument detail is set to TRUE and the two distributions are independent, the function may take a long time and require a large amount of memory, because every pairwise difference is stored in a vector. For example, when distr1 and distr2 each contain 10,000 elements (observations), the vector of differences contains 100,000,000 elements. If memory is insufficient, R stops running the function and shows an error message related to the memory limitation.
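
To put the memory cost in perspective: R stores each double-precision value in 8 bytes, so a vector of 100,000,000 differences needs roughly 800 MB before any copies are made. If only the significance level is needed, the count can be accumulated without materializing that vector. The sketch below shows one way to do this in base R. The function name cc_level_lowmem is hypothetical, and this is not necessarily how mded itself works.

```r
# Approximate size of the full difference vector for two
# distributions of 10,000 elements each (8 bytes per double).
1e4 * 1e4 * 8 / 1e6  # size in megabytes (decimal)

# Hypothetical helper (not part of mded): count xi - yj <= 0 one
# xi at a time, so at most length(y) differences exist at once.
cc_level_lowmem <- function(x, y) {
  hits <- vapply(x, function(xi) sum(xi - y <= 0), numeric(1))
  sum(hits) / (length(x) * length(y))
}

set.seed(1)
a <- rnorm(200, 1)
b <- rnorm(200, 0)

# Agrees with the direct computation on the full set of differences.
all.equal(cc_level_lowmem(a, b), mean(outer(a, b, "-") <= 0))
```

The trade-off is that the individual differences are discarded, so this approach suits only cases where the equivalent of detail = FALSE is enough.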

Value

stat

One-sided significance level of the difference between distr1 and distr2.

means

A vector of mean values of distr1 and distr2.

cases

A vector of integer values giving the number of cases in which cond is true and the number in which it is false.

distr1

A vector assigned to distr1.

distr2

A vector assigned to distr2.

distr.names

A vector of the names of objects assigned to distr1 and distr2.

diff

A vector of the differences. It is returned only if detail = TRUE.

Author(s)

Hideo Aizaki

References

Poe GL, Giraud KL, Loomis JB (2005). Computational methods for measuring the difference of empirical distributions. American Journal of Agricultural Economics, 87, 353–365.

Poe GL, Severance-Lossin EK, Welsh MP (1994). Measuring the difference (X - Y) of simulated distributions: A convolutions approach. American Journal of Agricultural Economics, 76, 904–915.

Poe GL, Welsh MP, Champ PA (1997). Measuring the difference in mean willingness to pay when dichotomous choice contingent valuation responses are not independent. Land Economics, 73, 255–267.

Examples

set.seed(123)
x <- rnorm(100, 3)
y <- rnorm(100, 1)

out <- mded(distr1 = x, distr2 = y, detail = TRUE)
out

Example output

Test:
H0  x = y 
H1  x > y 
significance level = 0.054 

Data:
distr1 = x 
distr2 = y 

Means:
    means    n
x  3.0904  100
y  0.8925  100

Cases in the difference:
           n
true     540
false   9460
total  10000
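
The quantities in this output can be recomputed in base R as a sanity check. The sketch below assumes that the "true" row counts the differences satisfying distr1 - distr2 <= 0, as described in Details. The variable names are illustrative and the code does not call mded.

```r
# Base R cross-check of the example above; mded is not required.
set.seed(123)
x <- rnorm(100, 3)
y <- rnorm(100, 1)

d <- as.vector(outer(x, y, "-"))  # all 100 * 100 differences xi - yj

means <- c(x = mean(x), y = mean(y))
cases <- c(true = sum(d <= 0), false = sum(d > 0), total = length(d))

means
cases
cases[["true"]] / cases[["total"]]  # one-sided significance level
```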

mded documentation built on May 1, 2019, 9:22 p.m.