ks_test: Weighted KS Test

ks_test R Documentation

Weighted KS Test

Description

Weighted Kolmogorov-Smirnov Two-Sample Test with threshold

Usage

ks_test(x, y, thresh = 0.05, w_x = rep(1, length(x)), w_y = rep(1, length(y)))

Arguments

x

Vector of values sampled from the first distribution

y

Vector of values sampled from the second distribution

thresh

The threshold that the difference between the two cumulative distributions must exceed

w_x

The observation weights for x

w_y

The observation weights for y

Details

The usual Kolmogorov-Smirnov test for two vectors X and Y, of sizes m and n, relies on the empirical cdfs E_x and E_y and the test statistic

D = sup_{t \in (X, Y)} |E_x(t) - E_y(t)|.

This modified Kolmogorov-Smirnov test relies on two modifications.

  • Using observation weights for both vectors X and Y: the weights enter the usual KS test in two places. First, the empirical cdfs are updated to account for the weights. Second, the effective sample sizes are also modified. This is inspired by https://stackoverflow.com/a/55664242/13768995, using Monahan (2011).

  • Testing against a threshold: the test statistic is thresholded such that D = max(D - thresh, 0). Since 0 ≤ D ≤ 1, the threshold also lies between 0 and 1 and represents an effect size for the difference.
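
The two modifications above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names weighted_ecdf, thresholded_ks_stat and effective_n are hypothetical, and the effective sample size uses Kish's formula sum(w)^2 / sum(w^2), which is an assumption here and may differ from what ks_test computes internally.

```r
# Hypothetical helper: weighted empirical cdf of `x`, evaluated at points `t`.
weighted_ecdf <- function(x, w, t) {
  vapply(t, function(ti) sum(w[x <= ti]) / sum(w), numeric(1))
}

# Hypothetical helper: thresholded two-sample statistic max(D - thresh, 0).
thresholded_ks_stat <- function(x, y, thresh = 0.05,
                                w_x = rep(1, length(x)),
                                w_y = rep(1, length(y))) {
  # Both cdfs are step functions, so the sup is attained at an observed point.
  t <- sort(unique(c(x, y)))
  d <- max(abs(weighted_ecdf(x, w_x, t) - weighted_ecdf(y, w_y, t)))
  max(d - thresh, 0)
}

# Assumed (Kish) effective sample size; equals length(w) for equal weights.
effective_n <- function(w) sum(w)^2 / sum(w^2)

x <- c(0.1, 0.2, 0.3, 0.4)
y <- c(0.6, 0.7, 0.8, 0.9)
# The samples do not overlap, so the unthresholded D is 1.
thresholded_ks_stat(x, y, thresh = 0.05)
effective_n(c(1, 1, 1, 1))
```

With equal weights the weighted cdf reduces to the usual empirical cdf, so setting thresh = 0 in this sketch recovers the classical statistic D.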

Value

A list with class "htest" containing the following components:

  • statistic: the value of the test statistic.

  • p.value: the p-value of the test.

  • alternative: a character string describing the alternative hypothesis.

  • method: a character string indicating what type of test was performed.

  • data.name: a character string giving the name(s) of the data.
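
The components listed above are read with the usual list accessors. The sketch below uses stats::ks.test from base R as a stand-in, since it also returns an object inheriting from "htest"; the same accessors should apply to the result of ks_test, assuming it follows the structure documented here.

```r
# stats::ks.test also returns an "htest" list, so it stands in for ks_test here.
set.seed(1)
res <- stats::ks.test(runif(50), runif(50, min = 0.2, max = 1.2))

inherits(res, "htest")  # check the class of the result
res$statistic           # value of the test statistic
res$p.value             # p-value of the test
res$alternative         # description of the alternative hypothesis
res$method              # type of test performed
res$data.name           # name(s) of the data
```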

References

Monahan, J. (2011). Numerical Methods of Statistics (2nd ed., Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511977176

Examples

 library(Ecume)
 x <- runif(100)
 # min = max = .5, so every value of y equals .5 (a point mass)
 y <- runif(100, min = .5, max = .5)
 ks_test(x, y, thresh = .001)

HectorRDB/Ecume documentation built on June 21, 2022, 7:13 a.m.