# mixed_ks_c_cdf: Computes the complementary cumulative distribution function... In KSgeneral: Computing P-Values of the K-S Test for (Dis)Continuous Null Distribution

## Description

Computes the complementary cdf, P(D_{n} ≥ q), at a fixed q ∈ [0, 1], of the one-sample two-sided Kolmogorov-Smirnov statistic when the cdf F(x) under the null hypothesis is mixed. It uses the Exact-KS-FFT method, which expresses the p-value as a double-boundary non-crossing probability for a homogeneous Poisson process and then computes that probability efficiently using FFT (see Dimitrova, Kaishev, Tan (2020)).

## Usage

    mixed_ks_c_cdf(q, n, jump_points, Mixed_dist, ..., tol = 1e-10)

## Arguments

| Argument | Description |
|---|---|
| q | numeric value between 0 and 1, at which the complementary cdf P(D_{n} ≥ q) is computed |
| n | the sample size |
| jump_points | a numeric vector containing the points of (jump) discontinuity, i.e. where the underlying cdf F(x) has jump(s) |
| Mixed_dist | a pre-specified (user-defined) mixed cdf, F(x), under the null hypothesis |
| ... | values of the parameters of the cdf, F(x), specified (as a character string) by Mixed_dist |
| tol | the value of ε that is used to compute the values of A_{i} and B_{i}, i = 1, ..., n, as detailed in Step 1 of Section 2.1 in Dimitrova, Kaishev and Tan (2020) (see also (ii) in the Procedure Exact-KS-FFT therein). By default, tol = 1e-10. Note that a value of NA or 0 will lead to an error! |
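The constraints on the arguments can be checked before the call is made. The following is a minimal sketch only: `check_mixed_ks_args` is a hypothetical helper that is not part of KSgeneral, and it assumes that `Mixed_dist` is passed as an R function, as in the Examples section.

```r
# Hypothetical helper (NOT part of KSgeneral): validates the inputs
# described in the Arguments section before calling mixed_ks_c_cdf().
# Assumption: Mixed_dist is supplied as an R function, as in the Examples.
check_mixed_ks_args <- function(q, n, jump_points, Mixed_dist, tol = 1e-10) {
  stopifnot(
    # q must be a single number in [0, 1]
    is.numeric(q), length(q) == 1, !is.na(q), q >= 0, q <= 1,
    # n must be a single positive whole number
    is.numeric(n), length(n) == 1, !is.na(n), n >= 1, n == round(n),
    # jump points must be numeric with no missing values
    is.numeric(jump_points), !anyNA(jump_points),
    is.function(Mixed_dist),
    # tol must be strictly positive; NA or 0 is documented to cause an error
    is.numeric(tol), length(tol) == 1, !is.na(tol), tol > 0
  )
  invisible(TRUE)
}

# A toy cdf with a single jump at 0, used only to exercise the checks
step_cdf <- function(x) if (x < 0) 0 else 1
ok <- check_mixed_ks_args(0.1, 25, c(0), step_cdf)
```

Failing early in this way gives a clearer message than an error raised from inside the numerical routine, for example when `tol = 0` is passed by mistake.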

## Details

Given a random sample {X_{1}, ..., X_{n}} of size n with an empirical cdf F_{n}(x), the Kolmogorov-Smirnov goodness-of-fit statistic is defined as D_{n} = sup_{x} | F_{n}(x) - F(x) |, where F(x) is the cdf of a prespecified theoretical distribution under the null hypothesis H_{0} that {X_{1}, ..., X_{n}} comes from F(x).
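When F(x) has jumps, the supremum in this definition need not be attained at a point where F is continuous, so it helps to write it as a finite maximum. The identity below is a standard reduction and is not taken from the package documentation. The notation is an assumption introduced here: x_{(1)} ≤ ... ≤ x_{(n)} are the ordered observations and G(x-) denotes the left limit of a function G at x.

```latex
D_n = \sup_x \bigl| F_n(x) - F(x) \bigr|
    = \max_{1 \le i \le n} \max\Bigl\{
        \bigl| F_n(x_{(i)}) - F(x_{(i)}) \bigr|,\;
        \bigl| F_n(x_{(i)}-) - F(x_{(i)}-) \bigr|
      \Bigr\},
\qquad G(x-) = \lim_{t \uparrow x} G(t).
```

The reduction holds because F_n is constant between consecutive distinct observations while F is nondecreasing, so on each such interval the absolute difference is largest at one of its two ends.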

The function mixed_ks_c_cdf implements the Exact-KS-FFT method, proposed by Dimitrova, Kaishev and Tan (2020), to compute the complementary cdf P(D_{n} ≥ q) at a value q when F(x) is mixed. The algorithm has a total worst-case run-time of order O(n^{2} log(n)).
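As a rough sanity check on the exact value, P(D_{n} ≥ q) can also be estimated by simulation. The sketch below is not the Exact-KS-FFT method; it is a plain Monte Carlo estimate for the mixed cdf used in the first example of the Examples section (jumps at 0 and log(2.5)). All helper names (`mixed_cdf`, `mixed_cdf_left`, `r_mixed`, `ks_stat`, `mc_c_cdf`) are hypothetical, and the sampler and left-limit function are derived here from that cdf rather than taken from the package.

```r
# Mixed cdf with jumps at 0 and log(2.5), vectorised over x (assumption:
# same distribution as in the package's first example).
mixed_cdf <- function(x) {
  ifelse(x < 0, 0, ifelse(x < log(2.5), 1 - 0.5 * exp(-x), 1))
}

# Left limit F(x-): differs from F(x) only at the two jump points.
mixed_cdf_left <- function(x) {
  ifelse(x <= 0, 0, ifelse(x <= log(2.5), 1 - 0.5 * exp(-x), 1))
}

# Inverse-transform sampler: mass 0.5 at 0, mass 0.2 at log(2.5),
# and a continuous part in between.
r_mixed <- function(n) {
  u <- runif(n)
  ifelse(u <= 0.5, 0, ifelse(u < 0.8, -log(2 * (1 - u)), log(2.5)))
}

# Two-sided KS statistic, comparing F_n and F at each observation
# and just to the left of it.
ks_stat <- function(x) {
  Fn      <- vapply(x, function(t) mean(x <= t), numeric(1))
  Fn_left <- vapply(x, function(t) mean(x <  t), numeric(1))
  max(abs(Fn - mixed_cdf(x)), abs(Fn_left - mixed_cdf_left(x)))
}

# Monte Carlo estimate of P(D_n >= q) under the null hypothesis.
mc_c_cdf <- function(q, n, n_sim = 2000) {
  d <- replicate(n_sim, ks_stat(r_mixed(n)))
  mean(d >= q)
}

set.seed(1)
est <- mc_c_cdf(0.1, 25)

# Optional comparison with the exact value, if KSgeneral is installed.
if (requireNamespace("KSgeneral", quietly = TRUE)) {
  exact <- KSgeneral::mixed_ks_c_cdf(0.1, 25, c(0, log(2.5)), mixed_cdf)
  print(c(monte_carlo = est, exact = exact))
}
```

The simulated estimate carries sampling error of order 1/sqrt(n_sim), so it is only useful for confirming that the exact value is plausible, not as a replacement for it.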

We have not been able to identify any alternative fast and accurate method (or software implementation) for the case when the hypothesized F(x) is mixed.

## Value

Numeric value corresponding to P(D_{n} ≥ q).

## References

Dimitrina S. Dimitrova, Vladimir K. Kaishev, Senren Tan. (2020) "Computing the Kolmogorov-Smirnov Distribution When the Underlying CDF is Purely Discrete, Mixed or Continuous". Journal of Statistical Software, 95(10): 1-42. doi:10.18637/jss.v095.i10.

## Examples

    # Compute the complementary cdf of D_{n}
    # when the underlying distribution is a mixed distribution
    # with two jumps at 0 and log(2.5),
    # as in Example 3.1 of Dimitrova, Kaishev, Tan (2020)

    ## Defining the mixed distribution

    Mixed_cdf_example <- function(x) {
      result <- 0
      if (x < 0) {
        result <- 0
      } else if (x == 0) {
        result <- 0.5
      } else if (x < log(2.5)) {
        result <- 1 - 0.5 * exp(-x)
      } else {
        result <- 1
      }
      return(result)
    }

    KSgeneral::mixed_ks_c_cdf(0.1, 25, c(0, log(2.5)), Mixed_cdf_example)

    ## Not run:
    ## Compute P(D_{n} >= q) for n = 5,
    ## q = 1/5000, 2/5000, ..., 5000/5000
    ## when the underlying distribution is a mixed distribution
    ## with four jumps at 0, 0.2, 0.8, 1.0,
    ## as in Example 2.8 of Dimitrova, Kaishev, Tan (2020)

    n <- 5
    q <- 1:5000/5000

    Mixed_cdf_example <- function(x) {
      result <- 0
      if (x < 0) {
        result <- 0
      } else if (x == 0) {
        result <- 0.2
      } else if (x < 0.2) {
        result <- 0.2 + x
      } else if (x < 0.8) {
        result <- 0.5
      } else if (x < 1) {
        result <- x - 0.1
      } else {
        result <- 1
      }
      return(result)
    }

    plot(q, sapply(q, function(x) KSgeneral::mixed_ks_c_cdf(x, n,
         c(0, 0.2, 0.8, 1.0), Mixed_cdf_example)), type = 'l')

    ## End(Not run)

KSgeneral documentation built on Jan. 13, 2021, 1:06 p.m.