# mixed_ks_test: Computes the p-value for a one-sample two-sided... In KSgeneral: Computing P-Values of the K-S Test for (Dis)Continuous Null Distribution

## Description

Computes the p-value P(D_{n} ≥ d_{n}), where d_{n} is the value of the KS test statistic computed based on a data sample \{x_{1}, ..., x_{n}\}, when F(x) is mixed, using the Exact-KS-FFT method expressing the p-value as a double-boundary non-crossing probability for a homogeneous Poisson process, which is then efficiently computed using FFT (see Dimitrova, Kaishev, Tan (2020)).

## Usage

```
mixed_ks_test(x, jump_points, Mixed_dist, ..., tol = 1e-10)
```

## Arguments

| Argument | Description |
| --- | --- |
| `x` | a numeric vector of data sample values \{x_{1}, ..., x_{n}\}. |
| `jump_points` | a numeric vector containing the points of (jump) discontinuity, i.e. where the underlying cdf F(x) has jump(s). |
| `Mixed_dist` | a pre-specified (user-defined) mixed cdf, F(x), under the null hypothesis. |
| `...` | values of the parameters of the cdf, F(x), specified (as a character string) by `Mixed_dist`. |
| `tol` | the value of ε that is used to compute the values of A_{i} and B_{i}, i = 1, ..., n, as detailed in Step 1 of Section 2.1 in Dimitrova, Kaishev and Tan (2020) (see also (ii) in the Procedure Exact-KS-FFT therein). By default, `tol = 1e-10`. Note that a value of `NA` or `0` will lead to an error! |

## Details

Given a random sample \{X_{1}, ..., X_{n}\} of size `n` with an empirical cdf F_{n}(x), the Kolmogorov-Smirnov goodness-of-fit statistic is defined as D_{n} = \sup_{x} | F_{n}(x) - F(x) |, where F(x) is the cdf of a prespecified theoretical distribution under the null hypothesis H_{0} that \{X_{1}, ..., X_{n}\} comes from F(x).
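When F(x) has jumps, the supremum in D_{n} has to take left limits into account, because the largest gap can occur just before a data point. The following is a minimal sketch of that calculation, not the package's implementation. The names `mixed_cdf`, `mixed_cdf_left` and `ks_statistic_mixed` are hypothetical, and the left-limit function F(x-) is written out by hand for the mixed cdf with jumps at 0 and log(2.5) used in the Examples section.

```r
# Mixed cdf F(x) with jumps at 0 and log(2.5) (same form as in the Examples).
mixed_cdf <- function(x) {
  if (x < 0) {
    0
  } else if (x == 0) {
    0.5
  } else if (x < log(2.5)) {
    1 - 0.5 * exp(-x)
  } else {
    1
  }
}

# Hand-derived left limit F(x-) of the cdf above (an assumption of this sketch).
mixed_cdf_left <- function(x) {
  if (x <= 0) {
    0
  } else if (x <= log(2.5)) {
    1 - 0.5 * exp(-x)
  } else {
    1
  }
}

# D_n = sup_x |F_n(x) - F(x)|. Since F_n is a step function and F is
# nondecreasing, it is enough to compare values and left limits at the
# distinct sample points.
ks_statistic_mixed <- function(x, cdf, cdf_left) {
  n <- length(x)
  pts <- sort(unique(x))
  ecdf_at <- sapply(pts, function(p) sum(x <= p) / n)   # F_n(x)
  ecdf_left <- sapply(pts, function(p) sum(x < p) / n)  # F_n(x-)
  cdf_at <- sapply(pts, cdf)                            # F(x)
  cdf_lim <- sapply(pts, cdf_left)                      # F(x-)
  max(abs(ecdf_at - cdf_at), abs(ecdf_left - cdf_lim))
}

test_data <- c(0, 0, 0, 0, 0, 0, 0.1, 0.2, 0.3, 0.4,
               0.5, 0.6, 0.7, 0.8, log(2.5), log(2.5),
               log(2.5), log(2.5), log(2.5), log(2.5))
d_n <- ks_statistic_mixed(test_data, mixed_cdf, mixed_cdf_left)
```

For this sample the largest gap is the left-limit term at x = 0.1, where F_{n}(x-) = 0.3 and F(x-) = 1 - 0.5 exp(-0.1).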

The function `mixed_ks_test` implements the Exact-KS-FFT method expressing the p-value as a double-boundary non-crossing probability for a homogeneous Poisson process, which is then efficiently computed using FFT (see Dimitrova, Kaishev, Tan (2020)). This algorithm ensures a total worst-case run-time of order O(n^{2}log(n)).

The function `mixed_ks_test` computes the p-value P(D_{n} ≥ d_{n}), where d_{n} is the value of the KS test statistic computed based on a user-provided data sample \{x_{1}, ..., x_{n}\}, when F(x) is mixed.

We have not been able to identify an alternative fast and accurate method (or software implementation) for the case when the hypothesized F(x) is mixed.

## Value

A list with class "htest" containing the following components:

| Component | Description |
| --- | --- |
| `statistic` | the value of the statistic. |
| `p.value` | the p-value of the test. |
| `alternative` | "two-sided". |
| `data.name` | a character string giving the name of the data. |
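Because the result is a list of class "htest", its components can be read with `$`. The sketch below shows this with a hypothetical helper, `htest_summary`. So that it runs without KSgeneral installed, it uses base R's `stats::ks.test` as a stand-in that also returns an "htest" object; the same helper would be expected to work on the result of `mixed_ks_test`, assuming it carries the components listed above.

```r
# Hypothetical helper: pull the documented components out of an "htest" object.
htest_summary <- function(res) {
  stopifnot(inherits(res, "htest"))
  list(
    statistic   = unname(res$statistic),  # value of the test statistic
    p.value     = res$p.value,            # p-value of the test
    alternative = res$alternative,        # e.g. "two-sided"
    data.name   = res$data.name           # name of the data
  )
}

# Stand-in "htest" object from base R (continuous null, illustration only).
sample_values <- c(0.11, 0.35, 0.48, 0.62, 0.79, 0.93)
res <- stats::ks.test(sample_values, "punif")
out <- htest_summary(res)
```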

## References

Dimitrina S. Dimitrova, Vladimir K. Kaishev, Senren Tan. (2020) "Computing the Kolmogorov-Smirnov Distribution When the Underlying CDF is Purely Discrete, Mixed or Continuous". Journal of Statistical Software, 95(10): 1-42. doi:10.18637/jss.v095.i10.

## Examples

```
# Example to compute the p-value of the one-sample two-sided KS test,
# when the underlying distribution is a mixed distribution
# with two jumps at 0 and log(2.5),
# as in Example 3.1 of Dimitrova, Kaishev, Tan (2020)

# Defining the mixed distribution

Mixed_cdf_example <- function(x)
{
  result <- 0
  if (x < 0){
    result <- 0
  } else if (x == 0){
    result <- 0.5
  } else if (x < log(2.5)){
    result <- 1 - 0.5 * exp(-x)
  } else{
    result <- 1
  }

  return (result)
}

test_data <- c(0,0,0,0,0,0,0.1,0.2,0.3,0.4,
               0.5,0.6,0.7,0.8,log(2.5),log(2.5),
               log(2.5),log(2.5),log(2.5),log(2.5))

KSgeneral::mixed_ks_test(test_data, c(0, log(2.5)),
                         Mixed_cdf_example)


## Compute the p-value of a two-sided K-S test
## when F(x) follows a zero-and-one-inflated
## beta distribution, as in Example 3.3
## of Dimitrova, Kaishev, Tan (2020)

## The data set is the proportion of inhabitants
## living within a 200 kilometer wide coastal strip
## in 232 countries in the year 2010

data("Population_Data")
mu <- 0.6189
phi <- 0.6615
a <- mu * phi
b <- (1 - mu) * phi

Mixed_cdf_example <- function(x)
{
  result <- 0
  if (x < 0){
    result <- 0
  } else if (x == 0){
    result <- 0.1141
  } else if (x < 1){
    result <- 0.1141 + 0.4795 * pbeta(x, a, b)
  } else{
    result <- 1
  }

  return (result)
}

KSgeneral::mixed_ks_test(Population_Data, c(0, 1),
                         Mixed_cdf_example)
```

KSgeneral documentation built on Jan. 13, 2021, 1:06 p.m.