Perform a Kolmogorov-Smirnov (one-sample) or Smirnov (two-sample) test.
ks.test(x, ...)
## Default S3 method:
ks.test(x, y, ...,
        alternative = c("two.sided", "less", "greater"),
        exact = NULL)
## S3 method for class 'formula'
ks.test(formula, data, subset, na.action, ...)
psmirnov(q, n.x, n.y = length(obs) - n.x, obs = NULL,
         two.sided = TRUE, exact = TRUE, lower.tail = TRUE,
         log.p = FALSE)
qsmirnov(p, n.x, n.y, two.sided = TRUE, exact = TRUE, ...)
x | a numeric vector of data values.
y | either a numeric vector of data values, or a character string naming a cumulative distribution function or an actual cumulative distribution function such as
... | parameters of the distribution specified (as a character string) by y.
alternative | indicates the alternative hypothesis and must be one of "two.sided" (default), "less", or "greater".
exact |
formula | a formula of the form
data | an optional matrix or data frame (or similar: see
subset | an optional vector specifying a subset of observations to be used.
na.action | a function which indicates what should happen when the data contain
q | a numeric vector of quantiles.
p | a numeric vector of probabilities.
n.x | length of
n.y | length of
obs | a numeric vector of all data values (
two.sided | a logical indicating whether absolute (
lower.tail | a logical, if
log.p | a logical, if
If y is numeric, a two-sample Smirnov test of the null hypothesis that x and y were drawn from the same distribution is performed.
Alternatively, y can be a character string naming a continuous (cumulative) distribution function, or such a function. In this case, a one-sample Kolmogorov-Smirnov test is carried out of the null hypothesis that the distribution function which generated x is distribution y with parameters specified by the ... arguments. The presence of ties always generates a warning in the one-sample case, since continuous distributions do not generate them. If the ties arose from rounding, the tests may be approximately valid, but even modest amounts of rounding can have a significant effect on the calculated statistic.
Missing values are silently omitted from x and (in the two-sample case) y.
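As a rough illustration of the two modes described above, the following sketch runs both tests on simulated data. The seed, sample sizes, and the choice of the standard normal as the hypothesized distribution are arbitrary assumptions made for this example only.

```r
set.seed(1)                         # arbitrary seed, for reproducibility only
x <- rnorm(40)
y <- rnorm(25, mean = 0.5)
x_na <- c(x, NA)                    # one missing value, dropped by ks.test

# y numeric: two-sample Smirnov test
two <- ks.test(x_na, y)

# y names a CDF (or is the CDF itself): one-sample Kolmogorov-Smirnov test,
# with the distribution parameters passed through ...
one_chr <- ks.test(x_na, "pnorm", mean = 0, sd = 1)
one_fun <- ks.test(x_na, pnorm, mean = 0, sd = 1)

# Tied values in the one-sample case trigger a warning; capture its message
tie_msg <- tryCatch(
  { ks.test(c(-0.3, 0.1, 0.1, 0.5), "pnorm"); NA_character_ },
  warning = function(w) conditionMessage(w)
)
tie_msg
```

Passing the function name as a string or the function object should give the same result, and the NA appended to x should not affect either test.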
The possible values "two.sided", "less" and "greater" of alternative specify the null hypothesis that the true distribution function of x is equal to, not less than or not greater than the hypothesized distribution function (one-sample case) or the distribution function of y (two-sample case), respectively. This is a comparison of cumulative distribution functions, and the test statistic is the maximum difference in value, with the statistic in the "greater" alternative being
D^+ = max_u [F_x(u) - F_y(u)].
Thus in the two-sample case alternative = "greater" includes distributions for which x is stochastically smaller than y (the CDF of x lies above and hence to the left of that for y), in contrast to t.test or wilcox.test.
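To make the direction of the one-sided statistics concrete, the sketch below recomputes D^+ by hand from the empirical CDFs and compares it with the value reported by ks.test. The simulated samples and the variable names (d_plus, d_minus) are assumptions for illustration and are not part of the interface.

```r
set.seed(42)                    # arbitrary seed
x <- rnorm(30, mean = -1)       # x tends to be smaller than y
y <- rnorm(40)

Fx <- ecdf(x)
Fy <- ecdf(y)
u  <- sort(c(x, y))             # the ECDFs only jump at observed values

d_plus  <- max(Fx(u) - Fy(u))   # statistic for alternative = "greater"
d_minus <- max(Fy(u) - Fx(u))   # statistic for alternative = "less"

gr <- ks.test(x, y, alternative = "greater")
le <- ks.test(x, y, alternative = "less")

c(by_hand = d_plus,  ks.test = unname(gr$statistic))
c(by_hand = d_minus, ks.test = unname(le$statistic))
```

Because x is shifted to the left here, its CDF lies above that of y, so it is the "greater" alternative (not "less") that should pick up the difference.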
Exact p-values are not available for the one-sample case in the presence of ties.
If exact = NULL (the default), an exact p-value is computed if the sample size is less than 100 in the
one-sample case and there are no ties, and if the product of
the sample sizes is less than 10000 in the two-sample case, with or
without ties (using the algorithm described in Schröer and Trenkler, 1995).
Otherwise, asymptotic distributions are used whose approximations may
be inaccurate in small samples. In the one-sample two-sided case,
exact p-values are obtained as described in Marsaglia, Tsang & Wang
(2003) (but not using the optional approximation in the right tail, so
this can be slow for small p-values). The formula of Birnbaum &
Tingey (1951) is used for the one-sample one-sided case.
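The default rule for exact can be checked informally as follows. This is a sketch on simulated, tie-free data with small samples (assumed sizes 20, 50 and 60), so the default should coincide with exact = TRUE in both cases.

```r
set.seed(7)                     # arbitrary seed
x <- rnorm(20)

# One-sample case: n < 100 and no ties, so exact = NULL means exact
auto   <- ks.test(x, "pnorm")
exact  <- ks.test(x, "pnorm", exact = TRUE)
asympt <- ks.test(x, "pnorm", exact = FALSE)
c(auto = auto$p.value, exact = exact$p.value, asymptotic = asympt$p.value)

# Two-sample case: 50 * 60 = 3000 < 10000, so exact = NULL means exact
a <- rnorm(50)
b <- rnorm(60)
auto2  <- ks.test(a, b)
exact2 <- ks.test(a, b, exact = TRUE)
c(auto = auto2$p.value, exact = exact2$p.value)
```

The test statistic itself does not depend on exact; only the p-value calculation changes.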
If a one-sample Kolmogorov-Smirnov test is used, the parameters specified in ... must be pre-specified and not estimated from the data. There is some more refined distribution theory for the KS test with estimated parameters (see Durbin, 1973), but that is not implemented in ks.test.
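A small simulation can suggest why estimated parameters are a problem. The sketch below is an illustration only: the sample size, number of replications, and the normal model are assumptions. It compares rejection rates at the 5% level when the null parameters are fixed in advance and when they are estimated from the same sample.

```r
set.seed(123)                   # arbitrary seed
n_sim <- 500                    # number of simulated samples (assumption)

# Parameters estimated from the data: NOT a valid use of ks.test
p_est <- replicate(n_sim, {
  x <- rnorm(30, mean = 5, sd = 2)
  ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value
})

# Parameters fixed in advance: the intended use
p_fix <- replicate(n_sim, {
  x <- rnorm(30, mean = 5, sd = 2)
  ks.test(x, "pnorm", mean = 5, sd = 2)$p.value
})

c(estimated = mean(p_est < 0.05), prespecified = mean(p_fix < 0.05))
```

With estimated parameters the fitted distribution is pulled towards the sample, so the p-values tend to be too large and the test rejects far less often than the nominal level.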
psmirnov and qsmirnov compute the distribution and quantile function of the Smirnov test for two samples.
A list with class "htest" containing the following components:
statistic | the value of the test statistic.
p.value | the p-value of the test.
alternative | a character string describing the alternative hypothesis.
method | a character string indicating what type of test was performed.
data.name | a character string giving the name(s) of the data.
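The components listed above can be read off the returned object with the usual $ accessor. A minimal sketch, with simulated input chosen only for illustration:

```r
set.seed(99)                    # arbitrary seed
res <- ks.test(rnorm(25), "pnorm")

class(res)                      # "htest"
res$statistic                   # value of the test statistic
res$p.value                     # p-value of the test
res$alternative                 # description of the alternative hypothesis
res$method                      # type of test performed
res$data.name                   # name(s) of the data
```

Since the result is an ordinary list, these components can be stored or passed on, for example when collecting p-values from several tests.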
The two-sided one-sample distribution comes via Marsaglia, Tsang and Wang (2003).
Exact distributions for the two-sample Smirnov test are computed by the algorithm proposed by Schröer (1991) and Schröer and Trenkler (1995).
Vance W. Berger and YanYan Zhou (2014). Kolmogorov–Smirnov Test: Overview. In Wiley StatsRef: Statistics Reference Online (eds N. Balakrishnan, T. Colton, B. Everitt, W. Piegorsch, F. Ruggeri and J.L. Teugels). doi: 10.1002/9781118445112.stat06558.
Zygmunt W. Birnbaum and Fred H. Tingey (1951). One-sided confidence contours for probability distribution functions. The Annals of Mathematical Statistics, 22/4, 592–596. doi: 10.1214/aoms/1177729550.
William J. Conover (1971). Practical Nonparametric Statistics. New York: John Wiley & Sons. Pages 295–301 (one-sample Kolmogorov test), 309–314 (two-sample Smirnov test).
James Durbin (1973). Distribution theory for tests based on the sample distribution function. SIAM.
George Marsaglia, Wai Wan Tsang and Jingbo Wang (2003). Evaluating Kolmogorov's distribution. Journal of Statistical Software, 8/18. doi: 10.18637/jss.v008.i18.
Gunar Schröer and Dietrich Trenkler (1995). Exact and Randomization Distributions of Kolmogorov-Smirnov Tests for Two or Three Samples. Computational Statistics & Data Analysis, 20(2), 185–202. doi: 10.1016/0167-9473(94)00040-P.
shapiro.test, which performs the Shapiro-Wilk test for normality.
require("graphics")
x <- rnorm(50)
y <- runif(30)
# Do x and y come from the same distribution?
ks.test(x, y)
# Does x come from a shifted gamma distribution with shape 3 and rate 2?
ks.test(x+2, "pgamma", 3, 2) # two-sided, exact
ks.test(x+2, "pgamma", 3, 2, exact = FALSE)
ks.test(x+2, "pgamma", 3, 2, alternative = "gr")
# test if x is stochastically larger than x2
x2 <- rnorm(50, -1)
plot(ecdf(x), xlim = range(c(x, x2)))
plot(ecdf(x2), add = TRUE, lty = "dashed")
t.test(x, x2, alternative = "g")
wilcox.test(x, x2, alternative = "g")
ks.test(x, x2, alternative = "l")
# with ties, example from Schröer and Trenkler (1995)
# D = 3 / 7, p = 0.2424242
ks.test(c(1, 2, 2, 3, 3), c(1, 2, 3, 3, 4, 5, 6), exact = TRUE)
# formula interface, see ?wilcox.test
kst <- ks.test(Ozone ~ Month, data = airquality,
               subset = Month %in% c(5, 8))
# quantile-quantile plot + confidence bands
plot(confband(kst)) # => null hypothesis not plausible,
                    # shift alternative not plausible