ScaleTests: Two- and K-Sample Scale Tests

ScaleTests    R Documentation

Two- and K-Sample Scale Tests

Description

Testing the equality of the distributions of a numeric response variable in two or more independent groups against scale alternatives.

Usage

## S3 method for class 'formula'
taha_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
taha_test(object, conf.int = FALSE, conf.level = 0.95, ...)

## S3 method for class 'formula'
klotz_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
klotz_test(object, ties.method = c("mid-ranks", "average-scores"),
           conf.int = FALSE, conf.level = 0.95, ...)

## S3 method for class 'formula'
mood_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
mood_test(object, ties.method = c("mid-ranks", "average-scores"),
          conf.int = FALSE, conf.level = 0.95, ...)

## S3 method for class 'formula'
ansari_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
ansari_test(object, ties.method = c("mid-ranks", "average-scores"),
            conf.int = FALSE, conf.level = 0.95, ...)

## S3 method for class 'formula'
fligner_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
fligner_test(object, ties.method = c("mid-ranks", "average-scores"),
             conf.int = FALSE, conf.level = 0.95, ...)

## S3 method for class 'formula'
conover_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'IndependenceProblem'
conover_test(object, conf.int = FALSE, conf.level = 0.95, ...)

Arguments

formula

a formula of the form y ~ x | block where y is a numeric variable, x is a factor and block is an optional factor for stratification.

data

an optional data frame containing the variables in the model formula.

subset

an optional vector specifying a subset of observations to be used. Defaults to NULL.

weights

an optional formula of the form ~ w defining integer valued case weights for each observation. Defaults to NULL, implying equal weight for all observations.

object

an object inheriting from class "IndependenceProblem".

conf.int

a logical indicating whether a confidence interval for the ratio of scales should be computed. Defaults to FALSE.

conf.level

a numeric, confidence level of the interval. Defaults to 0.95.

ties.method

a character, the method used to handle ties: the score generating function either uses mid-ranks ("mid-ranks", default) or averages the scores of randomly broken ties ("average-scores").

...

further arguments to be passed to independence_test().
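The two ties.method options can be illustrated in base R. The following is a minimal sketch, not coin's internal code: it assumes the textbook Mood score (r - (n + 1) / 2)^2, and the names mood_score, mid_rank_scores and avg_scores are hypothetical helpers introduced for illustration only.

```r
# Toy response with one pair of tied values (hypothetical data)
y <- c(1, 2, 2, 3)
n <- length(y)

# Textbook Mood score: squared distance of a rank from the mean rank
mood_score <- function(r) (r - (n + 1) / 2)^2

# "mid-ranks": tied observations share their average rank, then get scored
mid_rank_scores <- mood_score(rank(y, ties.method = "average"))

# "average-scores": break ties, score each rank, then average within ties
avg_scores <- ave(mood_score(rank(y, ties.method = "first")), y)

mid_rank_scores
avg_scores
```

For the two tied observations the approaches differ: scoring the shared mid-rank 2.5 gives 0, whereas averaging the scores of ranks 2 and 3 gives 0.25.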

Details

taha_test(), klotz_test(), mood_test(), ansari_test(), fligner_test() and conover_test() provide the Taha test, the Klotz test, the Mood test, the Ansari-Bradley test, the Fligner-Killeen test and the Conover-Iman test, respectively. A general description of these methods is given by Hollander and Wolfe (1999). For the adjustment of scores for tied values, see Hájek, Šidák and Sen (1999, pp. 133–135).

The null hypothesis of equality, or conditional equality given block, of the distribution of y in the groups defined by x is tested against scale alternatives. In the two-sample case, the two-sided null hypothesis is H_0: V(Y_1) / V(Y_2) = 1, where V(Y_s) is the variance of the responses in the s-th sample. When alternative = "less", the null hypothesis is H_0: V(Y_1) / V(Y_2) \ge 1. When alternative = "greater", the null hypothesis is H_0: V(Y_1) / V(Y_2) \le 1. Confidence intervals for the ratio of scales are available and computed according to Bauer (1972).
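As a hedged sketch of how a one-sided test and a confidence interval might be requested, the following uses simulated data (the data frame dat and its columns y and g are made up for illustration). It assumes that alternative is passed on to independence_test() through the ... argument and that both samples share a common location.

```r
library(coin)

# Simulated two-sample data: group "A" has the smaller spread (assumption)
set.seed(290875)
dat <- data.frame(
    y = c(rnorm(20, sd = 1), rnorm(20, sd = 3)),
    g = gl(2, 20, labels = c("A", "B"))
)

# One-sided test: is group "A" less variable than group "B"?
at_less <- ansari_test(y ~ g, data = dat, alternative = "less")
pvalue(at_less)

# Two-sided test with a 90% confidence interval for the ratio of scales
at_ci <- ansari_test(y ~ g, data = dat, conf.int = TRUE, conf.level = 0.9)
confint(at_ci)
```

The interval is only meaningful in the two-sample case, since it refers to the ratio of the two scales.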

The Fligner-Killeen test uses median centering in each of the samples, as suggested by Conover, Johnson and Johnson (1981), whereas the Conover-Iman test, following Conover and Iman (1978), uses mean centering in each of the samples.
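The difference between the two centering schemes can be sketched in base R. This is an illustration under stated assumptions, not coin's implementation: it uses the textbook score functions (normal scores of the ranked absolute deviations for Fligner-Killeen, squared ranks for Conover-Iman), and all object names are hypothetical.

```r
# Toy data: two groups of three observations (hypothetical values)
y <- c(1, 2, 6, 10, 20, 60)
g <- gl(2, 3, labels = c("A", "B"))
n <- length(y)

# Fligner-Killeen: absolute deviations from the group medians
dev_median <- abs(y - ave(y, g, FUN = median))

# Conover-Iman: absolute deviations from the group means
dev_mean <- abs(y - ave(y, g, FUN = mean))

# Textbook scores computed from the pooled ranks of the deviations
fk_scores <- qnorm(0.5 + rank(dev_median) / (2 * (n + 1)))  # normal scores
ci_scores <- rank(dev_mean)^2                                # squared ranks

dev_median
dev_mean
```

Median centering is less sensitive to outliers within a group, which is the usual motivation for preferring it with skewed or heavy-tailed data.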

The conditional null distribution of the test statistic is used to obtain p-values. By default, an asymptotic approximation of the exact distribution is used (distribution = "asymptotic"). Alternatively, the distribution can be approximated via Monte Carlo resampling or, for univariate two-sample problems, computed exactly by setting distribution to "approximate" or "exact", respectively. See asymptotic(), approximate() and exact() for details.
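The three ways of obtaining the null distribution can be compared on a small data set. The sketch below uses made-up values without ties (the data frame dat and the p_* names are hypothetical) and the Ansari-Bradley test, for which the page's own examples already use distribution = "exact".

```r
library(coin)

# Small two-sample data set without ties (hypothetical values)
dat <- data.frame(
    y = c(-0.2, 0.1, 0.3, -0.4, 0.5, 2.1, -3.2, 1.7, -2.6, 3.9),
    g = gl(2, 5, labels = c("narrow", "wide"))
)

# Default: asymptotic approximation
p_asy <- pvalue(ansari_test(y ~ g, data = dat))

# Monte Carlo approximation; the seed makes the resampling reproducible
set.seed(290875)
p_mc <- pvalue(ansari_test(y ~ g, data = dat,
                           distribution = approximate(nresample = 10000)))

# Exact conditional distribution (univariate two-sample problems only)
p_ex <- pvalue(ansari_test(y ~ g, data = dat, distribution = "exact"))

c(asymptotic = as.numeric(p_asy),
  approximate = as.numeric(p_mc),
  exact = as.numeric(p_ex))
```

With samples this small the asymptotic p-value can differ noticeably from the exact one, so the exact or Monte Carlo version is usually the safer choice.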

The example section uses data from Hollander and Wolfe (1999).

Value

An object inheriting from class "IndependenceTest". Confidence intervals can be extracted by confint().

Note

In the two-sample case, a large value of the Ansari-Bradley statistic indicates that sample 1 is less variable than sample 2, whereas large values of the statistics due to Taha, Klotz, Mood, Fligner-Killeen and Conover-Iman indicate that sample 1 is more variable than sample 2.
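The opposite orientation of the statistics can be checked on a toy example. In this sketch the first sample ("narrow") is clearly less variable than the second; the data and object names are hypothetical, and statistic() is used with its default type to extract the standardized test statistic.

```r
library(coin)

# Sample 1 ("narrow") is less variable than sample 2 ("wide")
dat <- data.frame(
    y = c(-0.2, 0.1, 0.3, -0.4, 0.5, 2.1, -3.2, 1.7, -2.6, 3.9),
    g = gl(2, 5, labels = c("narrow", "wide"))
)

# Ansari-Bradley: central observations receive the largest scores
ab_stat <- statistic(ansari_test(y ~ g, data = dat))

# Mood: extreme observations receive the largest scores
mood_stat <- statistic(mood_test(y ~ g, data = dat))

c(ansari = as.numeric(ab_stat), mood = as.numeric(mood_stat))
```

Here the Ansari-Bradley statistic should come out positive and the Mood statistic negative, which matters when reading one-sided results across the different tests.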

References

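Bauer, D. F. (1972). Constructing confidence sets using rank statistics. Journal of the American Statistical Association 67(339), 687–690.

Conover, W. J. and Iman, R. L. (1978). Some exact tables for the squared ranks test. Communications in Statistics – Simulation and Computation 7(5), 491–513.

Conover, W. J., Johnson, M. E. and Johnson, M. M. (1981). A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics 23(4), 351–361.

Hájek, J., Šidák, Z. and Sen, P. K. (1999). Theory of Rank Tests, Second Edition. San Diego: Academic Press.

Hollander, M. and Wolfe, D. A. (1999). Nonparametric Statistical Methods, Second Edition. New York: John Wiley & Sons.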

Examples

## Load the package that provides the tests and helpers used below
library(coin)

## Serum Iron Determination Using Hyland Control Sera
## Hollander and Wolfe (1999, p. 147, Tab. 5.1)
sid <- data.frame(
    serum = c(111, 107, 100, 99, 102, 106, 109, 108, 104, 99,
              101, 96, 97, 102, 107, 113, 116, 113, 110, 98,
              107, 108, 106, 98, 105, 103, 110, 105, 104,
              100, 96, 108, 103, 104, 114, 114, 113, 108, 106, 99),
    method = gl(2, 20, labels = c("Ramsay", "Jung-Parekh"))
)

## Asymptotic Ansari-Bradley test
ansari_test(serum ~ method, data = sid)

## Exact Ansari-Bradley test
pvalue(ansari_test(serum ~ method, data = sid,
                   distribution = "exact"))


## Platelet Counts of Newborn Infants
## Hollander and Wolfe (1999, p. 171, Tab. 5.4)
platelet <- data.frame(
    counts = c(120, 124, 215, 90, 67, 95, 190, 180, 135, 399,
               12, 20, 112, 32, 60, 40),
    treatment = factor(rep(c("Prednisone", "Control"), c(10, 6)))
)

## Approximative (Monte Carlo) Lepage test
## Hollander and Wolfe (1999, p. 172)
lepage_trafo <- function(y)
    cbind("Location" = rank_trafo(y), "Scale" = ansari_trafo(y))

independence_test(counts ~ treatment, data = platelet,
                  distribution = approximate(nresample = 10000),
                  ytrafo = function(data)
                      trafo(data, numeric_trafo = lepage_trafo),
                  teststat = "quadratic")

## Why was the null hypothesis rejected?
## Note: maximum statistic instead of quadratic form
ltm <- independence_test(counts ~ treatment, data = platelet,
                         distribution = approximate(nresample = 10000),
                         ytrafo = function(data)
                             trafo(data, numeric_trafo = lepage_trafo))

## Step-down adjustment suggests a difference in location
pvalue(ltm, method = "step-down")

## The same results are obtained from the simple Sidak-Holm procedure since the
## correlation between Wilcoxon and Ansari-Bradley test statistics is zero
cov2cor(covariance(ltm))
pvalue(ltm, method = "step-down", distribution = "marginal", type = "Sidak")

coin documentation built on June 30, 2026, 9:06 a.m.