Testing the independence of two nominal or ordered factors.

```
## S3 method for class 'formula'
chisq_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
chisq_test(object, ...)
## S3 method for class 'IndependenceProblem'
chisq_test(object, ...)

## S3 method for class 'formula'
cmh_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
cmh_test(object, ...)
## S3 method for class 'IndependenceProblem'
cmh_test(object, ...)

## S3 method for class 'formula'
lbl_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
lbl_test(object, ...)
## S3 method for class 'IndependenceProblem'
lbl_test(object, distribution = c("asymptotic", "approximate", "none"), ...)
```

`formula` |
a formula of the form |

`data` |
an optional data frame containing the variables in the model formula. |

`subset` |
an optional vector specifying a subset of observations to be used. Defaults
to |

`weights` |
an optional formula of the form |

`object` |
an object inheriting from classes |

`distribution` |
a character, the conditional null distribution of the test statistic can be
approximated by its asymptotic distribution ( |

`...` |
further arguments to be passed to |

`chisq_test`, `cmh_test` and `lbl_test` provide the Pearson
chi-squared test, the generalized Cochran-Mantel-Haenszel test and the
linear-by-linear association test. A general description of these methods is
given by Agresti (2002).

The null hypothesis of independence, or conditional independence given
`block`, between `y` and `x` is tested.
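The formula interface can be sketched as follows. This is a minimal illustration, assuming the functions come from the coin package and that a blocking factor is written after a `|` in the formula. The data frame `toy` and its counts are made up for the example.

```
## Hypothetical data: cell counts for y, x and a blocking factor
library(coin)

counts <- expand.grid(y = c("no", "yes"),
                      x = c("low", "mid", "high"),
                      block = c("s1", "s2"))
counts$n <- c(8, 4, 6, 6, 3, 9, 7, 5, 5, 6, 4, 8)

## One row per observation, keeping only the three factors
toy <- counts[rep(seq_len(nrow(counts)), counts$n), c("y", "x", "block")]

## Independence of y and x, ignoring the blocks
ct <- chisq_test(y ~ x, data = toy)
ct

## Conditional independence of y and x given block
cmh <- cmh_test(y ~ x | block, data = toy)
cmh
```

Both calls use the default asymptotic distribution, so the reported *p*-values are approximations.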

If `y` and/or `x` are ordered factors, the default scores,
`1:nlevels(y)` and `1:nlevels(x)` respectively, can be altered using
the `scores` argument (see `independence_test`); this
argument can also be used to coerce nominal factors to class `"ordered"`.
(`lbl_test` coerces to class `"ordered"` under any circumstances.)
If both `y` and `x` are ordered factors, a linear-by-linear
association test is computed and the direction of the alternative hypothesis
can be specified using the `alternative` argument. For the Pearson
chi-squared test, this extension was given by Yates (1948) who also discussed
the situation when either the response or the covariate is an ordered factor;
see also Cochran (1954) and Armitage (1955) for the particular case when
`y` is a binary factor and `x` is ordered. The Mantel-Haenszel
statistic was similarly extended by Mantel (1963) and Landis, Heyman and Koch
(1978).
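To make the role of the default scores concrete, the sketch below compares the squared linear-by-linear statistic with the textbook form (n - 1) r^2, where r is the Pearson correlation of the row and column scores. It assumes an unstratified two-way table, the coin package, and the default scores `1:nlevels(.)`. The table `tab` is invented for illustration.

```
## Hypothetical 2 x 3 table with both margins treated as ordered
library(coin)

tab <- as.table(matrix(c(10, 5, 3,
                          6, 8, 9),
                       nrow = 2, byrow = TRUE,
                       dimnames = list("Y" = c("lo", "hi"),
                                       "X" = c("a", "b", "c"))))

## Linear-by-linear association test with default scores
lt <- lbl_test(tab)
m2_coin <- as.numeric(statistic(lt))^2

## Same quantity by hand: expand the table to one score pair per observation
df <- as.data.frame(tab)
y_score <- rep(as.integer(df$Y), df$Freq)   # scores 1:nlevels(Y)
x_score <- rep(as.integer(df$X), df$Freq)   # scores 1:nlevels(X)
n <- sum(tab)
m2_hand <- (n - 1) * cor(x_score, y_score)^2

## The two values should agree up to numerical error
all.equal(m2_coin, m2_hand)
```

Passing a different `scores` list changes the score vectors on both sides of this comparison in the same way.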

The conditional null distribution of the test statistic is used to obtain
*p*-values and an asymptotic approximation of the exact distribution is
used by default (`distribution = "asymptotic"`). Alternatively, the
distribution can be approximated via Monte Carlo resampling or computed
exactly for univariate two-sample problems by setting `distribution` to
`"approximate"` or `"exact"` respectively. See `asymptotic`, `approximate`
and `exact` for details.

An object inheriting from class `"IndependenceTest"`.

The exact versions of the Pearson chi-squared test and the generalized
Cochran-Mantel-Haenszel test do not necessarily result in the same
*p*-value as Fisher's exact test (Davis, 1986).

Agresti, A. (2002). *Categorical Data Analysis*, Second Edition.
Hoboken, New Jersey: John Wiley & Sons.

Armitage, P. (1955). Tests for linear trends in proportions and frequencies.
*Biometrics* **11**(3), 375–386.

Cochran, W. G. (1954). Some methods for strengthening the common *χ²*
tests. *Biometrics* **10**(4), 417–451.

Davis, L. J. (1986). Exact tests for 2 × 2 contingency tables.
*The American Statistician* **40**(2), 139–141.

Landis, J. R., Heyman, E. R. and Koch, G. G. (1978). Average partial
association in three-way contingency tables: a review and discussion of
alternative tests. *International Statistical Review* **46**(3),
237–254.

Mantel, N. (1963). Chi-square tests with one degree of freedom: extensions
of the Mantel-Haenszel procedure. *Journal of the American Statistical
Association* **58**(303), 690–700.

Yates, F. (1948). The analysis of contingency tables with groupings based on
quantitative characters. *Biometrika* **35**(1/2), 176–181.

```
## Load the package providing the tests and the 'jobsatisfaction' data
library(coin)

## Example data
## Davis (1986, p. 140)
davis <- matrix(
    c(3,  6,
      2, 19),
    nrow = 2, byrow = TRUE
)
davis <- as.table(davis)

## Asymptotic Pearson chi-squared test
chisq_test(davis)

## Approximative (Monte Carlo) Pearson chi-squared test
ct <- chisq_test(davis,
                 distribution = approximate(B = 10000))
pvalue(ct)          # standard p-value
midpvalue(ct)       # mid-p-value
pvalue_interval(ct) # p-value interval

## Exact Pearson chi-squared test (Davis, 1986)
## Note: disagrees with Fisher's exact test
ct <- chisq_test(davis,
                 distribution = "exact")
pvalue(ct)          # standard p-value
midpvalue(ct)       # mid-p-value
pvalue_interval(ct) # p-value interval
fisher.test(davis)


## Laryngeal cancer data
## Agresti (2002, p. 107, Tab. 3.13)
cancer <- matrix(
    c(21, 2,
      15, 3),
    nrow = 2, byrow = TRUE,
    dimnames = list(
        "Treatment" = c("Surgery", "Radiation"),
           "Cancer" = c("Controlled", "Not Controlled")
    )
)
cancer <- as.table(cancer)

## Exact Pearson chi-squared test (Agresti, 2002, p. 108, Tab. 3.14)
## Note: agrees with Fisher's exact test
(ct <- chisq_test(cancer,
                  distribution = "exact"))
midpvalue(ct)       # mid-p-value
pvalue_interval(ct) # p-value interval
fisher.test(cancer)


## Homework conditions and teacher's rating
## Yates (1948, Tab. 1)
yates <- matrix(
    c(141, 67, 114, 79, 39,
      131, 66, 143, 72, 35,
       36, 14,  38, 28, 16),
    byrow = TRUE, ncol = 5,
    dimnames = list(
           "Rating" = c("A", "B", "C"),
        "Condition" = c("A", "B", "C", "D", "E")
    )
)
yates <- as.table(yates)

## Asymptotic Pearson chi-squared test (Yates, 1948, p. 176)
chisq_test(yates)

## Asymptotic Pearson-Yates chi-squared test (Yates, 1948, pp. 180-181)
## Note: 'Rating' and 'Condition' as ordinal
(ct <- chisq_test(yates,
                  alternative = "less",
                  scores = list("Rating" = c(-1, 0, 1),
                                "Condition" = c(2, 1, 0, -1, -2))))
statistic(ct)^2 # chi^2 = 2.332

## Asymptotic Pearson-Yates chi-squared test (Yates, 1948, p. 181)
## Note: 'Rating' as ordinal
chisq_test(yates,
           scores = list("Rating" = c(-1, 0, 1))) # Q = 3.825


## Change in clinical condition and degree of infiltration
## Cochran (1954, Tab. 6)
cochran <- matrix(
    c(11,  7,
      27, 15,
      42, 16,
      53, 13,
      11,  1),
    byrow = TRUE, ncol = 2,
    dimnames = list(
              "Change" = c("Marked", "Moderate", "Slight",
                           "Stationary", "Worse"),
        "Infiltration" = c("0-7", "8-15")
    )
)
cochran <- as.table(cochran)

## Asymptotic Pearson chi-squared test (Cochran, 1954, p. 435)
chisq_test(cochran) # X^2 = 6.88

## Asymptotic Cochran-Armitage test (Cochran, 1954, p. 436)
## Note: 'Change' as ordinal
(ct <- chisq_test(cochran,
                  scores = list("Change" = c(3, 2, 1, 0, -1))))
statistic(ct)^2 # X^2 = 6.66


## Change in size of ulcer crater for two treatment groups
## Armitage (1955, Tab. 2)
armitage <- matrix(
    c( 6, 4, 10, 12,
      11, 8,  8,  5),
    byrow = TRUE, ncol = 4,
    dimnames = list(
        "Treatment" = c("A", "B"),
           "Crater" = c("Larger", "< 2/3 healed",
                        "=> 2/3 healed", "Healed")
    )
)
armitage <- as.table(armitage)

## Approximative (Monte Carlo) Pearson chi-squared test (Armitage, 1955, p. 379)
chisq_test(armitage,
           distribution = approximate(B = 10000)) # chi^2 = 5.91

## Approximative (Monte Carlo) Cochran-Armitage test (Armitage, 1955, p. 379)
(ct <- chisq_test(armitage,
                  distribution = approximate(B = 10000),
                  scores = list("Crater" = c(-1.5, -0.5, 0.5, 1.5))))
statistic(ct)^2 # chi_0^2 = 5.26


## Relationship between job satisfaction and income stratified by gender
## Agresti (2002, p. 288, Tab. 7.8)

## Asymptotic generalized Cochran-Mantel-Haenszel test (Agresti, p. 297)
cmh_test(jobsatisfaction) # CMH = 10.2001

## Asymptotic generalized Cochran-Mantel-Haenszel test (Agresti, p. 297)
## Note: 'Job.Satisfaction' as ordinal
cmh_test(jobsatisfaction,
         scores = list("Job.Satisfaction" = c(1, 3, 4, 5))) # L^2 = 9.0342

## Asymptotic linear-by-linear association test (Agresti, p. 297)
## Note: 'Job.Satisfaction' and 'Income' as ordinal
(lt <- lbl_test(jobsatisfaction,
                scores = list("Job.Satisfaction" = c(1, 3, 4, 5),
                              "Income" = c(3, 10, 20, 35))))
statistic(lt)^2 # M^2 = 6.1563
```

