Description Usage Arguments Details Value Note References Examples
Testing the independence of two nominal or ordered factors.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | ## S3 method for class 'formula'
chisq_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
chisq_test(object, ...)
## S3 method for class 'IndependenceProblem'
chisq_test(object, ...)
## S3 method for class 'formula'
cmh_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
cmh_test(object, ...)
## S3 method for class 'IndependenceProblem'
cmh_test(object, ...)
## S3 method for class 'formula'
lbl_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
lbl_test(object, ...)
## S3 method for class 'IndependenceProblem'
lbl_test(object, ...)
|
formula |
a formula of the form |
data |
an optional data frame containing the variables in the model formula. |
subset |
an optional vector specifying a subset of observations to be used. Defaults
to |
weights |
an optional formula of the form |
object |
an object inheriting from classes |
... |
further arguments to be passed to |
chisq_test
, cmh_test
and lbl_test
provide the Pearson
chi-squared test, the generalized Cochran-Mantel-Haenszel test and the
linear-by-linear association test. A general description of these methods is
given by Agresti (2002).
The null hypothesis of independence, or conditional independence given
block
, between y
and x
is tested.
If y
and/or x
are ordered factors, the default scores,
1:nlevels(y)
and 1:nlevels(x)
respectively, can be altered using
the scores
argument (see independence_test
); this
argument can also be used to coerce nominal factors to class "ordered"
.
(lbl_test
coerces to class "ordered"
under any circumstances.)
If both y
and x
are ordered factors, a linear-by-linear
association test is computed and the direction of the alternative hypothesis
can be specified using the alternative
argument. For the Pearson
chi-squared test, this extension was given by Yates (1948) who also discussed
the situation when either the response or the covariate is an ordered factor;
see also Cochran (1954) and Armitage (1955) for the particular case when
y
is a binary factor and x
is ordered. The Mantel-Haenszel
statistic (Mantel and Haenszel, 1959) was similarly extended by Mantel (1963)
and Landis, Heyman and Koch (1978).
The conditional null distribution of the test statistic is used to obtain
p-values and an asymptotic approximation of the exact distribution is
used by default (distribution = "asymptotic"
). Alternatively, the
distribution can be approximated via Monte Carlo resampling or computed
exactly for univariate two-sample problems by setting distribution
to
"approximate"
or "exact"
respectively. See
asymptotic
, approximate
and exact
for details.
An object inheriting from class "IndependenceTest"
.
The exact versions of the Pearson chi-squared test and the generalized Cochran-Mantel-Haenszel test do not necessarily result in the same p-value as Fisher's exact test (Davis, 1986).
Agresti, A. (2002). Categorical Data Analysis, Second Edition. Hoboken, New Jersey: John Wiley & Sons.
Armitage, P. (1955). Tests for linear trends in proportions and frequencies. Biometrics 11(3), 375–386. doi: 10.2307/3001775
Cochran, W.G. (1954). Some methods for strengthening the common χ^2 tests. Biometrics 10(4), 417–451. doi: 10.2307/3001616
Davis, L. J. (1986). Exact tests for 2 x 2 contingency tables. The American Statistician 40(2), 139–141. doi: 10.1080/00031305.1986.10475377
Landis, J. R., Heyman, E. R. and Koch, G. G. (1978). Average partial association in three-way contingency tables: a review and discussion of alternative tests. International Statistical Review 46(3), 237–254. doi: 10.2307/1402373
Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute 22(4), 719–748. doi: 10.1093/jnci/22.4.719
Mantel, N. (1963). Chi-square tests with one degree of freedom: extensions of the Mantel-Haenszel procedure. Journal of the American Statistical Association 58(303), 690–700. doi: 10.1080/01621459.1963.10500879
Yates, F. (1948). The analysis of contingency tables with groupings based on quantitative characters. Biometrika 35(1/2), 176–181. doi: 10.1093/biomet/35.1-2.176
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 | ## Example data
## Davis (1986, p. 140)
davis <- matrix(
c(3, 6,
2, 19),
nrow = 2, byrow = TRUE
)
davis <- as.table(davis)
## Asymptotic Pearson chi-squared test
chisq_test(davis)
chisq.test(davis, correct = FALSE) # same as above
## Approximative (Monte Carlo) Pearson chi-squared test
ct <- chisq_test(davis,
distribution = approximate(nresample = 10000))
pvalue(ct) # standard p-value
midpvalue(ct) # mid-p-value
pvalue_interval(ct) # p-value interval
size(ct, alpha = 0.05) # test size at alpha = 0.05 using the p-value
## Exact Pearson chi-squared test (Davis, 1986)
## Note: disagrees with Fisher's exact test
ct <- chisq_test(davis,
distribution = "exact")
pvalue(ct) # standard p-value
midpvalue(ct) # mid-p-value
pvalue_interval(ct) # p-value interval
size(ct, alpha = 0.05) # test size at alpha = 0.05 using the p-value
fisher.test(davis)
## Laryngeal cancer data
## Agresti (2002, p. 107, Tab. 3.13)
cancer <- matrix(
c(21, 2,
15, 3),
nrow = 2, byrow = TRUE,
dimnames = list(
"Treatment" = c("Surgery", "Radiation"),
"Cancer" = c("Controlled", "Not Controlled")
)
)
cancer <- as.table(cancer)
## Exact Pearson chi-squared test (Agresti, 2002, p. 108, Tab. 3.14)
## Note: agrees with Fishers's exact test
(ct <- chisq_test(cancer,
distribution = "exact"))
midpvalue(ct) # mid-p-value
pvalue_interval(ct) # p-value interval
size(ct, alpha = 0.05) # test size at alpha = 0.05 using the p-value
fisher.test(cancer)
## Homework conditions and teacher's rating
## Yates (1948, Tab. 1)
yates <- matrix(
c(141, 67, 114, 79, 39,
131, 66, 143, 72, 35,
36, 14, 38, 28, 16),
byrow = TRUE, ncol = 5,
dimnames = list(
"Rating" = c("A", "B", "C"),
"Condition" = c("A", "B", "C", "D", "E")
)
)
yates <- as.table(yates)
## Asymptotic Pearson chi-squared test (Yates, 1948, p. 176)
chisq_test(yates)
## Asymptotic Pearson-Yates chi-squared test (Yates, 1948, pp. 180-181)
## Note: 'Rating' and 'Condition' as ordinal
(ct <- chisq_test(yates,
alternative = "less",
scores = list("Rating" = c(-1, 0, 1),
"Condition" = c(2, 1, 0, -1, -2))))
statistic(ct)^2 # chi^2 = 2.332
## Asymptotic Pearson-Yates chi-squared test (Yates, 1948, p. 181)
## Note: 'Rating' as ordinal
chisq_test(yates,
scores = list("Rating" = c(-1, 0, 1))) # Q = 3.825
## Change in clinical condition and degree of infiltration
## Cochran (1954, Tab. 6)
cochran <- matrix(
c(11, 7,
27, 15,
42, 16,
53, 13,
11, 1),
byrow = TRUE, ncol = 2,
dimnames = list(
"Change" = c("Marked", "Moderate", "Slight",
"Stationary", "Worse"),
"Infiltration" = c("0-7", "8-15")
)
)
cochran <- as.table(cochran)
## Asymptotic Pearson chi-squared test (Cochran, 1954, p. 435)
chisq_test(cochran) # X^2 = 6.88
## Asymptotic Cochran-Armitage test (Cochran, 1954, p. 436)
## Note: 'Change' as ordinal
(ct <- chisq_test(cochran,
scores = list("Change" = c(3, 2, 1, 0, -1))))
statistic(ct)^2 # X^2 = 6.66
## Change in size of ulcer crater for two treatment groups
## Armitage (1955, Tab. 2)
armitage <- matrix(
c( 6, 4, 10, 12,
11, 8, 8, 5),
byrow = TRUE, ncol = 4,
dimnames = list(
"Treatment" = c("A", "B"),
"Crater" = c("Larger", "< 2/3 healed",
">= 2/3 healed", "Healed")
)
)
armitage <- as.table(armitage)
## Approximative (Monte Carlo) Pearson chi-squared test (Armitage, 1955, p. 379)
chisq_test(armitage,
distribution = approximate(nresample = 10000)) # chi^2 = 5.91
## Approximative (Monte Carlo) Cochran-Armitage test (Armitage, 1955, p. 379)
(ct <- chisq_test(armitage,
distribution = approximate(nresample = 10000),
scores = list("Crater" = c(-1.5, -0.5, 0.5, 1.5))))
statistic(ct)^2 # chi_0^2 = 5.26
## Relationship between job satisfaction and income stratified by gender
## Agresti (2002, p. 288, Tab. 7.8)
## Asymptotic generalized Cochran-Mantel-Haenszel test (Agresti, p. 297)
(ct <- cmh_test(jobsatisfaction)) # CMH = 10.2001
## The standardized linear statistic
statistic(ct, type = "standardized")
## The standardized linear statistic for each block
statistic(ct, type = "standardized", partial = TRUE)
## Asymptotic generalized Cochran-Mantel-Haenszel test (Agresti, p. 297)
## Note: 'Job.Satisfaction' as ordinal
cmh_test(jobsatisfaction,
scores = list("Job.Satisfaction" = c(1, 3, 4, 5))) # L^2 = 9.0342
## Asymptotic linear-by-linear association test (Agresti, p. 297)
## Note: 'Job.Satisfaction' and 'Income' as ordinal
(lt <- lbl_test(jobsatisfaction,
scores = list("Job.Satisfaction" = c(1, 3, 4, 5),
"Income" = c(3, 10, 20, 35))))
statistic(lt)^2 # M^2 = 6.1563
## The standardized linear statistic
statistic(lt, type = "standardized")
## The standardized linear statistic for each block
statistic(lt, type = "standardized", partial = TRUE)
|
Loading required package: survival
Asymptotic Pearson Chi-Squared Test
data: Var2 by Var1 (A, B)
chi-squared = 2.5714, df = 1, p-value = 0.1088
Pearson's Chi-squared test
data: davis
X-squared = 2.5714, df = 1, p-value = 0.1088
Warning message:
In chisq.test(davis, correct = FALSE) :
Chi-squared approximation may be incorrect
[1] 0.2884
99 percent confidence interval:
0.2767843 0.3002182
[1] 0.15385
99 percent confidence interval:
0.1446722 0.1632571
p_0 p_1
0.0193 0.2884
[1] 0.0193
[1] 0.2860301
[1] 0.1527409
p_0 p_1
0.01945181 0.28603006
[1] 0.01945181
Fisher's Exact Test for Count Data
data: davis
p-value = 0.1432
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.410317 65.723900
sample estimates:
odds ratio
4.462735
Exact Pearson Chi-Squared Test
data: Cancer by Treatment (Surgery, Radiation)
chi-squared = 0.59915, p-value = 0.6384
[1] 0.5006832
p_0 p_1
0.3629407 0.6384258
[1] 0.01143318
Fisher's Exact Test for Count Data
data: cancer
p-value = 0.6384
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.2089115 27.5538747
sample estimates:
odds ratio
2.061731
Asymptotic Pearson Chi-Squared Test
data: Condition by Rating (A, B, C)
chi-squared = 9.0928, df = 8, p-value = 0.3345
Asymptotic Linear-by-Linear Association Test
data: Condition (ordered) by Rating (A < B < C)
Z = -1.5269, p-value = 0.06339
alternative hypothesis: less
[1] 2.33154
Asymptotic Generalized Pearson Chi-Squared Test
data: Condition by Rating (A < B < C)
chi-squared = 3.8242, df = 4, p-value = 0.4303
Asymptotic Pearson Chi-Squared Test
data: Infiltration by
Change (Marked, Moderate, Slight, Stationary, Worse)
chi-squared = 6.8807, df = 4, p-value = 0.1423
Asymptotic Linear-by-Linear Association Test
data: Infiltration by
Change (Marked < Moderate < Slight < Stationary < Worse)
Z = -2.5818, p-value = 0.009829
alternative hypothesis: two.sided
[1] 6.665691
Approximative Pearson Chi-Squared Test
data: Crater by Treatment (A, B)
chi-squared = 5.9085, p-value = 0.1187
Approximative Linear-by-Linear Association Test
data: Crater (ordered) by Treatment (A, B)
Z = 2.2932, p-value = 0.0313
alternative hypothesis: two.sided
[1] 5.258804
Asymptotic Generalized Cochran-Mantel-Haenszel Test
data: Job.Satisfaction by
Income (<5000, 5000-15000, 15000-25000, >25000)
stratified by Gender
chi-squared = 10.2, df = 9, p-value = 0.3345
Very Dissatisfied A Little Satisfied Moderately Satisfied
<5000 1.3112789 0.69201053 -0.2478705
5000-15000 0.6481783 0.83462550 0.5175755
15000-25000 -1.0958361 -1.50130926 0.2361231
>25000 -1.0377629 -0.08983052 -0.5946119
Very Satisfied
<5000 -0.9293458
5000-15000 -1.6257547
15000-25000 1.4614123
>25000 1.2031648
Very Dissatisfied A Little Satisfied Moderately Satisfied
<5000 1.3112789 0.69201053 -0.2478705
5000-15000 0.6481783 0.83462550 0.5175755
15000-25000 -1.0958361 -1.50130926 0.2361231
>25000 -1.0377629 -0.08983052 -0.5946119
Very Satisfied
<5000 -0.9293458
5000-15000 -1.6257547
15000-25000 1.4614123
>25000 1.2031648
Asymptotic Generalized Cochran-Mantel-Haenszel Test
data: Job.Satisfaction (ordered) by
Income (<5000, 5000-15000, 15000-25000, >25000)
stratified by Gender
chi-squared = 9.0342, df = 3, p-value = 0.02884
Asymptotic Linear-by-Linear Association Test
data: Job.Satisfaction (ordered) by
Income (<5000 < 5000-15000 < 15000-25000 < >25000)
stratified by Gender
Z = 2.4812, p-value = 0.01309
alternative hypothesis: two.sided
[1] 6.156301
2.48119
2.48119
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.