knitr::opts_chunk$set( collapse = TRUE, comment = "#ws>" )
library(Intro2MLR)
Categorical variables are an important part of MLR and of regression in general, so we need to understand them and the multinomial distribution that describes their counts.
data(irr)
irr
opinion <- irr$FREQ
names(opinion) <- irr$STRATEGY
opinion <- as.table(opinion)
opinion
Suppose we wish to test the null hypothesis that all five categories are equally likely:
$$H_0:p_i = 1/5,\;\;\forall i\in\{1,\ldots,5\}$$
test <- chisq.test(opinion)
test
We do not have sufficient evidence at the 0.05 level to reject the null hypothesis, so we retain it as plausible given the data.
You can investigate the $\chi^2$ test and its associated statistics further by looking at the components of the output:
names(test)
Notice that there are some very useful components that make up the test.
test$expected
These are calculated under the assumption that the null hypothesis is correct, so each expected count is $n p_i = n/5$, where $n$ is the total count.
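The expected counts and the test statistic can also be reproduced by hand, which shows what `chisq.test()` computes for a goodness-of-fit test. The sketch below uses made-up counts (`counts_demo` and the other `_demo` names are hypothetical and are not the `irr` data) and assumes five equally likely categories.

```r
# Hypothetical counts for five categories (NOT the irr data)
counts_demo <- c(A = 20, B = 30, C = 25, D = 15, E = 10)

n <- sum(counts_demo)            # total sample size
p0 <- rep(1 / 5, 5)              # null probabilities, p_i = 1/5
expected_demo <- n * p0          # expected counts under H0

# Pearson statistic: sum over cells of (observed - expected)^2 / expected
x2_demo <- sum((counts_demo - expected_demo)^2 / expected_demo)
df_demo <- length(counts_demo) - 1   # k - 1 degrees of freedom
p_demo <- pchisq(x2_demo, df = df_demo, lower.tail = FALSE)

# Compare the hand calculation with the built-in test
test_demo <- chisq.test(counts_demo, p = p0)
c(by_hand = x2_demo, built_in = unname(test_demo$statistic))
```

If the two numbers agree, the `expected`, `statistic`, and `p.value` components of the test object can be read as the quantities computed step by step here.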
We will look at the Three Mile Island example, as detailed in the book MS, page 453.
data(mile3)
freq <- mile3$NUMBER
mat <- matrix(freq, nrow = 2, byrow = FALSE)
dimnames(mat) <- list(c("yes", "no"), c("1-6", "7-12", "13+"))
tab <- as.table(mat)
addmargins(tab)
We wish to know whether the two directions of classification are dependent.
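If the two classifications were independent, the expected count in each cell would be the row total times the column total divided by the grand total. A minimal sketch with a made-up 2 x 3 table follows (`tab_demo` is hypothetical and is not the `mile3` data; only its labels are borrowed for readability).

```r
# Hypothetical 2 x 3 table of counts (NOT the mile3 data)
tab_demo <- matrix(c(30, 20, 25, 25, 10, 40), nrow = 2, byrow = FALSE,
                   dimnames = list(c("yes", "no"), c("1-6", "7-12", "13+")))

n <- sum(tab_demo)
# Expected counts under independence: (row total * column total) / n
expected_demo <- outer(rowSums(tab_demo), colSums(tab_demo)) / n

# Pearson statistic and its degrees of freedom, (r - 1)(c - 1)
x2_demo <- sum((tab_demo - expected_demo)^2 / expected_demo)
df_demo <- (nrow(tab_demo) - 1) * (ncol(tab_demo) - 1)

# The built-in test should give the same expected counts
out_demo <- chisq.test(tab_demo)
all.equal(unname(out_demo$expected), unname(expected_demo))
```

Large gaps between the observed and expected counts inflate the statistic, which is what counts as evidence against independence.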
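The null hypothesis is that the two factors are independent:
$$H_0: \mathtt{Attitude} \; \perp \; \mathtt{Distance}$$
To see the p-value area we can use ggplot. It helps to know that the mean of a chi-square distribution is $\nu$ and its variance is $2\nu$, so plotting out to $\nu + 4\sqrt{2\nu}$ covers most of the density.
The p-value reported in the output below gives us evidence against the null hypothesis of factor independence.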
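library(ggplot2)
out <- chisq.test(tab)
chiargs <- list(df = out$parameter)
# Green: upper-tail (p-value) area, from the observed statistic out to nu + 4 * sqrt(2 * nu)
g <- ggplot(data.frame(x = c(out$statistic, out$parameter + 4 * sqrt(2 * out$parameter))), aes(x)) +
  stat_function(fun = dchisq, args = chiargs, geom = "area", fill = "green")
# Black: the remaining area, from 0 up to the observed statistic
g <- g + stat_function(fun = dchisq, args = chiargs, geom = "area", fill = "black", xlim = c(0, out$statistic))
g <- g + xlab("X") + ylab("Density")
g
out
# Upper-tail p-value, using the test's own degrees of freedom rather than a hard-coded 2
pchisq(out$statistic, df = out$parameter, lower.tail = FALSE)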
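The plotting range above leans on the chi-square mean $\nu$ and variance $2\nu$. These can be checked numerically by integrating against the density. This is a rough sketch in which `df_demo` is an arbitrary illustrative value, not taken from the test.

```r
df_demo <- 2   # illustrative degrees of freedom (an assumption for this sketch)

# First and second moments by integrating against the chi-square density
m1 <- integrate(function(x) x * dchisq(x, df = df_demo), 0, Inf)$value
m2 <- integrate(function(x) x^2 * dchisq(x, df = df_demo), 0, Inf)$value

# Compare with df_demo and 2 * df_demo
c(mean = m1, variance = m2 - m1^2)

# Probability to the left of the upper plotting limit nu + 4 * sqrt(2 * nu)
upper <- df_demo + 4 * sqrt(2 * df_demo)
pchisq(upper, df = df_demo)
```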
To find the cut-off (critical) values at the 0.05 level, we can compute them with qchisq():
qchisq(1-0.05, df = 1:10)
Can we make sense of this? As the number of cells increases, so do the degrees of freedom, since $\nu = (r-1)(c-1)$, and the cut-off value grows with $\nu$.
q <- qchisq(1 - 0.05, df = 1:20)
diff(q)
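One way to read these differences is through a rough normal approximation: a chi-square with $\nu$ degrees of freedom has mean $\nu$ and standard deviation $\sqrt{2\nu}$, so its 0.95 quantile is roughly $\nu + 1.645\sqrt{2\nu}$. That suggests each extra degree of freedom adds a little more than 1 to the cut-off, with the increment shrinking as $\nu$ grows. The sketch below compares the two (`approx_q` is a name introduced here; the approximation is crude for small $\nu$).

```r
alpha <- 0.05
nu <- 1:20

# Exact 0.95 quantiles of the chi-square distribution
exact <- qchisq(1 - alpha, df = nu)
# Normal approximation: mean nu, standard deviation sqrt(2 * nu)
approx_q <- nu + qnorm(1 - alpha) * sqrt(2 * nu)

# Side-by-side table, with the successive differences of the exact cut-offs
round(cbind(nu, exact, approx_q, diff_exact = c(NA, diff(exact))), 3)
```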