chisq.test2: Chi-squared goodness-of-fit with intrinsic null hypothesis

View source: R/chisq-test2.R

Description

This version of the chi-squared goodness-of-fit test supports an intrinsic null hypothesis by letting the user specify the degrees of freedom of the X-squared distribution.

Usage

chisq.test2(x, p, n_est, df, rescale.p = FALSE, ...)

Arguments

x

A numeric vector. Observed values.

p

A vector of probabilities of the same length as x.

n_est

Number of estimated parameters. Not yet implemented.

df

Degrees of freedom.

...

Extra parameters to be passed to chisq.test.

Details

Under the usual extrinsic null hypothesis, the expected numbers are known before collecting data, and the degrees of freedom equal the number of classes minus 1. Indeed,...

However, under an intrinsic null hypothesis, one or more parameters are estimated from the collected data and then used to compute the expected numbers. The degrees of freedom must take this into account, so they equal the number of classes minus the number of estimated parameters minus 1.
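The degrees-of-freedom adjustment described above can be sketched with base R alone. This is a minimal illustration, not the cgmisc implementation: the helper name gof_pvalue and the observed and expected vectors are hypothetical, and only stats::chisq.test and stats::pchisq are used.

```r
# Hypothetical helper (not part of cgmisc): goodness-of-fit p-value with
# degrees of freedom reduced by the number of estimated parameters.
gof_pvalue <- function(x, p, n_est = 0) {
  # chisq.test() supplies the X-squared statistic; rescale.p = TRUE lets p
  # be given as expected counts as well as probabilities.
  stat <- unname(chisq.test(x, p = p, rescale.p = TRUE)$statistic)
  # Number of classes, minus estimated parameters, minus 1.
  df <- length(x) - n_est - 1
  # Upper-tail probability of the X-squared distribution with that df.
  p_value <- pchisq(stat, df = df, lower.tail = FALSE)
  list(statistic = stat, df = df, p.value = p_value)
}

# Made-up counts for five classes and their assumed probabilities.
observed <- c(12, 18, 25, 30, 15)
expected <- c(0.10, 0.20, 0.30, 0.25, 0.15)

extrinsic <- gof_pvalue(observed, expected)            # df = 5 - 1
intrinsic <- gof_pvalue(observed, expected, n_est = 1) # df = 5 - 1 - 1
```

Both calls share the same statistic; only the reference distribution changes, so the intrinsic p-value is smaller here because one degree of freedom is spent on the estimated parameter.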

If df is not given, the regular chisq.test is called with all the same parameters and rescale.p = TRUE.

Examples

set.seed(12345)
# Here the success probability is known in advance (extrinsic null hypothesis):
p <- 0.3
n <- 10
N <- 30
obs <- rbinom(n = N, size = n, prob = p)
freq <- as.data.frame(table(obs))
names(freq) <- c("category", "observed")
# Add the categories that were never observed, with a count of 0
freq <- merge(x = data.frame(category = 0:n),
              y = freq,
              by = "category", all = TRUE)
freq[is.na(freq)] <- 0
freq[] <- lapply(freq, as.numeric) # Force each column to be numeric (and only numeric!)
freq$expected <- dbinom(x = 0:n, size = n, prob = p) * N
# Test
test1 <- chisq.test2(freq$observed, p = freq$expected, rescale.p = TRUE)
test2 <- chisq.test(freq$observed, p = freq$expected, rescale.p = TRUE)
identical(test1, test2)
test1

# If we estimate one parameter from the observed data set (intrinsic null hypothesis):
p_est <- mean(obs) / n # Estimated success probability (mean count / size)
n_est <- 1
# Expected numbers now rely on the estimated probability
freq$expected <- dbinom(x = 0:n, size = n, prob = p_est) * N
test1 <- chisq.test2(freq$observed, p = freq$expected, n_est = n_est,
                     rescale.p = TRUE)
test2 <- chisq.test2(freq$observed, p = freq$expected,
                     df = length(freq$observed) - n_est - 1, rescale.p = TRUE)
identical(test1, test2)
test1

chgigot/cgmisc documentation built on May 14, 2019, 8:17 a.m.