fun.chisq.test | R Documentation |
Asymptotic chi-squared, normalized chi-squared or exact tests on contingency tables to determine model-free functional dependency of the column variable on the row variable.
fun.chisq.test(
x,
method = c("fchisq", "nfchisq", "adapted",
"exact", "exact.qp", "exact.dp", "exact.dqp",
"default", "normalized", "simulate.p.value"),
alternative = c("non-constant", "all"), log.p=FALSE,
index.kind = c("conditional", "unconditional"),
simulate.nruns = 2000,
exact.mode.bound=TRUE
)
x |
a matrix representing a contingency table. The row variable represents the independent variable or all unique combinations of multiple independent variables. The column variable is the dependent variable. |
method |
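a character string to specify the method to compute the functional chi-squared test statistic and its p-value. The options are "fchisq" (equivalent to "default"; this is the default), "nfchisq" (equivalent to "normalized"), "adapted", "exact", "exact.qp", "exact.dp", "exact.dqp", and "simulate.p.value", each described below. |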
alternative |
a character string to specify the alternative hypothesis. The options are "non-constant" (default) and "all". |
log.p |
logical; if TRUE, the p-value is given as log(p). Default is FALSE. |
index.kind |
a character string to specify the kind of function index xi.f to be estimated. The options are "conditional" (default) and "unconditional". |
simulate.nruns |
A number to specify the number of tables generated to simulate the null distribution. Default is 2000. |
exact.mode.bound |
logical; if |
The functional chi-squared test determines whether the column variable is a function of the row variable in contingency table x \insertCite{zhang2013deciphering,zhang2014nonparametric}{FunChisq}. This function supports the following hypothesis testing methods:
When method="fchisq" (equivalent to "default"; this is the default), the test statistic is computed as described in \insertCite{zhang2013deciphering,zhang2014nonparametric}{FunChisq} and the p-value is computed using the chi-squared distribution.
When method="nfchisq" (equivalent to "normalized"), the test statistic is obtained by shifting and scaling the original test statistic \insertCite{zhang2013deciphering,zhang2014nonparametric}{FunChisq}, and the p-value is computed using the standard normal distribution \insertCite{Box2005}{FunChisq}. The normalized chi-squared, more conservative on the degrees of freedom, was used by the Best Performer NMSUSongLab in HPN-DREAM (DREAM8) Breast Cancer Network Inference Challenges.
When method="exact", "exact.qp" (quadratic programming) \insertCite{zhong2019eft,zhong2019modelfree}{FunChisq}, "exact.dp" (dynamic programming) \insertCite{nguyen2018modelfree,Nguyen2020EFT}{FunChisq}, or "exact.dqp" (dynamic and quadratic programming) \insertCite{nguyen2018modelfree,Nguyen2020EFT}{FunChisq}, an exact functional test is performed. The "exact" option uses "exact.dqp", the fastest method. All of these methods compute an exact p-value.
When method="adapted", the adapted functional chi-squared test \insertCite{Kumar2022AFT}{FunChisq} is used. The test statistic is obtained by evaluating the most populous portrait or square (number of rows >= number of columns) table in the contingency table x. The p-value is computed using the chi-squared distribution. This option should be used to determine the functional direction between variables in x.
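As a rough illustration of the two asymptotic options, the base-R sketch below computes a functional chi-squared statistic by hand and then normalizes it. The helper name fchisq_by_hand is hypothetical and not part of FunChisq. The formula used (row-wise deviations from a uniform conditional distribution, minus the deviation of the column sums from uniform, with (R-1)(C-1) degrees of freedom) and the normalization (subtract the degrees of freedom, divide by sqrt(2 * df)) are assumptions based on the cited papers rather than code taken from the package, so fun.chisq.test remains the reference for real analyses.

```r
# Hypothetical helper, not part of FunChisq: functional chi-squared by hand.
# Assumes every row of x has a positive sum.
fchisq_by_hand <- function(x) {
  n <- sum(x)
  C <- ncol(x)
  row_exp <- rowSums(x) / C   # expected count per cell within each row
  col_exp <- n / C            # expected column sum under uniformity
  # x - row_exp recycles down columns, so entry (i, j) is compared with row_exp[i]
  stat <- sum((x - row_exp)^2 / row_exp) -
    sum((colSums(x) - col_exp)^2 / col_exp)
  list(statistic = stat, df = (nrow(x) - 1) * (C - 1))
}

x <- matrix(c(20,0,20,0,20,0,5,0,5), 3)
res <- fchisq_by_hand(x)

# Upper-tail chi-squared p-value, as described for method = "fchisq"
p_fchisq <- pchisq(res$statistic, df = res$df, lower.tail = FALSE)

# Assumed normalization for method = "nfchisq": shift by df, scale by sqrt(2 * df)
z <- (res$statistic - res$df) / sqrt(2 * res$df)
p_nfchisq <- pnorm(z, lower.tail = FALSE)

print(c(statistic = res$statistic, df = res$df, normalized = z))
print(c(p_fchisq = p_fchisq, p_nfchisq = p_nfchisq))
```

The sketch skips the input checks a production function would need, such as handling rows with a zero sum.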
For the "exact.qp" and "exact.dp" options, the exact test is performed only if the sample size is no more than 200 or the average cell count is less than five, and the table has no more than 10 rows and no more than 10 columns; otherwise, the asymptotic functional chi-squared test (method="fchisq") is used instead.
For "exact.dqp", the exact functional test will always be performed.
For 2-by-2 contingency tables, the asymptotic test options (method="fchisq" or "nfchisq") are recommended to test functional dependency, instead of the exact functional test.
When method="simulate.p.value", a simulated null distribution is used to calculate the p-value. The null distribution is a multinomial distribution that is the product of two marginal distributions. Like other Monte Carlo based methods, this method is slower but may be more accurate than other methods based on asymptotic distributions.
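The simulation idea can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper fchisq_stat is hypothetical, the statistic formula is an assumption based on the cited papers, the null cell probabilities are taken as the product of the observed row and column marginal proportions, and the "+1" p-value convention is a common Monte Carlo choice that the package may or may not use.

```r
# Hypothetical helper (not from FunChisq): functional chi-squared statistic.
# Empty rows are dropped so simulated tables with a zero row sum do not divide by zero.
fchisq_stat <- function(x) {
  n <- sum(x)
  C <- ncol(x)
  x <- x[rowSums(x) > 0, , drop = FALSE]
  row_exp <- rowSums(x) / C
  sum((x - row_exp)^2 / row_exp) - sum((colSums(x) - n / C)^2 / (n / C))
}

set.seed(1)   # for reproducibility of this sketch only
x <- matrix(c(20,0,20,0,20,0,5,0,5), 3)
n <- sum(x)
nruns <- 2000

# Null cell probabilities: product of the observed row and column marginals
p_null <- outer(rowSums(x), colSums(x)) / n^2

# Each column of sims is one simulated table, stored column-major
sims <- rmultinom(nruns, size = n, prob = as.vector(p_null))
null_stats <- apply(sims, 2, function(v) fchisq_stat(matrix(v, nrow = nrow(x))))

# Simulated p-value; the +1 terms keep it strictly positive (assumed convention)
p_sim <- (1 + sum(null_stats >= fchisq_stat(x))) / (1 + nruns)
print(p_sim)
```

Increasing nruns lowers the smallest p-value the simulation can report, at the cost of run time.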
index.kind specifies the kind of function index to be computed. If the experimental design controls neither the row nor column marginal sums, index.kind = "unconditional" is recommended; if the column marginal sums are controlled, index.kind = "conditional" is recommended. The conditional function index is the square root of Goodman-Kruskal's tau \insertCite{goodman1954measures}{FunChisq}. The choice of index.kind affects only the function index xi.f value, but not the test statistic or p-value.
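To make the link to Goodman-Kruskal's tau concrete, the base-R sketch below computes tau for predicting the column variable from the row variable and takes its square root. The helper gk_tau is hypothetical and not part of FunChisq; whether its square root matches the estimate returned by fun.chisq.test with index.kind = "conditional" digit for digit is an assumption that follows from the statement above, not something verified against the package here.

```r
# Hypothetical helper (not from FunChisq): Goodman-Kruskal tau for predicting
# the column variable from the row variable of a contingency table.
gk_tau <- function(x) {
  n <- sum(x)
  rs <- rowSums(x)
  keep <- rs > 0                                           # ignore empty rows
  within <- sum(x[keep, , drop = FALSE]^2 / rs[keep]) / n  # sum_ij n_ij^2 / (n_i. * n)
  marginal <- sum((colSums(x) / n)^2)                      # sum_j (n_.j / n)^2
  (within - marginal) / (1 - marginal)
}

x <- matrix(c(20,0,20,0,20,0,5,0,5), 3)
tau_xy <- gk_tau(x)      # column variable predicted from row variable
tau_yx <- gk_tau(t(x))   # roles of rows and columns swapped

# Square roots, i.e. the conditional function index under the stated assumption
print(sqrt(c(rows_to_cols = tau_xy, cols_to_rows = tau_yx)))
```

The two values differ, which reflects the asymmetry of the index under transposition of the table.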
A list with class "htest" containing the following components:
statistic |
the functional chi-squared statistic if |
parameter |
degrees of freedom for the functional chi-squared statistic. |
p.value |
p-value of the functional test. If log.p is TRUE, the logarithm of the p-value is returned. |
estimate |
an estimate of function index between 0 and 1. The value of 1 indicates a strictly mathematical function. It is asymmetrical with respect to transpose of the input contingency table, different from the symmetrical Cramer's V based on the Pearson's chi-squared test statistic. See \insertCite{Zhong2019FANTOM5,KumarZSLS18}{FunChisq} for the definition of function index. |
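A short usage sketch of reading these components from the returned object follows. It assumes the FunChisq package is installed and simply skips the call otherwise; only the component names listed above are used.

```r
# Assumes the FunChisq package is installed; the block is skipped otherwise.
if (requireNamespace("FunChisq", quietly = TRUE)) {
  x <- matrix(c(20,0,20,0,20,0,5,0,5), 3)
  res <- FunChisq::fun.chisq.test(x)

  print(class(res))      # class of the returned object
  print(res$statistic)   # functional chi-squared statistic
  print(res$parameter)   # degrees of freedom
  print(res$p.value)     # p-value of the functional test
  print(res$estimate)    # function index
}
```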
Yang Zhang, Hua Zhong, Hien Nguyen, Sajal Kumar, and Joe Song
For data discretization, an option is optimal univariate clustering via package Ckmeans.1d.dp. A second option is joint multivariate discretization via package GridOnClusters.
For symmetrical dependency tests on discrete data, see Pearson's chi-squared test chisq.test, Fisher's exact test fisher.test, and mutual information methods in package entropy.
# Example 1. Asymptotic functional chi-squared test
x <- matrix(c(20,0,20,0,20,0,5,0,5), 3)
fun.chisq.test(x) # strong functional dependency
fun.chisq.test(t(x)) # weak functional dependency
# Example 2. Normalized functional chi-squared test
x <- matrix(c(8,0,8,0,8,0,2,0,2), 3)
fun.chisq.test(x, method="nfchisq") # strong functional dependency
fun.chisq.test(t(x), method="nfchisq") # weak functional dependency
# Example 3. Exact functional chi-squared test
x <- matrix(c(4,0,4,0,4,0,1,0,1), 3)
fun.chisq.test(x, method="exact") # strong functional dependency
fun.chisq.test(t(x), method="exact") # weak functional dependency
# Example 4. Exact functional chi-squared test on a real data set
# (Shen et al., 2002)
# x is a contingency table with row variable for p53 mutation and
# column variable for CIMP
x <- matrix(c(12,26,18,0,8,12), nrow=2, ncol=3, byrow=TRUE)
# Test the functional dependency: p53 mutation -> CIMP
fun.chisq.test(x, method="exact")
# Test the functional dependency: CIMP -> p53 mutation
fun.chisq.test(t(x), method="exact")
# Example 5. Adapted functional chi-squared test
x <- matrix(c(20, 0, 1, 0, 1, 20, 3, 2, 15, 2, 5, 2), 3, 4, byrow=TRUE)
fun.chisq.test(x, method="adapted") # strong functional dependency
fun.chisq.test(t(x), method="adapted") # weak functional dependency
# Example 6. Asymptotic functional chi-squared test with simulated distribution
x <- matrix(c(20,0,20,0,20,0,5,0,5), 3)
fun.chisq.test(x, method="simulate.p.value")
fun.chisq.test(x, method="simulate.p.value", simulate.nruns = 1000)