knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
In detecting discrete functional patterns model free, the adapted functional chi-squared test (AdpFunChisq) offers a single test statistic to quantify functional strength and direction between two discrete random variables. It promotes functional patterns over non-functional or independent patterns. Extended from the functional chi-squared (FunChisq) test, AdpFunChisq is preferable when prioritizing multiple relationships that are contained in contingency tables of various dimensions.
Here, we illustrate how AdpFunChisq is advantageous over the original FunChisq or conditional entropy in determining functional direction and strength.
If a function is a many-to-one mapping, then its inverse is not a function. In this context, the three methods can behave sharply differently depending on table shape: square (same numbers of rows and columns), landscape (more columns than rows), or portrait (more rows than columns). Here the row variable is the independent variable $X$ and the column variable is the dependent variable $Y$. Each method tests whether $Y$ is a function of $X$ on the original table; then it tests whether $X$ is a function of $Y$ on the transposed table. We use $p$-values for the two FunChisq methods and the conditional entropy $H(Y|X)$. In all three methods, a smaller value indicates a stronger functional direction.
We will show by three examples that only AdpFunChisq performs consistently well on all three table types. Below is a generic function compare.methods
to visualize the input contingency table, perform the tests, and report the correctness of the function direction.
require(FunChisq) require(infotheo) require(DescTools) require(dqrng) # set seeds set.seed(123) dqset.seed(123) # simulate a noisy `many.to.one' function (Y = f(X) and X != f(Y)) tab = simulate_tables(n=1000, nrow=3, ncol=4, type="many.to.one", noise=0.3)$noise.list[[1]] # calculate correct and incorrect direction scores for AdpFunChisq # The P-value of AdpFunChisq is lower the better. correct_afc = fun.chisq.test(tab, method="adapted") incorrect_afc = fun.chisq.test(t(tab), method="adapted") # calculate correct and incorrect direction scores for Conditional Entropy # The score of Conditional Entropy is lower the better. vars = Untable(tab) correct_ce = condentropy( as.numeric(vars[,2]), as.numeric(vars[,1]) ) incorrect_ce = condentropy( as.numeric(vars[,1]), as.numeric(vars[,2]) ) # plot table with AdpFunChisq and Conditional Entropy Scores # correct direction plot_table(tab, xlab = 'Y', ylab = 'X', value.cex = 0.9, cex.main=0.9, col="seagreen", main=paste0("AdpFunChisq P = ",format(correct_afc$p.value, digits = 2), "\nConditional Entropy = ",format(correct_ce, digits = 2))) # incorrect direction plot_table(t(tab), xlab = 'X', ylab = 'Y', value.cex = 0.9, cex.main=0.9, col="firebrick1", main=paste0("AdpFunChisq P = ",format(incorrect_afc$p.value, digits = 2), "\nConditional Entropy = ",format(incorrect_ce, digits = 2)))
require(FunChisq) require(infotheo) require(DescTools) compare.methods <- function(func) { plot_table(func, xlab = 'Y', ylab = 'X', value.cex = 0.9, col="seagreen", main = expression("Function:"~italic(Y%~~%f(X)))) plot_table(t(func), xlab = 'X', ylab = 'Y', value.cex = 0.9, col="firebrick1", main = expression("Inverse non-function:"~ italic(X != f^-1*(Y)))) sample.matrix <- Untable(func) x <- as.numeric(sample.matrix[, 1]) y <- as.numeric(sample.matrix[, 2]) H.y.given.x <- condentropy( y, x ) H.x.given.y <- condentropy( x, y ) if(H.y.given.x < H.x.given.y) { cat("Conditional entropy picked CORRECT direction X to Y: H(Y|X) =", format(H.y.given.x, digits = 2), "< H(X|Y) =", format(H.x.given.y, digits = 2), "\n") } else { cat("Conditional entropy picked *WRONG* direction Y to X: H(Y|X) =", format(H.y.given.x, digits = 2), ">= H(X|Y) =", format(H.x.given.y, digits = 2), "\n") } func.fc.pval <- fun.chisq.test(func)$p.value ifunc.fc.pval <- fun.chisq.test(t(func))$p.value if(func.fc.pval < ifunc.fc.pval) { cat("Original FunChisq picked CORRECT direction X to Y: p(X->Y) =", format(func.fc.pval, digits = 2), "< p(Y->X) =", format(ifunc.fc.pval, digits = 2), "\n") } else { cat("Original FunChisq picked *WRONG* direction Y to X: p(X->Y) =", format(func.fc.pval, digits = 2), ">= p(Y->X) =", format(ifunc.fc.pval, digits = 2), "\n") } func.afc.pval <- fun.chisq.test(func, method="adapted")$p.value ifunc.afc.pval <- fun.chisq.test(t(func), method="adapted")$p.value if(func.afc.pval < ifunc.afc.pval) { cat("Adapted FunChisq picked CORRECT direction X to Y: p(X->Y) =", format(func.afc.pval, digits = 2), "< p(Y->X) =", format(ifunc.afc.pval, digits = 2), "\n") } else { cat("Adapted FunChisq picked *WRONG* direction Y to X: p(X->Y) =", format(func.afc.pval, digits = 2), ">= p(Y->X) =", format(ifunc.afc.pval, digits = 2), "\n") } }
In this example, all three methods performed correctly.
func <- matrix(c( 1, 5, 1, 1, 5, 1, 6, 1, 1 ), nrow=3, byrow=TRUE) compare.methods(func)
In this example, conditional entropy failed to detect the correct functional direction.
func <- matrix(c( 1, 1, 6, 1, 1, 1, 6, 1, 1, 6, 1, 1 ), nrow=3, byrow=TRUE) compare.methods(func)
In this example, the original FunChisq could not detect the correct functional direction.
func <- matrix(c( 1, 5, 1, 6, 4, 1 ), nrow=3, byrow=TRUE) compare.methods(func)
The $p$-value of the AdpFunChisq is also a good indicator of the strength of a functional relationship $Y=f(X)$ over an independent relationship where $X \perp Y$.
In this example, conditional entropy failed to prioritize a functional pattern over an empirically independent pattern which is also a constant function.
# approximately functional pattern func = matrix(c( 8, 0, 1, 0, 1, 8, 1, 1, 8, 1, 1, 0 ), nrow=3, ncol=4, byrow=TRUE) # empirically independent (constant) pattern const = matrix(c( 9, 0, 0, 0, 11, 0, 0, 0, 10, 0, 0, 0 ), nrow=3, ncol=4, byrow=TRUE) # calculate the AdpFunChisq P-value of 'func' and 'const' tables, lower the better. func_afc = fun.chisq.test(func, method="adapted") const_afc = fun.chisq.test(const, method="adapted") # calculate the Conditional Entropy scores of 'func' and 'const' tables, lower the better. sample.matrix = Untable(func) x <- as.numeric(sample.matrix[,1]) y <- as.numeric(sample.matrix[,2]) func_H = condentropy(y, x) sample.matrix = Untable(const) x <- as.numeric(sample.matrix[,1]) y <- as.numeric(sample.matrix[,2]) const_H = condentropy(y, x) # plot table with AdpFunChisq and Conditional Entropy Scores # functional pattern plot_table( func, xlab = 'Y', ylab = 'X', value.cex = 0.9, cex.main=0.9, col="seagreen", main=paste0("AdpFunChisq P = ", format(func_afc$p.value, digits = 2), "\nConditional Entropy = ", format(func_H, digits = 2))) # constant pattern plot_table( const, xlab = 'Y', ylab = 'X', value.cex = 0.9, cex.main=0.9, col="firebrick1", main=paste0("AdpFunChisq P = ", format(const_afc$p.value, digits = 2), "\nConditional Entropy = ", format(const_H, digits = 2)))
Here, AdpFunChisq correctly rated the functional pattern ($p$-value=r func_afc$p.value
) stronger than the independent pattern ($p$-value=r const_afc$p.value
); while conditional entropy incorrectly rated the former more poorly (r func_H
) than the latter (r const_H
).
In functional inference model free, we recommend the adapted FunChisq test when tables are of different shape or dimension; otherwise, the original FunChisq test is adequate. We generally do not recommend conditional entropy for inferring non-constant function patterns.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.