Performs Pearson's chi-squared test, G-test, or Williams' corrected G-test to determine dependence between two nucleotide positions.
A DNAStringSet or RNAStringSet object.
Single integer values giving the two positions to test for dependence.
A logical indicating whether to compute p-values by Monte Carlo simulation.
An integer specifying the number of replicates used in the Monte Carlo test.
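As a rough illustration of what a Monte Carlo p-value with a fixed number of replicates looks like, the sketch below uses base R's chisq.test, which accepts simulate.p.value and B arguments of the same kind. Whether the function documented here delegates to chisq.test in this way is an assumption, and the count table is made up for illustration.

```r
# Hypothetical 4 x 4 table of base-pair counts at two positions
# (rows: base at the first position, columns: base at the second).
bases <- c("A", "C", "G", "T")
O <- matrix(c(12, 3,  5, 2,
               4, 9,  2, 6,
               3, 2, 11, 4,
               1, 5,  3, 8),
            nrow = 4, byrow = TRUE,
            dimnames = list(pos_i = bases, pos_j = bases))

set.seed(1)  # make the simulated p-value reproducible

# Monte Carlo p-value based on B simulated tables with the same margins
mc <- chisq.test(O, simulate.p.value = TRUE, B = 2000)
mc$p.value
```

A larger B gives a finer-grained p-value at the cost of more computation; the smallest value the simulation can report is 1/(B + 1).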
The null and alternative hypotheses for this function are:
H0: positions i and j are independent
H1: positions i and j are not independent
Let O and E be the observed and expected counts for base pair combinations at positions i and j respectively. Then the test statistics are calculated as:
Pearson's chi-squared test: stat = sum(abs(O - E)^2/E)
G-test: stat = 2 * sum(O * log(O/E))
Williams' corrected G-test: stat = 2 * sum(O * log(O/E))/q, where q = 1 + ((df - 1)^2 - 1)/(6*length(x)*(df - 2))
Under the null hypothesis, these test statistics are approximately distributed chi-squared(df = ((distinct bases at i) - 1) * ((distinct bases at j) - 1)).
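The three statistics can be worked through by hand in base R. The following is a minimal sketch, not the package's implementation: it assumes O holds counts, takes E from the product of the row and column totals, uses the total count n in place of length(x), and applies the correction factor q exactly as written above. The count table is hypothetical.

```r
# Hypothetical 4 x 4 table of observed base-pair counts at positions i and j
bases <- c("A", "C", "G", "T")
O <- matrix(c(12, 3,  5, 2,
               4, 9,  2, 6,
               3, 2, 11, 4,
               1, 5,  3, 8),
            nrow = 4, byrow = TRUE,
            dimnames = list(pos_i = bases, pos_j = bases))

n  <- sum(O)                                 # number of sequences (assumed equal to length(x))
E  <- outer(rowSums(O), colSums(O)) / n      # expected counts under independence
df <- (nrow(O) - 1) * (ncol(O) - 1)          # (distinct bases at i - 1) * (distinct bases at j - 1)

# Pearson's chi-squared statistic
chisq_stat <- sum(abs(O - E)^2 / E)

# G statistic; cells with zero observed count contribute nothing to the sum
nz <- O > 0
G_stat <- 2 * sum(O[nz] * log(O[nz] / E[nz]))

# Corrected G statistic, using q as given in the text above
q <- 1 + ((df - 1)^2 - 1) / (6 * n * (df - 2))
adjG_stat <- G_stat / q

# Upper-tail chi-squared p-values for each statistic
stats <- c(chisq = chisq_stat, G = G_stat, adjG = adjG_stat)
p_values <- pchisq(stats, df = df, lower.tail = FALSE)
```

Because q is greater than 1 here, the corrected statistic is always smaller than the uncorrected G, so its p-value is larger. Note that q as written is undefined when df = 2.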
An htest object. See help(chisq.test) for more details.
Ellrott, K., Yang, C., Sladek, F.M., Jiang, T. (2002) "Identifying transcription factor binding sites through Markov chain optimization", Bioinformatics, 18 (Suppl. 2), S100-S109.
Sokal, R.R., Rohlf, F.J. (2003) "Biometry: The Principles and Practice of Statistics in Biological Research", W.H. Freeman and Company, New York.
Tomovic, A., Oakeley, E. (2007) "Position dependencies in transcription factor binding sites", Bioinformatics, 23, 933-941.
Williams, D.A. (1976) "Improved likelihood ratio tests for complete contingency tables", Biometrika, 63, 33-37.
Pearson's Chi-squared test of independence
data: nucleotideFrequencyAt(HNF4alpha, c(1, 2))
X-squared = 19.073, df = 9, p-value = 0.02458

Log likelihood ratio (G-test) test of independence without correction
data: nucleotideFrequencyAt(HNF4alpha, c(1, 2))
Log likelihood ratio statistic (G) = 17.261, X-squared df = 9, p-value = 0.04478

Log likelihood ratio (G-test) test of independence with Williams' correction
data: nucleotideFrequencyAt(HNF4alpha, c(1, 2))
Log likelihood ratio statistic (G) = 10.806, X-squared df = 9, p-value = 0.2892