Performs Pearson's chi-squared test, G-test, or Williams' corrected G-test to determine dependence between two nucleotide positions.
dinucleotideFrequencyTest(x, i, j, test = c("chisq", "G", "adjG"),
                          simulate.p.value = FALSE, B = 2000)
x: A DNAStringSet or RNAStringSet object.
i, j: Single integer values for positions to test for dependence.
test: One of "chisq", "G", or "adjG".
simulate.p.value: A logical indicating whether to compute p-values by Monte Carlo simulation.
B: An integer specifying the number of replicates used in the Monte Carlo test.
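As a brief illustration of the simulate.p.value and B arguments, the sketch below requests a Monte Carlo p-value instead of the chi-squared approximation. It assumes the Biostrings package and its HNF4alpha data set are installed; the seed and the replicate count of 5000 are arbitrary choices made here for illustration, not recommendations from the package.

```r
library(Biostrings)
data(HNF4alpha)

# Fix the random seed (arbitrary value) so the simulated p-value is reproducible.
set.seed(1)

# Monte Carlo p-value for dependence between positions 1 and 2,
# using more replicates than the default B = 2000.
mc <- dinucleotideFrequencyTest(HNF4alpha, 1, 2,
                                simulate.p.value = TRUE, B = 5000)
mc$p.value
```

A larger B reduces the simulation noise in the p-value at the cost of run time.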
The null and alternative hypotheses for this function are:
H0: positions i and j are independent
H1: otherwise
Let O and E be the observed and expected probabilities for base pair combinations at positions i and j, respectively. Then the test statistics are calculated as:
test="chisq": stat = sum(abs(O - E)^2/E)
test="G": stat = 2 * sum(O * log(O/E))
test="adjG": stat = 2 * sum(O * log(O/E))/q, where q = 1 + ((df - 1)^2 - 1)/(6*length(x)*(df - 2))
Under the null hypothesis, these test statistics are approximately distributed chi-squared(df = ((distinct bases at i) - 1) * ((distinct bases at j) - 1)).
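The "chisq" and "G" statistics can be reproduced by hand in base R. The sketch below is not the package's implementation: it uses a made-up 3 x 3 table of dinucleotide counts and assumes O and E are counts rather than probabilities, with E derived from the row and column margins under independence.

```r
# Hypothetical counts: rows are the base at position i, columns the base at j.
O <- matrix(c(12,  5,  3,
               4, 10,  6,
               2,  7, 11),
            nrow = 3, byrow = TRUE,
            dimnames = list(i = c("A", "C", "G"), j = c("A", "C", "G")))

# Expected counts under independence: product of the margins over the total.
E <- outer(rowSums(O), colSums(O)) / sum(O)

# Degrees of freedom: (distinct bases at i - 1) * (distinct bases at j - 1).
df <- (nrow(O) - 1) * (ncol(O) - 1)

# Pearson chi-squared and G statistics, following the formulas above.
chisq_stat <- sum((O - E)^2 / E)
g_stat <- 2 * sum(O * log(O / E))

# Upper-tail p-values from the chi-squared approximation.
p_chisq <- pchisq(chisq_stat, df = df, lower.tail = FALSE)
p_g <- pchisq(g_stat, df = df, lower.tail = FALSE)
```

Every cell in this example table is non-zero, so log(O/E) is always defined; a table with empty cells would need those terms handled separately.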
An htest object. See help(chisq.test) for more details.
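Because the return value is an htest object, its components can be read with the usual list accessors. The sketch below assumes the Biostrings package and its HNF4alpha data set are available; the component names are those of standard htest objects as described in help(chisq.test).

```r
library(Biostrings)
data(HNF4alpha)

res <- dinucleotideFrequencyTest(HNF4alpha, 1, 2, test = "G")

res$statistic  # test statistic
res$parameter  # degrees of freedom
res$p.value    # p-value
res$method     # description of the test performed
```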
P. Aboyoun
Ellrott, K., Yang, C., Sladek, F.M., Jiang, T. (2002) "Identifying transcription factor binding sites through Markov chain optimization", Bioinformatics, 18 (Suppl. 2), S100-S109.
Sokal, R.R., Rohlf, F.J. (2003) "Biometry: The Principles and Practice of Statistics in Biological Research", W.H. Freeman and Company, New York.
Tomovic, A., Oakeley, E. (2007) "Position dependencies in transcription factor binding sites", Bioinformatics, 23, 933-941.
Williams, D.A. (1976) "Improved Likelihood ratio tests for complete contingency tables", Biometrika, 63, 33-37.
nucleotideFrequencyAt, XStringSet-class, chisq.test
data(HNF4alpha)
dinucleotideFrequencyTest(HNF4alpha, 1, 2)
dinucleotideFrequencyTest(HNF4alpha, 1, 2, test = "G")
dinucleotideFrequencyTest(HNF4alpha, 1, 2, test = "adjG")
Pearson's Chi-squared test of independence
data: nucleotideFrequencyAt(HNF4alpha, c(1, 2))
X-squared = 19.073, df = 9, p-value = 0.02458
Log likelihood ratio (G-test) test of independence without correction
data: nucleotideFrequencyAt(HNF4alpha, c(1, 2))
Log likelihood ratio statistic (G) = 17.261, X-squared df = 9, p-value
= 0.04478
Log likelihood ratio (G-test) test of independence with Williams'
correction
data: nucleotideFrequencyAt(HNF4alpha, c(1, 2))
Log likelihood ratio statistic (G) = 10.806, X-squared df = 9, p-value
= 0.2892