dinucleotideFrequencyTest: Pearson's chi-squared Test and G-tests for String Position...
In Biostrings: Efficient manipulation of biological strings

Performs Pearson's chi-squared test, G-test, or Williams' corrected G-test to determine dependence between two nucleotide positions.

dinucleotideFrequencyTest(x, i, j, test = c("chisq", "G", "adjG"), simulate.p.value = FALSE, B = 2000)

`x`	A DNAStringSet or RNAStringSet object.
`i, j`	Single integer values for positions to test for dependence.
`test`	One of `"chisq"` (Pearson's chi-squared test), `"G"` (G-test), or `"adjG"` (Williams' corrected G-test). See Details section.
`simulate.p.value`	A logical indicating whether to compute p-values by Monte Carlo simulation.
`B`	An integer specifying the number of replicates used in the Monte Carlo test.
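As a rough illustration of what `simulate.p.value` and `B` control, the sketch below runs a Monte Carlo chi-squared test with base R's `chisq.test` on a small contingency table. It assumes these two arguments carry the same meaning here as they do in `chisq.test`; the count table is hypothetical and is not taken from any Biostrings data set.

```r
# Hypothetical 4 x 4 table of dinucleotide counts (made up for illustration):
# rows are the base at position i, columns the base at position j.
observed <- matrix(c(12, 5,  7,  6,
                      4, 9,  3,  8,
                      6, 2, 11,  5,
                      3, 7,  4, 10),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(i = c("A", "C", "G", "T"),
                                   j = c("A", "C", "G", "T")))

set.seed(1)  # make the simulated p-value reproducible

# B is the number of simulated tables used to estimate the p-value
res <- chisq.test(observed, simulate.p.value = TRUE, B = 2000)
res$p.value
```

A larger `B` gives a less noisy p-value estimate at the cost of more computation.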

The null and alternative hypotheses for this function are:

H0: positions i and j are independent
H1: otherwise

Let O and E be, respectively, the observed and expected probabilities for the base pair combinations at positions i and j. Then the test statistics are calculated as:

test="chisq": stat = sum(abs(O - E)^2/E)
test="G": stat = 2 * sum(O * log(O/E))
test="adjG": stat = 2 * sum(O * log(O/E))/q, where q = 1 + ((df - 1)^2 - 1)/(6*length(x)*(df - 2))

Under the null hypothesis, these test statistics approximately follow a chi-squared distribution with df = ((distinct bases at i) - 1) * ((distinct bases at j) - 1).
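The formulas above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: O and E are taken as observed and expected counts (as `chisq.test` does), the count table is hypothetical, the table total `n` stands in for `length(x)`, and `q` is transcribed directly from the formula given above.

```r
# Hypothetical 4 x 4 table of dinucleotide counts (made up for illustration):
# rows are the base at position i, columns the base at position j.
observed <- matrix(c(12, 5,  7,  6,
                      4, 9,  3,  8,
                      6, 2, 11,  5,
                      3, 7,  4, 10),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(i = c("A", "C", "G", "T"),
                                   j = c("A", "C", "G", "T")))

n <- sum(observed)  # one dinucleotide per sequence, so n plays the role of length(x)

# Expected counts under independence: (row total * column total) / n
expected <- outer(rowSums(observed), colSums(observed)) / n

# Degrees of freedom: (distinct bases at i - 1) * (distinct bases at j - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# test = "chisq"
chisq_stat <- sum(abs(observed - expected)^2 / expected)

# test = "G" (every cell is positive here, so log() is well defined)
g_stat <- 2 * sum(observed * log(observed / expected))

# test = "adjG": divide G by the correction factor q as written above
q <- 1 + ((df - 1)^2 - 1) / (6 * n * (df - 2))
adjg_stat <- g_stat / q

# Upper-tail p-values from the chi-squared approximation
p_values <- pchisq(c(chisq = chisq_stat, G = g_stat, adjG = adjg_stat),
                   df = df, lower.tail = FALSE)
p_values
```

Because q is greater than 1 for this table, the corrected statistic is smaller than the uncorrected G and its p-value is larger.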

An htest object. See help(chisq.test) for more details.
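Since the result is an ordinary htest object, its components can be pulled out by name. The sketch below uses base R's `chisq.test` on a hypothetical 2 x 2 table to show the usual fields; it assumes the object returned by `dinucleotideFrequencyTest` exposes the same standard htest components.

```r
# Hypothetical 2 x 2 count table (made up for illustration)
tab <- matrix(c(10, 4,
                 3, 12),
              nrow = 2)

res <- chisq.test(tab, correct = FALSE)

class(res)      # "htest"
res$statistic   # test statistic
res$parameter   # degrees of freedom
res$p.value     # p-value
res$method      # description of the test performed
```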

P. Aboyoun

Ellrott, K., Yang, C., Sladek, F.M., Jiang, T. (2002) "Identifying transcription factor binding sites through Markov chain optimization", Bioinformatics, 18 (Suppl. 2), S100-S109.

Sokal, R.R., Rohlf, F.J. (2003) "Biometry: The Principles and Practice of Statistics in Biological Research", W.H. Freeman and Company, New York.

Tomovic, A., Oakeley, E. (2007) "Position dependencies in transcription factor binding sites", Bioinformatics, 23, 933-941.

Williams, D.A. (1976) "Improved Likelihood ratio tests for complete contingency tables", Biometrika, 63, 33-37.

nucleotideFrequencyAt, XStringSet-class, chisq.test

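  library(Biostrings)
  # Load the HNF4alpha example data set shipped with Biostrings
  data(HNF4alpha)
  # Test positions 1 and 2 for dependence with each of the three tests
  dinucleotideFrequencyTest(HNF4alpha, 1, 2)
  dinucleotideFrequencyTest(HNF4alpha, 1, 2, test = "G")
  dinucleotideFrequencyTest(HNF4alpha, 1, 2, test = "adjG")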

	Pearson's Chi-squared test of independence

data:  nucleotideFrequencyAt(HNF4alpha, c(1, 2))
X-squared = 19.073, df = 9, p-value = 0.02458


	Log likelihood ratio (G-test) test of independence without correction

data:  nucleotideFrequencyAt(HNF4alpha, c(1, 2))
Log likelihood ratio statistic (G) = 17.261, X-squared df = 9, p-value
= 0.04478


	Log likelihood ratio (G-test) test of independence with Williams'
	correction

data:  nucleotideFrequencyAt(HNF4alpha, c(1, 2))
Log likelihood ratio statistic (G) = 10.806, X-squared df = 9, p-value
= 0.2892

Biostrings documentation built on Nov. 8, 2020, 11:12 p.m.
