bootstrap2x2: bootstrap2x2

View source: R/bootstrap2x2.R

bootstrap2x2    R Documentation

bootstrap2x2

Description

Parametric bootstrap test of independence for a 2x2 contingency table. The goodness-of-fit statistic is either the root-mean-square statistic (RMST) or the Hellinger divergence (HD), as proposed by Perkins et al. [1, 2]. HD is computed as proposed in [3].
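As a rough illustration of the two statistics, the base R sketch below compares the observed cell proportions of a 2x2 table with the proportions fitted under independence. The helper names `indep_fit`, `rms_stat` and `hellinger_stat` are hypothetical and not part of MethylIT.utils. The Hellinger distance is written in its textbook squared form, which is an assumption; the scaling used in [3] and in the package may differ.

```r
# Hypothetical helpers for illustration only; not the MethylIT.utils code.

# Cell probabilities fitted under independence: product of the margins.
indep_fit <- function(x) {
  outer(rowSums(x), colSums(x)) / sum(x)^2
}

# Root-mean-square distance between observed and fitted proportions.
rms_stat <- function(x) {
  q <- x / sum(x)
  sqrt(mean((q - indep_fit(x))^2))
}

# Squared Hellinger distance between observed and fitted proportions
# (textbook form; an assumption, the scaling in [3] may differ).
hellinger_stat <- function(x) {
  q <- x / sum(x)
  sum((sqrt(q) - sqrt(indep_fit(x)))^2) / 2
}

x <- matrix(c(8, 350, 2, 20), nrow = 2)
rms_stat(x)
hellinger_stat(x)
```

Both helpers return zero (up to rounding) for a table whose cells are exactly the product of its margins, and grow as the table departs from independence.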

Usage

bootstrap2x2(x, stat = "rmst", num.permut = 100)

Arguments

x

A numeric 2x2 matrix holding the cross-tabulation (contingency table).

stat

Statistic used for testing: 'rmst', 'hdiv', or 'all'. Default is 'rmst'.

num.permut

Number of bootstrap simulations. Default is 100.

Details

For goodness-of-fit, the null hypothesis tested is H_\theta: p = p(\theta). A single simulation consists of the following three-step procedure [1,2]:

  1. Generate m i.i.d. draws according to the model distribution p(\theta'), where \theta' is the estimate calculated from the experimental data.

  2. Estimate the parameter \theta from the data generated in Step 1, obtaining a new estimate \thetaest.

  3. Calculate the statistic under consideration (HD, RMST) using the data generated in Step 1 and taking the model distribution to be p(\thetaest), where \thetaest is the estimate calculated in Step 2.

After conducting many such simulations, the confidence level for rejecting the null hypothesis is the fraction of the statistics calculated in step 3 that are less than the statistic calculated from the empirical data. The significance level \alpha is the same as a confidence level of 1-\alpha.
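The simulation loop can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package implementation: the function name `boot_indep_2x2` and its argument `num.sim` are hypothetical, only a root-mean-square statistic is used, and the p-value is taken to be one minus the confidence level, following the relation between \alpha and 1-\alpha stated above.

```r
# Hypothetical sketch of the three-step simulation; not the MethylIT.utils code.
boot_indep_2x2 <- function(x, num.sim = 1000) {
  m <- sum(x)
  # Model fitted under independence, estimated from a table of counts.
  fit <- function(tab) outer(rowSums(tab), colSums(tab)) / sum(tab)^2
  # Root-mean-square distance between observed proportions and the fit.
  rms <- function(tab) sqrt(mean((tab / sum(tab) - fit(tab))^2))

  observed <- rms(x)
  p_hat <- fit(x)
  simulated <- replicate(num.sim, {
    # Step 1: m i.i.d. draws from the model fitted to the experimental data.
    tab <- matrix(rmultinom(1, size = m, prob = as.vector(p_hat)), nrow = 2)
    # Steps 2 and 3: rms() re-estimates the model from the draws and
    # recomputes the statistic against that new estimate.
    rms(tab)
  })

  # Fraction of simulated statistics below the observed one.
  confidence <- mean(simulated < observed)
  c(statistic = observed, confidence = confidence, p.value = 1 - confidence)
}

set.seed(123)
x <- matrix(c(8, 350, 2, 20), nrow = 2)
boot_indep_2x2(x, num.sim = 500)
```

A larger `num.sim` reduces the Monte Carlo error of the estimated confidence level at the cost of run time.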

Value

The p-value of the test.

References

  1. Perkins, W., Tygert, M. & Ward, R. Chi^2 and classical exact tests often wildly misreport significance; the remedy lies in computers. arXiv:1108.4126v2 (2011).

  2. Perkins, W., Tygert, M. & Ward, R. Computing the confidence levels for a root-mean-square test of goodness-of-fit. Appl. Math. Comput. 217, 9072-9084 (2011).

  3. Basu, A., Mandal, A. & Pardo, L. Hypothesis testing for two discrete populations based on the Hellinger distance. Stat. Probab. Lett. 80, 206-214 (2010).

Examples

    library(MethylIT.utils)

    set.seed(123)
    TeaTasting <- matrix(c(8, 350, 2, 20), nrow = 2,
                         dimnames = list(Guess = c('Milk', 'Tea'),
                                         Truth = c('Milk', 'Tea')))
    ## Small num.permut to keep the example fast
    bootstrap2x2(TeaTasting, stat = 'all', num.permut = 100)

genomaths/MethylIT.utils documentation built on July 4, 2023, 12:05 a.m.