tableBoots | R Documentation |

Parametric Bootstrap of `n x m`

Contingency independence
test. It is assumed that the provided contingency table summarizes the
results where two or more groups are compared and the outcome is a
categorical variable, i.e., disease vs. no disease, pass vs. fail, treated
patient vs. control group, etc.

The goodness of fit statistic is the root-mean-square statistic (RMST) or Hellinger divergence, as proposed by Perkins et al. [1, 2]. Hellinger divergence (HD) is computed as proposed in [3].

```
tableBoots(
x,
stat = c("rmst", "hd", "chisq", "all"),
out.stat = FALSE,
num.permut = 100
)
```

`x` |
A numerical matrix corresponding to cross tabulation |

`stat` |
Statistic to be used in the testing: 'rmst','hdiv', or 'all'. |

`out.stat` |
logical(1). Whether to return the values of the statistics used: the bootstrap mean and the original value estimated. |

`num.permut` |
Number of permutations. |

For goodness-of-fit the following null hypothesis is tested
`H_\theta: p = p(\theta)`

To conduct a single simulation, we perform the following three-step
procedure [1,2]:

To generate m i.i.d. draws according to the model distribution

`p(\theta)`

, where`\theta'`

is the estimate calculated from the experimental data,To estimate the parameter

`\theta`

from the data generated in Step 1, obtaining a new estimate`\theta`

est.To calculate the statistic under consideration (HD, RMST), using the data generated in Step 1 and taking the model distribution to be

`\theta`

est, where`\theta`

est is the estimate calculated in Step 2 from the data generated in Step 1.

After conducting many such simulations, the confidence level for
rejecting the null hypothesis is the fraction of the statistics
calculated in step 3 that are less than the statistic calculated from
the empirical data. The significance level `\alpha`

is the same as a
confidence level of `1-\alpha`

.

A p-value probability

Robersy Sanchez (https://genomaths.com).

Perkins W, Tygert M, Ward R. Chi^2 and Classical Exact Tests Often Wildly Misreport Significance; the Remedy Lies in Computers [Internet]. Uploaded to ArXiv. 2011. Report No.: arXiv:1108.4126v2.

Perkins, W., Tygert, M. & Ward, R. Computing the confidence levels or a root-mean square test of goodness-of-fit. 217, 9072-9084 (2011).

Basu, A., Mandal, A. & Pardo, L. Hypothesis testing for two discrete populations based on the Hellinger distance. Stat. Probab. Lett. 80, 206-214 (2010).

```
set.seed(123)
## The numbers of methylated and unmethylted "read-counts" targeting DNA
## cytosine sites are the typical raw data resulting from bisulfate
## sequencing experiments (BiSeq). Methylation analysis requires for the
## application of some statistical tests to identify differentially
## methylated sites/ positions (DMSs or DMPs). BiSeq experiments on humans
## are relatively expensive (still in 2020) and some researcher is willing
## to accept a site coverage (total) of 4 read-counts as threshold for a
## site to be included in the analysis. Let's suppose that at a given site
## we have the following contingency table:
site_res <- matrix(c(3, 1, 1, 3),
nrow = 2,
dimnames = list(
status = c("ctrl", "treat"),
treat = c("meth", "unmeth")
)
)
site_res
## The application of classical test variants yield the following results
fisher.test(site_res)$p.value
fisher.test(site_res, alternative = "greater")$p.value
fisher.test(site_res, simulate.p.value = TRUE)$p.value
chisq.test(site_res)$p.value
chisq.test(site_res, simulate.p.value = TRUE)$p.value
## That is, these test did not detect differences between the read-counts
## for control and the treated patients. 'tableBoots' function suggests
## that the site is borderline.
tableBoots(site_res, stat = "all", num.permut = 1e3)
## A slight change in the read counts identify a meaningful difference
## not detected by the classicla tests
site_res <- matrix(c(3, 0, 1, 4),
nrow = 2,
dimnames = list(
status = c("ctrl", "treat"),
treat = c("meth", "unmeth")
)
)
site_res
set.seed(23)
tableBoots(site_res, stat = "all", num.permut = 1e3)
fisher.test(site_res)$p.value
fisher.test(site_res, alternative = "greater")$p.value
fisher.test(site_res, simulate.p.value = TRUE)$p.value
chisq.test(site_res)$p.value
chisq.test(site_res, simulate.p.value = TRUE)$p.value
## Now, let's suppose that we want to test whether methylation status from
## two DNA sites the same differentially methylated region (DMR)
## are independent, i.e. whether the treatment affected the methylation
## status of the two sites. In this case, the counts grouped into four
## categories.
set.seed(1)
site_res <- matrix(c(3, 1, 1, 3, 3, 0, 1, 4),
nrow = 2, byrow = FALSE,
dimnames = list(
status = c("ctrl", "treat"),
treat = c(
"meth.1", "unmeth.1",
"meth.2", "unmeth.2"
)
)
)
site_res
## Chi-squared from the R package 'stats'
chisq.test(site_res)$p.bvalue
chisq.test(site_res, simulate.p.value = TRUE, B = 2e3)$p.value
tableBoots(site_res, stat = "all", num.permut = 999)
## Results above are in border. If we include, third site,
## then sinces would different.
site_res <- matrix(c(3, 1, 1, 3, 3, 0, 1, 4, 4, 0, 1, 4),
nrow = 2, byrow = FALSE,
dimnames = list(
status = c("ctrl", "treat"),
treat = c(
"meth.1", "unmeth.1",
"meth.2", "unmeth.2",
"meth.3", "unmeth.3"
)
)
)
site_res
## That is, we have not reason to believe that the observed methylation
## levels in the treatment are not independent
chisq.test(site_res)$p.value
chisq.test(site_res, simulate.p.value = TRUE, B = 2e3)$p.value
tableBoots(site_res, stat = "all", num.permut = 999)
```

genomaths/usefr documentation built on April 18, 2023, 3:35 a.m.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.