# MultiFIT: Multiscale Fisher's Independence Test for Multivariate...

In package MultiFit: Multiscale Fisher's Independence Test for Multivariate Dependence

## Description

Performs a multiscale test of independence between two multivariate random vectors. See the package vignettes for further examples.

## Usage

```
MultiFIT(xy, x = NULL, y = NULL, p_star = NULL, R_max = NULL, R_star = 1,
  rank.transform = TRUE, ranking.approximation = FALSE, M = 10,
  apply.stopping.rule = FALSE, alpha = 0.05, test.method = "Fisher",
  correct = TRUE, min.tbl.tot = 25L, min.row.tot = 10L, min.col.tot = 10L,
  p.adjust.methods = c("H", "Hcorrected"), compute.all.holm = TRUE,
  return.all.pvs = TRUE, verbose = FALSE)
```

## Arguments

| Argument | Description |
| --- | --- |
| `xy` | A list whose first element is the matrix `x` and whose second element is the matrix `y`, both as described below. If `xy` is not specified, `x` and `y` must be supplied. |
| `x` | A matrix; number of columns = dimension of the random vector, number of rows = number of observations. |
| `y` | A matrix; number of columns = dimension of the random vector, number of rows = number of observations. |
| `p_star` | Numeric. Cuboids associated with tests whose `p`-value is below `p_star` are halved and tested further. |
| `R_max` | A positive integer (or `Inf`), the maximal number of resolutions to scan. The algorithm stops at a lower resolution if none of the tables in it meet the criteria specified by `min.tbl.tot`, `min.row.tot` and `min.col.tot`. |
| `R_star` | A positive integer. If set to an integer between 0 and `R_max`, all tests up to and including resolution `R_star` are performed (the algorithm stops at a lower resolution than requested if none of the tables in it meet the criteria specified by `min.tbl.tot`, `min.row.tot` and `min.col.tot`). For higher resolutions, only the children of tests with a `p`-value lower than `p_star` are considered. |
| `rank.transform` | Logical. If `TRUE`, a marginal rank transform is performed on all margins of `x` and `y`. If `FALSE`, all margins are scaled to the 0-1 scale. When `FALSE`, the average and top statistics of the negative logarithm of the `p`-values are only computed for the univariate case. |
| `ranking.approximation` | Logical. If `FALSE`, only tests with `p`-values more extreme than `p_star` are selected to be halved and tested further. If `TRUE`, the `M` tests with the most extreme `p`-values at each resolution are halved and tested further. FWER control is not guaranteed when the ranking approximation is used. |
| `M` | A positive integer (or `Inf`), the number of top-ranking tests to continue splitting at each resolution. FWER control is not guaranteed for this method. |
| `apply.stopping.rule` | Logical. If `TRUE`, an adjusted `p`-value is computed for each resolution and compared against `alpha` to decide whether to stop early. |
| `alpha` | Numeric. Threshold below which resolution-specific `p`-values trigger early stopping. |
| `test.method` | String. Choose `"Fisher"` for Fisher's exact test (slowest), `"chi.sq"` for the Chi-squared test, `"LR"` for the likelihood-ratio test, or `"norm.approx"` for approximating the hypergeometric distribution with a normal distribution (fastest). |
| `correct` | Logical. If `TRUE`, compute mid-p corrected `p`-values for Fisher's exact test, Yates corrected `p`-values for the Chi-squared test, or Williams corrected `p`-values for the likelihood-ratio test. |
| `min.tbl.tot` | Non-negative integer, the minimal number of observations per table below which a `p`-value for a given table is not computed. |
| `min.row.tot` | Non-negative integer, the minimal number of observations for row totals in the 2x2 contingency tables below which a contingency table is not tested. |
| `min.col.tot` | Non-negative integer, the minimal number of observations for column totals in the 2x2 contingency tables below which a contingency table is not tested. |
| `p.adjust.methods` | String. Choose between `"H"` for Holm and `"Hcorrected"` for Holm with the correction specified in `correct`. |
| `compute.all.holm` | Logical. If `FALSE`, only the global `p`-value is computed (may be a little faster when many tests are performed). If `TRUE`, adjusted `p`-values are computed for all tests. |
| `return.all.pvs` | Logical. If `TRUE`, a data frame with all `p`-values is returned (not applicable when the stopping rule is applied). |
| `verbose` | Logical. |
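To make the `test.method`, `correct` and `min.*.tot` arguments more concrete, here is a minimal base R sketch of testing a single 2x2 contingency table. It is not MultiFit's internal code: the table counts are made up, the eligibility check is an assumed reading of the three minimum-total arguments, and only Fisher's exact test and the Chi-squared test (with and without the Yates correction) are shown, since base R has no built-in mid-p or Williams correction.

```r
# Made-up 2x2 table of counts for illustration (not MultiFit output).
tbl <- matrix(c(18, 7,
                 9, 16), nrow = 2, byrow = TRUE)

# Assumed reading of the minimum-total arguments: a table is tested only
# if its grand total, every row total and every column total reach the
# corresponding threshold (defaults copied from the Usage section).
min.tbl.tot <- 25L
min.row.tot <- 10L
min.col.tot <- 10L
testable <- sum(tbl) >= min.tbl.tot &&
  all(rowSums(tbl) >= min.row.tot) &&
  all(colSums(tbl) >= min.col.tot)

if (testable) {
  # Fisher's exact test (the analogue of test.method = "Fisher").
  p_fisher <- fisher.test(tbl)$p.value
  # Chi-squared test with and without the Yates continuity correction.
  p_chisq_yates <- chisq.test(tbl, correct = TRUE)$p.value
  p_chisq_raw <- chisq.test(tbl, correct = FALSE)$p.value
  print(c(fisher = p_fisher, chisq_yates = p_chisq_yates, chisq_raw = p_chisq_raw))
}
```

The Yates correction shrinks the test statistic, so the corrected Chi-squared `p`-value is never smaller than the uncorrected one.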

## Value

`p.values.holistic`, a named numerical vector containing the holistic `p`-values for the global null hypothesis (i.e. x independent of y).

`p.values.resolution.specific`, a named numerical vector containing the resolution-specific `p`-values for the global null hypothesis (i.e. x independent of y).

`res.by.res.pvs`, a data frame that contains the raw and Bonferroni-adjusted resolution-specific `p`-values.

`all.pvs`, a data frame that contains all `p`-values and adjusted `p`-values that are computed. Returned if `return.all.pvs` is `TRUE`.

`all`, a nested list. Each entry is named and contains data about a resolution that was tested. Each resolution is itself a list with the following components:

- `cuboids`, a summary of all tested cuboids in a resolution.
- `tables`, a summary of all 2x2 contingency tables in a resolution.
- `pv`, a numerical vector containing the `p`-values from the tests of independence on the 2x2 contingency tables in `tables` that meet the criteria defined by `min.tbl.tot`, `min.row.tot` and `min.col.tot`. The length of `pv` is equal to the number of rows of `tables`.
- `pv.correct`, similar to `pv`; the corrected `p`-values, computed and returned when `correct` is `TRUE`.
- `rank.tests`, a logical vector that indicates whether or not a test was ranked among the top `M` tests in a resolution. The length of `rank.tests` is equal to the number of rows of `tables`.
- `parent.cuboids`, an integer vector indicating which cuboids in a resolution are associated with the ranked tests and will be further halved in the next higher resolution.
- `parent.tests`, a logical vector of the same length as the number of rows of `tables`, indicating whether or not a test was chosen as a parent test (the same tests may have multiple children).
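The adjustments named on this page (Holm for `p.adjust.methods = "H"`, Bonferroni for the resolution-specific `p`-values in `res.by.res.pvs`) are standard procedures that base R exposes through `p.adjust`. The sketch below applies both to a made-up vector of raw `p`-values. It only illustrates the adjustments themselves and is not a reproduction of how MultiFit computes its output.

```r
# Made-up raw p-values for illustration only (not MultiFit output).
raw <- c(0.001, 0.012, 0.030, 0.200)

# Holm step-down adjustment.
holm <- p.adjust(raw, method = "holm")
# Bonferroni adjustment: each p-value multiplied by the number of tests,
# capped at 1.
bonf <- p.adjust(raw, method = "bonferroni")

print(rbind(raw = raw, holm = holm, bonferroni = bonf))
```

Holm-adjusted values are never larger than the Bonferroni-adjusted ones, which is why Holm is the less conservative of the two.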

## Examples

```
library(MultiFit)

set.seed(1)
n = 300
Dx = Dy = 2
x = matrix(0, nrow = n, ncol = Dx)
y = matrix(0, nrow = n, ncol = Dy)
# First margins of x and y are independent noise.
x[,1] = rnorm(n)
x[,2] = runif(n)
y[,1] = rnorm(n)
# Second margin of y depends on the second margin of x through a sine.
y[,2] = sin(5 * pi * x[ , 2]) + 1 / 5 * rnorm(n)
fit = MultiFIT(x = x, y = y, verbose = TRUE)
w = MultiSummary(x = x, y = y, fit = fit, alpha = 0.0001)
```
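The components listed under Value can be read directly off the returned list. The following sketch assumes the MultiFit package is installed and passes the data through the `xy` list interface described under Arguments. The helper `holistic_reject` is hypothetical (it is not part of MultiFit) and relies only on the documented `p.values.holistic` component.

```r
# Hypothetical helper (not part of MultiFit): flag which holistic p-values
# fall below a chosen level, using the documented `p.values.holistic`
# component of the returned list.
holistic_reject <- function(fit, level = 0.05) {
  fit$p.values.holistic < level
}

# Run only when the package is available.
if (requireNamespace("MultiFit", quietly = TRUE)) {
  set.seed(1)
  n <- 300
  x <- matrix(runif(2 * n), ncol = 2)
  y <- cbind(rnorm(n), sin(5 * pi * x[, 2]) + rnorm(n) / 5)

  # First list element is x, second is y, as described for `xy`.
  fit <- MultiFit::MultiFIT(xy = list(x = x, y = y))

  print(fit$p.values.holistic)   # holistic p-values for the global null
  print(holistic_reject(fit))    # which of them fall below 0.05
  print(head(fit$all.pvs))       # present because return.all.pvs = TRUE
}
```

Keeping the decision rule in a small function makes it easy to reuse the same level across several fits.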

MultiFit documentation built on Jan. 18, 2022, 9:06 a.m.