| dhsic.test | R Documentation |
Independence test based on dHSIC
dhsic.test(X, Y, K, alpha = 0.05, method = "permutation",
kernel = "gaussian", B = 1000, pairwise = FALSE, bandwidth = 1,
matrix.input = FALSE)
X |
either a list of at least two numeric matrices or a
single numeric matrix. The rows of a matrix correspond to the
observations of a variable. It is always required that there are
an equal number of observations for all variables (i.e. all
matrices have to have the same number of rows). If |
Y |
a numeric matrix if |
K |
a list of the gram matrices corresponding to each
variable. If |
alpha |
a numeric value in (0,1) specifying the significance level of the hypothesis test. |
method |
a character string specifying the type of hypothesis test used. The available options are: "gamma" (gamma approximation based test), "permutation" (permutation test (slow)), "bootstrap" (bootstrap test (slow)) and "eigenvalue" (eigenvalue based test). |
kernel |
a vector of character strings specifying the kernels
for each variable. There exist two pre-defined kernels:
"gaussian" (Gaussian kernel with median heuristic as bandwidth)
and "discrete" (discrete kernel). User defined kernels can also
be used by passing the function name as a string, which will
then be matched using |
B |
an integer value specifying the number of Monte-Carlo
iterations made in the permutation and bootstrap test. Only
relevant if |
pairwise |
a logical value indicating whether one should use HSIC with pairwise comparisons instead of dHSIC. Can only be true if there are more than two variables. |
bandwidth |
a numeric value specifying the size of the bandwidth used for the Gaussian kernel. Only used if kernel="gaussian.fixed". |
matrix.input |
a boolean. If |
Hypothesis test for finding statistically significant evidence of dependence between several variables. Uses the d-variable Hilbert-Schmidt independence criterion (dHSIC) as a measure of dependence. Several types of hypothesis tests are included. The null hypothesis (H_0) is that all variables are jointly independent.
The d-variable Hilbert-Schmidt independence criterion is a direct extension of the standard Hilbert-Schmidt independence criterion (HSIC) from two variables to an arbitrary number of variables. For characteristic kernels such as the Gaussian kernel, the population dHSIC is 0 if and only if the variables are jointly independent.
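To make the criterion concrete, the following base R sketch computes the V-statistic estimator of dHSIC described in Pfister et al. (2018) from a list of Gram matrices. It is an illustration only, not the package's implementation: the helper names gauss_gram and dhsic_v are hypothetical, and the Gaussian kernel uses a fixed bandwidth of 1 rather than the median heuristic.

```r
# Gaussian Gram matrix with a fixed bandwidth (hypothetical helper;
# the package's "gaussian" kernel uses the median heuristic instead).
gauss_gram <- function(x, bw = 1) {
  exp(-as.matrix(dist(x))^2 / (2 * bw^2))
}

# V-statistic estimator of dHSIC from a list of d Gram matrices (each n x n).
dhsic_v <- function(grams) {
  n <- nrow(grams[[1]])
  d <- length(grams)
  # mean over (i, j) of the product of all kernels
  term1 <- sum(Reduce(`*`, grams)) / n^2
  # product of the per-variable kernel means
  term2 <- prod(sapply(grams, sum)) / n^(2 * d)
  # cross term: for each i, product over variables of the row sums
  term3 <- sum(Reduce(`*`, lapply(grams, rowSums))) / n^(d + 1)
  term1 + term2 - 2 * term3
}

set.seed(1)
x <- matrix(rnorm(50), ncol = 1)
y <- matrix(rnorm(50), ncol = 1)
z <- x + y + matrix(rnorm(50, sd = 0.1), ncol = 1)  # depends on x and y
grams <- lapply(list(x, y, z), gauss_gram)
dhsic_v(grams)
```

For d = 2 this expression reduces to the usual biased HSIC estimator trace(K H L H) / n^2, where H is the centering matrix.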
Four statistical hypothesis tests are implemented, all with
null hypothesis (H_0: X[[1]],...,X[[d]] are jointly
independent) and alternative hypothesis (H_A:
X[[1]],...,X[[d]] are not jointly independent):
1. Permutation test for dHSIC: exact level, slow
2. Bootstrap test for dHSIC: pointwise asymptotic level and pointwise consistent, slow
3. Gamma approximation based test for dHSIC: only approximate, fast
4. Eigenvalue based test for dHSIC: pointwise asymptotic level and pointwise consistent, medium
The null hypothesis is rejected if statistic is strictly
greater than crit.value.
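The idea behind the permutation test and the rejection rule can be sketched in base R for the two-variable case. This is a simplified illustration under stated assumptions, not the package's code: the helpers gauss_gram, hsic_stat and perm_test are hypothetical, the kernel bandwidth is fixed at 1, the critical value is taken as the empirical (1 - alpha) quantile of the permuted statistics, and the p-value uses the common (1 + count) / (B + 1) convention, which may differ from what dhsic.test does internally.

```r
# Gaussian Gram matrix with a fixed bandwidth (hypothetical helper).
gauss_gram <- function(x, bw = 1) {
  exp(-as.matrix(dist(x))^2 / (2 * bw^2))
}

# Biased empirical HSIC for two Gram matrices: trace(K H L H) / n^2.
hsic_stat <- function(K, L) {
  n <- nrow(K)
  H <- diag(n) - matrix(1 / n, n, n)  # centering matrix
  sum((H %*% K %*% H) * L) / n^2
}

# Permutation test: reshuffle the observations of y to mimic independence.
perm_test <- function(x, y, B = 200, alpha = 0.05) {
  K <- gauss_gram(x)
  L <- gauss_gram(y)
  n <- nrow(K)
  stat <- hsic_stat(K, L)
  perm <- replicate(B, {
    idx <- sample(n)
    hsic_stat(K, L[idx, idx])  # permuting rows and columns permutes y
  })
  crit <- unname(quantile(perm, 1 - alpha))
  p <- (1 + sum(perm >= stat)) / (B + 1)
  list(statistic = stat, crit.value = crit, p.value = p,
       reject = stat > crit)  # reject only if strictly greater
}

set.seed(1)
x <- matrix(rnorm(60), ncol = 1)
y <- x^2 + matrix(rnorm(60, sd = 0.1), ncol = 1)  # nonlinear dependence
res <- perm_test(x, y)
res$p.value
```

The cost grows linearly in B because every permutation recomputes the statistic, which is why the permutation and bootstrap tests are marked as slow.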
If X is a list with d matrices, the function tests for
joint independence of the corresponding d random vectors. If
X is a matrix and matrix.input is TRUE, the
function tests the independence between the columns of
X. If X is a matrix and matrix.input is
FALSE, then Y needs to be a matrix too; in this case, the
function tests the (pairwise) independence between the
corresponding two random vectors.
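The three input modes described above might be called as follows. This is a minimal sketch that assumes the dHSIC package is installed (the block is skipped otherwise); the data are made up for illustration, and method = "gamma" is chosen only to keep the run time short.

```r
if (requireNamespace("dHSIC", quietly = TRUE)) {
  library(dHSIC)
  set.seed(1)
  x <- matrix(rnorm(100), ncol = 1)
  y <- x + matrix(rnorm(100), ncol = 1)  # depends on x
  z <- matrix(rnorm(100), ncol = 1)      # independent of x and y

  # 1. List input: joint independence of the d = 3 variables
  p_list <- dhsic.test(list(x, y, z), method = "gamma")$p.value

  # 2. Two matrices: independence between X and Y
  p_xy <- dhsic.test(x, y, method = "gamma")$p.value

  # 3. Single matrix with matrix.input = TRUE: independence between columns
  p_cols <- dhsic.test(cbind(x, y, z), matrix.input = TRUE,
                       method = "gamma")$p.value

  print(c(list = p_list, xy = p_xy, columns = p_cols))
}
```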
For more details see the references.
A list containing the following components:
statistic |
the value of the test statistic |
crit.value |
critical value of the hypothesis test. The null
hypothesis (H_0: joint independence) is rejected if
|
p.value |
p-value of the hypothesis test, i.e. the
probability that a random version of the test statistic is greater
than |
time |
numeric
vector containing computation times. |
bandwidth |
bandwidth used during the computation. Only relevant if Gaussian kernel was used. |
Niklas Pfister and Jonas Peters
Gretton, A., K. Fukumizu, C. H. Teo, L. Song, B. Schölkopf and A. J. Smola (2007). A kernel statistical test of independence. In Advances in Neural Information Processing Systems (pp. 585-592).
Pfister, N., P. Bühlmann, B. Schölkopf and J. Peters (2018). Kernel-based Tests for Joint Independence. Journal of the Royal Statistical Society, Series B.
To compute only the test statistic without
p-values, use the function dhsic.
### pairwise independent but not jointly independent (pairwise HSIC vs dHSIC)
library(dHSIC)
set.seed(0)
# x and y are independent fair coin flips
x <- matrix(rbinom(100, 1, 0.5), ncol = 1)
y <- matrix(rbinom(100, 1, 0.5), ncol = 1)
# z is the XOR of x and y plus Gaussian noise
z <- matrix(as.numeric((x + y) == 1) + rnorm(100), ncol = 1)
X <- list(x, y, z)
# pairwise HSIC tests
dhsic.test(X, method = "permutation",
           kernel = c("discrete", "discrete", "gaussian"),
           pairwise = TRUE, B = 1000)$p.value
# joint dHSIC test
dhsic.test(X, method = "permutation",
           kernel = c("discrete", "discrete", "gaussian"),
           pairwise = FALSE, B = 1000)$p.value