knn_mi: kNN Mutual Information Estimators


Description

Computes mutual information based on the distribution of nearest neighborhood distances. Available methods are KSG1 and KSG2, as described by Kraskov et al. (2004), and the Local Non-Uniformity Corrected (LNC) KSG, as described by Gao et al. (2015). The LNC method is based on KSG2 but applies PCA-based volume corrections to adjust for observed non-uniformity in the local neighborhood of each point in the sample.

Usage

knn_mi(data, splits, options)

Arguments

data

Matrix of sample observations, each row is an observation.

splits

A vector that describes which sets of columns in data to compute the mutual information between. For example, to compute the mutual information between two scalar variables use splits = c(1,1). To compute the redundancy among multiple random variables use splits = rep(1, ncol(data)). To compute the mutual information between two random vectors, list the dimension of each vector.

options

A list that specifies the estimator and its necessary parameters (see details).
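To make the splits argument concrete, here is a brief sketch (assuming the rmi package is installed) that estimates the mutual information between a 2-dimensional random vector and a scalar; the data values are made up for illustration:

```r
library(rmi)

set.seed(1)
x1 <- rnorm(500)
x2 <- rnorm(500)
y  <- x1 + x2 + rnorm(500)

# splits = c(2, 1): columns 1-2 form one random vector, column 3 the other,
# so this estimates I((x1, x2); y)
knn_mi(cbind(x1, x2, y), c(2, 1), options = list(method = "KSG1", k = 5))
```

Note that the entries of splits must sum to ncol(data), since every column is assigned to exactly one variable.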

Details

Currently available methods are LNC, KSG1 and KSG2.

For KSG1 use: options = list(method = "KSG1", k = 5)

For KSG2 use: options = list(method = "KSG2", k = 5)

For LNC use: options = list(method = "LNC", k = 10, alpha = 0.65); LNC requires k > ncol(data).
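As a sketch of how the three option lists above are used in practice (assuming the rmi package is installed), the estimators can be applied to the same correlated Gaussian sample, for which the analytic mutual information -0.5*log(1 - cor^2) provides a reference point; the sample itself is made up for illustration:

```r
library(rmi)

set.seed(42)
x <- rnorm(1000)
y <- 0.8 * x + rnorm(1000)  # correlated Gaussian pair
d <- cbind(x, y)

knn_mi(d, c(1, 1), options = list(method = "KSG1", k = 5))
knn_mi(d, c(1, 1), options = list(method = "KSG2", k = 5))
# LNC requires k > ncol(d); here k = 10 > 2
knn_mi(d, c(1, 1), options = list(method = "LNC", k = 10, alpha = 0.65))

# analytic reference value for jointly Gaussian data
-0.5 * log(1 - cor(x, y)^2)
```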

Author

Isaac Michaud, North Carolina State University, ijmichau@ncsu.edu

References

Gao, S., Ver Steeg, G., & Galstyan, A. (2015). Efficient estimation of mutual information for strongly dependent variables. Artificial Intelligence and Statistics: 277-286.

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6): 066138.

Examples

set.seed(123)
x <- rnorm(1000)
y <- x + rnorm(1000)
knn_mi(cbind(x, y), c(1, 1), options = list(method = "KSG2", k = 6))

set.seed(123)
x <- rnorm(1000)
y <- 100 * x + rnorm(1000)
knn_mi(cbind(x, y), c(1, 1), options = list(method = "LNC", alpha = 0.65, k = 10))
# approximate analytic value of the mutual information
-0.5 * log(1 - cor(x, y)^2)

z <- rnorm(1000)
# redundancy I(x;y;z) is approximately the same as I(x;y)
knn_mi(cbind(x, y, z), c(1, 1, 1), options = list(method = "LNC", alpha = c(0.5, 0, 0, 0), k = 10))
# mutual information I((x,y);z) is approximately 0
knn_mi(cbind(x, y, z), c(2, 1), options = list(method = "LNC", alpha = c(0.5, 0.65, 0), k = 10))

rmi documentation built on May 2, 2019, 3:27 a.m.
