# knn_mi: kNN Mutual Information Estimators In rmi: Mutual Information Estimators

## Description

Computes mutual information based on the distribution of nearest-neighbor distances. The available methods are KSG1 and KSG2, as described by Kraskov et al. (2004), and the Local Non-Uniformity Corrected (LNC) KSG, as described by Gao et al. (2015). The LNC method is based on KSG2 but applies PCA volume corrections to adjust for observed non-uniformity in the local neighborhood of each point in the sample.

## Usage

    knn_mi(data, splits, options)

## Arguments

- `data`: Matrix of sample observations; each row is an observation.
- `splits`: A vector that describes which sets of columns in `data` to compute the mutual information between. For example, to compute the mutual information between two variables, use `splits = c(1,1)`. To compute the redundancy among multiple random variables, use `splits = rep(1,ncol(data))`. To compute the mutual information between two random vectors, list the dimension of each vector.
- `options`: A list that specifies the estimator and its necessary parameters (see Details).
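As an illustration of how `splits` is read, the sketch below lists which columns each entry would cover. It assumes columns are consumed left to right, which is consistent with the `c(2,1)` example further down the page. `split_columns` is a hypothetical helper written for this illustration and is not part of rmi.

```r
# Hypothetical helper (not part of rmi): list the column indices that
# each entry of `splits` is assumed to cover, reading columns left to right.
split_columns <- function(splits) {
  block <- rep(seq_along(splits), times = splits)
  unname(split(seq_len(sum(splits)), block))
}

split_columns(c(1, 1))     # two scalar variables: columns 1 and 2
split_columns(c(2, 1))     # a 2-d vector (columns 1-2) and a scalar (column 3)
split_columns(rep(1, 3))   # three scalar variables, one column each
```

The entries of `splits` should therefore sum to `ncol(data)`.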

## Details

The currently available methods are LNC, KSG1, and KSG2.

For KSG1 use: options = list(method = "KSG1", k = 5)

For KSG2 use: options = list(method = "KSG2", k = 5)

For LNC use: options = list(method = "LNC", k = 10, alpha = 0.65). LNC requires k > ncol(data).
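Because the LNC method requires `k > ncol(data)`, it can be convenient to check that condition before calling the estimator. The following is a minimal sketch of such a check. `lnc_options` is a hypothetical helper, not a function provided by rmi, and the defaults simply mirror the values shown above.

```r
# Hypothetical helper (not part of rmi): build an LNC options list and
# enforce the documented requirement k > ncol(data).
lnc_options <- function(data, k = 10, alpha = 0.65) {
  if (k <= ncol(data)) {
    stop("LNC requires k > ncol(data); got k = ", k,
         " with ", ncol(data), " columns")
  }
  list(method = "LNC", k = k, alpha = alpha)
}

data <- cbind(x = rnorm(50), y = rnorm(50))
opts <- lnc_options(data)   # k = 10 exceeds the 2 columns, so this passes
# knn_mi(data, c(1, 1), options = opts)   # requires the rmi package
```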

## Author

Isaac Michaud, North Carolina State University, ijmichau@ncsu.edu

## References

Gao, S., Ver Steeg, G., & Galstyan, A. (2015). Efficient estimation of mutual information for strongly dependent variables. Artificial Intelligence and Statistics: 277-286.

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E 69(6): 066138.

## Examples

    library(rmi)

    set.seed(123)
    x <- rnorm(1000)
    y <- x + rnorm(1000)
    knn_mi(cbind(x,y),c(1,1),options = list(method = "KSG2", k = 6))

    set.seed(123)
    x <- rnorm(1000)
    y <- 100*x + rnorm(1000)
    knn_mi(cbind(x,y),c(1,1),options = list(method = "LNC", alpha = 0.65, k = 10))
    #approximate analytic value of mutual information
    -0.5*log(1-cor(x,y)^2)

    z <- rnorm(1000)
    #redundancy I(x;y;z) is approximately the same as I(x;y)
    knn_mi(cbind(x,y,z),c(1,1,1),options = list(method = "LNC", alpha = c(0.5,0,0,0), k = 10))
    #mutual information I((x,y);z) is approximately 0
    knn_mi(cbind(x,y,z),c(2,1),options = list(method = "LNC", alpha = c(0.5,0.65,0), k = 10))
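For reference, the analytic value used in the examples is the standard mutual information of a bivariate Gaussian pair with correlation $\rho$, in nats (natural logarithm, matching R's `log()`). The population values below are worked out from the simulation design rather than from package output, so the sample-based estimates should only be close to them.

$$
I(X;Y) = -\tfrac{1}{2}\log\left(1-\rho^{2}\right)
$$

With $Y = aX + \varepsilon$, where $X$ and $\varepsilon$ are independent standard normals, $\rho^{2} = a^{2}/(a^{2}+1)$, so

$$
I(X;Y) = \tfrac{1}{2}\log\left(1+a^{2}\right).
$$

For $a = 1$ this gives $\tfrac{1}{2}\log 2 \approx 0.347$, and for $a = 100$ it gives $\tfrac{1}{2}\log 10001 \approx 4.605$.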

rmi documentation built on May 2, 2019, 3:27 a.m.