# dmi: Calculate BCMI for categorical (discrete) data

In mpmi: Mixed-Pair Mutual Information Estimators

## Description

This function calculates the mutual information (MI) and bias-corrected mutual information (BCMI) between every pair of discrete variables held as columns in a matrix. The bias correction uses the jackknife, and a z-score is provided for the hypothesis of no association. Also included are the `*.pw` functions, which calculate MI between two vectors only. The `*njk` functions do not perform the jackknife and are therefore faster.
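To make the quantity concrete, here is a minimal base-R sketch of a plug-in MI estimate for two discrete vectors. The helper `plugin_mi` is hypothetical and is not part of mpmi; the use of natural logarithms (nats) is an assumption, and the package's own compiled implementation may differ in detail.

```r
# Hypothetical helper, not part of mpmi: plug-in MI estimate for two
# discrete vectors, using natural logarithms (an assumption).
plugin_mi <- function(x, y) {
  tab <- table(x, y)
  joint <- tab / sum(tab)          # empirical joint probabilities
  px <- rowSums(joint)             # marginal distribution of x
  py <- colSums(joint)             # marginal distribution of y
  expected <- outer(px, py)        # joint probabilities under independence
  nz <- joint > 0                  # treat 0 * log(0) as 0
  sum(joint[nz] * log(joint[nz] / expected[nz]))
}

x <- c("a", "a", "b", "b")
mi_same  <- plugin_mi(x, x)                      # equals the entropy of x, log(2)
mi_indep <- plugin_mi(x, c("u", "v", "u", "v"))  # independent in this sample: 0
```

When both arguments are the same vector the estimate reduces to the entropy of that vector, which is a quick sanity check on any MI implementation.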

## Usage

```r
dmi(dmat)
dminjk(dmat)
dmi.pw(disc1, disc2)
dminjk.pw(disc1, disc2)
```

## Arguments

| Argument | Description |
|---|---|
| `dmat` | The data matrix. Each row is an observation and each column is a variable of interest. It should contain categorical data; all types of data will be coerced via factors to integers. |
| `disc1` | A vector for the pairwise version. |
| `disc2` | A vector for the pairwise version. |
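The coercion "via factors to integers" can be pictured with base R alone. The sketch below assumes the coercion is equivalent to `as.integer(factor(x))` applied to each column; that is an illustration of the idea, not a statement about the package's internal code.

```r
# Assumed equivalent of the coercion: each distinct value becomes a factor
# level, and each observation is replaced by its level's integer code.
x <- c("low", "high", "low", "mid")
codes <- as.integer(factor(x))   # levels sort as high, low, mid

# The same idea applied column by column to a data frame.
dat <- mtcars[, c("cyl", "am")]
coded <- vapply(dat, function(col) as.integer(factor(col)), integer(nrow(dat)))
```

The integer codes carry no ordering information for MI purposes; only which observations share a category matters.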

## Details

The results of `dmi()` are in many ways similar to a correlation matrix, with each row and column index corresponding to a given variable. `dminjk()` and `dminjk.pw()` return only the MI values, without performing the jackknife.

The number of processor cores used can be changed by setting the environment variable `OMP_NUM_THREADS` before starting R.
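Because the variable has to be set before R starts, it can be handy to confirm what the current session inherited. The snippet below only inspects the value with base R; the shell invocation shown in the comment is an assumed example, not something taken from the package documentation.

```r
# Inspect the thread setting inherited from the environment that launched R.
# Assumed shell usage (outside R), for example: OMP_NUM_THREADS=4 R
threads <- Sys.getenv("OMP_NUM_THREADS", unset = NA)

if (is.na(threads)) {
  message("OMP_NUM_THREADS is not set; the OpenMP runtime picks its default")
} else {
  message("OMP_NUM_THREADS = ", threads)
}
```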

## Value

Returns a list of three matrices, each of size `ncol(dmat)` by `ncol(dmat)`:

| Component | Description |
|---|---|
| `mi` | The raw MI estimates. |
| `bcmi` | Jackknife bias-corrected MI estimates (BCMI). Each is the MI value minus the corresponding jackknife estimate of bias. |
| `zvalues` | Z-scores for each hypothesis that the corresponding `bcmi` value is zero. These have poor statistical properties but can be useful as a rough measure of the strength of association. |
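The relationship between the three components can be sketched for a single pair of variables using the textbook leave-one-out jackknife. The helpers `plugin_mi` and `jackknife_mi` are hypothetical and not part of mpmi, and the exact bias and standard-error formulas used by the package are an assumption here; treat this as an illustration rather than a reimplementation.

```r
# Hypothetical helpers, not part of mpmi.
plugin_mi <- function(x, y) {
  tab <- table(x, y)
  joint <- tab / sum(tab)
  expected <- outer(rowSums(joint), colSums(joint))
  nz <- joint > 0                              # treat 0 * log(0) as 0
  sum(joint[nz] * log(joint[nz] / expected[nz]))
}

jackknife_mi <- function(x, y) {
  n <- length(x)
  mi <- plugin_mi(x, y)
  # Leave-one-out MI estimates, dropping one observation at a time
  loo <- vapply(seq_len(n), function(i) plugin_mi(x[-i], y[-i]), numeric(1))
  bias <- (n - 1) * (mean(loo) - mi)                   # jackknife bias estimate
  se <- sqrt((n - 1) / n * sum((loo - mean(loo))^2))   # jackknife standard error
  bcmi <- mi - bias                                    # raw MI minus estimated bias
  list(mi = mi, bcmi = bcmi, zvalue = bcmi / se)
}

res <- jackknife_mi(mtcars$cyl, mtcars$gear)
res
```

The leave-one-out loop calls the estimator once per observation, which illustrates why the variants that skip the jackknife are faster.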

## Examples

```r
library(mpmi)

data(cars)

# Discretise the data first
d <- cut(cars$dist, breaks = 10)
s <- cut(cars$speed, breaks = 10)

# Discrete MI values
dmi.pw(s, d)

# For comparison, analysed as continuous data:
cmi.pw(cars$dist, cars$speed)

# Exploring a group of categorical variables
dat <- mtcars[, c("cyl", "vs", "am", "gear", "carb")]
discresults <- dmi(dat)
discresults

# Plot the relative magnitude of the BCMI values
diag(discresults$bcmi) <- NA
mp(discresults$bcmi)
```

mpmi documentation built on May 30, 2017, 7:23 a.m.