# MINDID: The (Multipoint) Morisita Index for Intrinsic Dimension... In IDmining: Intrinsic Dimension for Data Mining

## Description

Estimates the intrinsic dimension of data using the Morisita estimator of intrinsic dimension.

## Usage

```r
MINDID(X, scaleQ = 1:5, mMin = 2, mMax = 2)
```

## Arguments

| Argument | Description |
| --- | --- |
| `X` | An N x E `matrix`, `data.frame` or `data.table`, where N is the number of data points and E is the number of variables (or features). Each variable is rescaled to the [0,1] interval by the function. |
| `scaleQ` | A vector of at least two values. It contains the values of l^(-1) chosen by the user (by default: `scaleQ = 1:5`). |
| `mMin` | The minimum value of m (by default: `mMin = 2`). |
| `mMax` | The maximum value of m (by default: `mMax = 2`). |

## Details

1. l is the edge length of the grid cells (or quadrats). Since the variables (and consequently the grid) are rescaled to the [0,1] interval, l is equal to 1 for a grid consisting of only one cell.

2. l^(-1) is the number of grid cells (or quadrats) along each axis of the Euclidean space in which the data points are embedded.

3. l^(-1) is equal to Q^(1/E) where Q is the number of grid cells and E is the number of variables (or features).

4. l^(-1) is directly related to delta (see References).

5. delta is the diagonal length of the grid cells.
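The relations in the list above can be checked numerically. The following is a minimal sketch in base R, not code from the package. It assumes E = 3 purely for illustration and uses the fact that the diagonal of an E-dimensional cube with edge length l is l * sqrt(E).

```r
# Illustrative values only: E is an assumption, scaleQ is the documented default.
E <- 3                 # number of variables (assumed for this example)
scaleQ <- 1:5          # values of l^(-1)

l <- 1 / scaleQ        # edge length of a grid cell on the [0,1] scale
Q <- scaleQ^E          # total number of grid cells, so l^(-1) = Q^(1/E)
delta <- l * sqrt(E)   # diagonal length of a grid cell

grid_info <- data.frame(scaleQ, l, Q, delta, ln_delta = log(delta))
print(grid_info)
```

With a single cell (`scaleQ = 1`) the edge length is 1 and delta equals sqrt(E), the diagonal of the unit hypercube.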

## Value

A list of two elements:

1. a `data.frame` containing the ln value of the m-Morisita index for each value of ln(delta) and m. The values of ln(delta) are provided with regard to the [0,1] interval.

2. a `data.frame` containing the values of Sm and Mm for each value of m.
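To make Sm and Mm more concrete, here is a rough base R sketch for the case m = 2. It is not the package's implementation. It assumes the definitions given in Golay and Kanevski (2015), which this page does not restate: S2 is the slope of ln(I) against ln(1/delta), where I is the classical Morisita index of the grid, and M2 = E - S2 / (m - 1). The helper name `morisita_id_m2` is hypothetical.

```r
# Hypothetical helper, NOT the IDmining code: Morisita ID estimate for m = 2.
# Assumes no variable is constant (otherwise the rescaling divides by zero).
morisita_id_m2 <- function(X, scaleQ = 1:5) {
  X <- as.matrix(X)
  N <- nrow(X)
  E <- ncol(X)
  # Rescale each variable to the [0,1] interval
  X <- apply(X, 2, function(v) (v - min(v)) / (max(v) - min(v)))
  ln_I <- sapply(scaleQ, function(q) {
    # Cell index along each axis; points equal to 1 go into the last cell
    idx <- pmin(floor(X * q), q - 1)
    # Number of points in each non-empty cell
    n <- as.numeric(table(apply(idx, 1, paste, collapse = "-")))
    # Classical Morisita index with Q = q^E cells
    log(q^E * sum(n * (n - 1)) / (N * (N - 1)))
  })
  ln_inv_delta <- log(scaleQ / sqrt(E))  # ln(1/delta), with delta = l * sqrt(E)
  S2 <- unname(coef(lm(ln_I ~ ln_inv_delta))[2])
  c(S2 = S2, M2 = E - S2 / (2 - 1))
}

# Points on a straight line embedded in 2 dimensions
u <- (seq_len(1200) - 0.5) / 1200
line2d <- cbind(x = u, y = u)
est <- morisita_id_m2(line2d)
print(round(est, 2))
```

For points on a line, the estimate M2 should come out close to 1, while data filling the unit square uniformly would give a slope near 0 and an estimate near E = 2.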

## Author(s)

Jean Golay jeangolay@gmail.com

## References

J. Golay and M. Kanevski (2015). A new estimator of intrinsic dimension based on the multipoint Morisita index, Pattern Recognition 48 (12):4070–4081.

J. Golay, M. Leuenberger and M. Kanevski (2017). Feature selection for regression problems based on the Morisita estimator of intrinsic dimension, Pattern Recognition 70:126–138.

J. Golay and M. Kanevski (2017). Unsupervised feature selection based on the Morisita estimator of intrinsic dimension, Knowledge-Based Systems 135:125–134.

J. Golay, M. Leuenberger and M. Kanevski (2015). Morisita-based feature selection for regression problems. Proceedings of the 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges (Belgium).

## Examples

```r
library(IDmining)

sim_dat <- SwissRoll(1000)

scaleQ <- 1:15 # It starts with a grid of 1^E cell (or quadrat).
               # It ends with a grid of 15^E cells (or quadrats).
mMI_ID <- MINDID(sim_dat, scaleQ[5:15])

# The second list element holds Sm and Mm (see Value); column 3 is taken to be Mm.
print(paste("The ID estimate is equal to", round(mMI_ID[[2]][1,3], 2)))
```

### Example output

```
[1] "The ID estimate is equal to 2.02"
```

IDmining documentation built on May 3, 2021, 9:08 a.m.