RenDim: Rényi's Generalized Dimensions

Description

Estimates Rényi's generalized dimensions (also called Rényi's dimensions of qth order). The result for q = 2 is the one mainly used as an estimate of the intrinsic dimension of data.

Usage

RenDim(X, scaleQ=1:5, qMin=2, qMax=2)

Arguments

X

An N x E matrix, data.frame or data.table, where N is the number of data points and E is the number of variables (or features). Each variable is rescaled to the [0,1] interval by the function.

scaleQ

A vector of at least two values of l^(-1) chosen by the user (by default: scaleQ = 1:5).

qMin

The minimum value of q (by default: qMin = 2).

qMax

The maximum value of q (by default: qMax = 2).

Details

  1. l is the edge length of the grid cells (or quadrats). Since the variables (and consequently the grid) are rescaled to the [0,1] interval, l is equal to 1 for a grid consisting of only one cell.

  2. l^(-1) is the number of grid cells (or quadrats) along each axis of the Euclidean space in which the data points are embedded.

  3. l^(-1) is equal to Q^(1/E) where Q is the number of grid cells and E is the number of variables (or features).

  4. l^(-1) is directly related to delta (see References).

  5. delta is the diagonal length of the grid cells.
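The relations listed above can be illustrated with a short sketch. The helper `grid_geometry` is a hypothetical name introduced here for illustration and is not part of IDmining. It assumes hypercubic cells in [0,1]^E, so that the cell diagonal is delta = l * sqrt(E).

```r
# Illustrative helper (hypothetical, not part of IDmining).
# For data rescaled to [0,1]^E, it relates each value of l^(-1)
# to the edge length l, the number of cells Q and the diagonal delta.
grid_geometry <- function(scaleQ, E) {
  l     <- 1 / scaleQ      # edge length of a grid cell
  Q     <- scaleQ^E        # total number of grid cells, so l^(-1) = Q^(1/E)
  delta <- l * sqrt(E)     # diagonal length of a hypercubic cell
  data.frame(l_inv = scaleQ, l = l, Q = Q,
             delta = delta, ln_delta = log(delta))
}

# Default scales of RenDim for a data set with E = 2 variables
geom <- grid_geometry(scaleQ = 1:5, E = 2)
print(geom)
```

With E = 2, the first row corresponds to a single cell (l = 1, Q = 1) whose diagonal is sqrt(2).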

Value

A list of two elements:

  1. a data.frame containing the value of Rényi's information of qth order (computed using the natural logarithm) for each value of ln(delta) and q. The values of ln(delta) are provided with regard to the [0,1] interval.

  2. a data.frame containing the value of Dq for each value of q.
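To make the link between the two returned elements concrete, here is a minimal base R sketch of a box-counting estimate. It is not the IDmining source code: the function name `renyi_dim_sketch` is hypothetical, and it assumes the standard definition of Rényi's information of order q (q != 1), ln(sum(p_i^q)) / (1 - q), where p_i is the proportion of points in cell i, with Dq taken as minus the slope of that information against ln(delta).

```r
# Sketch only (hypothetical helper, not the IDmining implementation).
# Assumes q != 1 and that no variable is constant.
renyi_dim_sketch <- function(X, scaleQ = 1:5, q = 2) {
  stopifnot(q != 1, length(scaleQ) >= 2)
  X <- as.matrix(X)
  E <- ncol(X)
  # Rescale each variable to the [0,1] interval
  X01 <- apply(X, 2, function(x) (x - min(x)) / diff(range(x)))
  info <- sapply(scaleQ, function(k) {
    # Cell index along each axis (0 to k-1); points at 1 go to the last cell
    idx  <- pmin(floor(X01 * k), k - 1)
    # One label per point identifying the cell it falls in
    cell <- apply(idx, 1, paste, collapse = "_")
    # Proportion of points in each occupied cell
    p    <- as.numeric(table(cell)) / nrow(X01)
    # Renyi's information of order q (natural logarithm)
    log(sum(p^q)) / (1 - q)
  })
  ln_delta <- log(sqrt(E) / scaleQ)   # delta = l * sqrt(E)
  fit <- lm(info ~ ln_delta)
  list(info = data.frame(ln_delta = ln_delta, info = info),
       Dq   = -unname(coef(fit)[2]))
}

# A regular 60 x 60 lattice filling the unit square
x   <- (0:59) / 59
sq  <- as.matrix(expand.grid(x, x))
res <- renyi_dim_sketch(sq, scaleQ = 1:5, q = 2)
res$Dq   # expected to be close to 2 for a filled square
```

The lattice is used instead of random points so that every cell holds the same number of points at each scale, which makes the log-log relation exactly linear.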

Author(s)

Jean Golay jeangolay@gmail.com

References

C. Traina Jr., A. J. M. Traina, L. Wu and C. Faloutsos (2000). Fast feature selection using fractal dimension. Proceedings of the 15th Brazilian Symposium on Databases (SBBD 2000), João Pessoa (Brazil).

E. P. M. De Sousa, C. Traina Jr., A. J. M. Traina, L. Wu and C. Faloutsos (2007). A fast and effective method to find correlations among attributes in databases, Data Mining and Knowledge Discovery 14(3):367-407.

J. Golay and M. Kanevski (2015). A new estimator of intrinsic dimension based on the multipoint Morisita index, Pattern Recognition 48(12):4070-4081.

H. Hentschel and I. Procaccia (1983). The infinite number of generalized dimensions of fractals and strange attractors, Physica D 8(3):435-444.

Examples

sim_dat <- SwissRoll(1000)

scaleQ <- 1:15 # It starts with a grid of 1^E cell (or quadrat).
               # It ends with a grid of 15^E cells (or quadrats).
qRI_ID <- RenDim(sim_dat[,c(1,2)], scaleQ[5:15])

# Per the Value section, the second list element holds Dq for each q.
print(paste("The ID estimate is equal to",round(qRI_ID[[2]][1,2],2)))

jeangolay/IDmining documentation built on May 6, 2021, 10:49 a.m.