dtm: Distance to Measure Function

dtm    R Documentation

Description

The function dtm evaluates the "distance to measure" (DTM) function at each point of Grid, using the uniform empirical measure on a set of points X. Given a probability measure P, the distance to measure function, for each y \in R^d, is defined by

d_{m0}(y) = \left(\frac{1}{m0}\int_0^{m0} ( G_y^{-1}(u))^{r} du\right)^{1/r},

where G_y(t) = P( \Vert X-y \Vert \le t) is the distribution function of the distance from y to a random point X drawn from P, and m0 \in (0,1) and r \in [1,\infty) are tuning parameters. As m0 increases, the DTM function becomes smoother, so m0 can be understood as a smoothing parameter. The parameter r has a smaller effect, but it also changes the DTM function. The DTM can be seen as a smoothed version of the distance function. See Details and References.
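
As a small worked case of the definition above (an illustration under stated assumptions, not taken from the references), let P = (\delta_a + \delta_b)/2 put mass 1/2 on each of two points a and b, and let y satisfy \Vert a-y \Vert \le \Vert b-y \Vert. Reading G_y^{-1} as the generalized inverse G_y^{-1}(u) = \inf\{t : G_y(t) \ge u\}, one gets

G_y^{-1}(u) = \Vert a-y \Vert for u \in (0, 1/2], \qquad G_y^{-1}(u) = \Vert b-y \Vert for u \in (1/2, 1],

so that

d_{m0}(y) = \Vert a-y \Vert if m0 \le 1/2,

d_{m0}(y) = \left(\frac{1}{m0}\left(\frac{1}{2} \Vert a-y \Vert^{r} + \left(m0 - \frac{1}{2}\right) \Vert b-y \Vert^{r}\right)\right)^{1/r} if m0 > 1/2.

For m0 \le 1/2 the DTM coincides with the distance from y to the nearer point; for larger m0 the farther point is blended in, which is the smoothing effect of m0.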

Given X=\{x_1, \dots, x_n\}, the empirical version of the distance to measure is

\hat d_{m0}(y) = \left(\frac{1}{k} \sum_{x_i \in N_k(y)} \Vert x_i-y \Vert^{r}\right)^{1/r},

where k= \lceil m0 * n \rceil and N_k(y) is the set containing the k nearest neighbors of y among x_1, \ldots, x_n.
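
The empirical formula above can be sketched in base R as follows. This is a brute-force illustration of the displayed formula only, not the package's implementation: the helper name dtm_sketch is hypothetical, weights are ignored, and its output may differ from that of dtm in edge cases (for instance when m0 * n is not an integer).

```r
# Hypothetical helper (not part of the package): brute-force empirical
# distance to measure, following the displayed formula with
# k = ceiling(m0 * n) and uniform weights.
dtm_sketch <- function(X, Grid, m0, r = 2) {
  X <- as.matrix(X)
  Grid <- as.matrix(Grid)
  n <- nrow(X)
  k <- ceiling(m0 * n)
  apply(Grid, 1, function(y) {
    # Euclidean distances from y to every row of X
    d <- sqrt(rowSums(sweep(X, 2, y)^2))
    # distances to the k nearest neighbours of y
    nearest <- sort(d)[seq_len(k)]
    # r-th power mean of those k distances
    mean(nearest^r)^(1 / r)
  })
}

# Four points on the unit circle, evaluated at (0, 0) and (2, 0)
X <- rbind(c(1, 0), c(0, 1), c(-1, 0), c(0, -1))
Grid <- rbind(c(0, 0), c(2, 0))
res <- dtm_sketch(X, Grid, m0 = 0.5)
print(res)  # approximately 1 and sqrt(3)
```

With m0 = 0.5 and n = 4, k = 2. At the origin both nearest neighbours are at distance 1; at (2, 0) the two nearest distances are 1 and sqrt(5), whose squared mean is 3.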

Usage

dtm(X, Grid, m0, r = 2, weight = 1)

Arguments

X

an n by d matrix of coordinates of points used to construct the uniform empirical measure for the distance to measure, where n is the number of points and d is the dimension.

Grid

an m by d matrix of coordinates of points where the distance to measure is computed, where m is the number of points in Grid and d is the dimension.

m0

a numeric value for the smoothing parameter of the distance to measure. Roughly, m0 is the fraction of the points of X that are taken into account when the distance to measure is computed at each point of Grid. The value of m0 should be in (0,1).

r

a numeric variable for the tuning parameter of the distance to measure. The value of r should be in [1,\infty), and the default value is 2.

weight

either a single number or a vector of length n. If it is a single number, the same weight is applied to every point of X. If it is a vector, its i-th entry is the weight of the i-th point of X. The default value is 1.

Details

See (Chazal, Cohen-Steiner, and Merigot, 2011, Definition 3.2) and (Chazal, Massart, and Michel, 2015, Equation (2)) for a formal definition of the "distance to measure" function.

Value

The function dtm returns a vector of length m (the number of points stored in Grid) containing the value of the distance to measure function evaluated at each point of Grid.
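
When Grid is built with expand.grid, a vector with one value per row of Grid can be reshaped into a matrix for plotting. The sketch below assumes the values are returned in the row order of Grid, and it uses a stand-in function (Euclidean distance to the origin) instead of a call to dtm so that it runs without the package; the names vals and Z are illustrative.

```r
# Small rectangular grid; expand.grid varies its first argument fastest
Xseq <- seq(-1, 1, by = 0.5)      # 5 values
Yseq <- seq(-1.5, 1.5, by = 0.5)  # 7 values
Grid <- expand.grid(Xseq, Yseq)

# Stand-in for the vector returned by dtm(X, Grid, m0):
# one value per row of Grid (here, the distance to the origin)
vals <- sqrt(Grid[, 1]^2 + Grid[, 2]^2)

# matrix() fills column by column, so Z[i, j] is the value
# at the grid point (Xseq[i], Yseq[j])
Z <- matrix(vals, nrow = length(Xseq), ncol = length(Yseq))

# Z can then be passed to base graphics, e.g. persp(Xseq, Yseq, Z)
```

Putting Xseq along the rows matches the argument order that base graphics functions such as persp, image, and contour expect.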

Author(s)

Jisu Kim and Fabrizio Lecci

References

Chazal F, Cohen-Steiner D, Merigot Q (2011). "Geometric inference for probability measures." Foundations of Computational Mathematics 11.6, 733-751.

Chazal F, Massart P, Michel B (2015). "Rates of convergence for robust geometric inference."

Chazal F, Fasy BT, Lecci F, Michel B, Rinaldo A, Wasserman L (2014). "Robust Topological Inference: Distance-To-a-Measure and Kernel Distance." Technical Report.

See Also

kde, kernelDist, distFct

Examples

## Load the TDA package, which provides circleUnif and dtm
library(TDA)

## Generate data from the unit circle
n <- 300
X <- circleUnif(n)

## Construct a grid of points over which we evaluate the function
by <- 0.065
Xseq <- seq(-1.6, 1.6, by = by)
Yseq <- seq(-1.7, 1.7, by = by)
Grid <- expand.grid(Xseq, Yseq)

## Distance to measure evaluated at each point of Grid
m0 <- 0.1
DTM <- dtm(X, Grid, m0)

TDA documentation built on May 29, 2024, 1:28 a.m.