# bootstrapBand: Bootstrap Confidence Band (TDA: Statistical Tools for Topological Data Analysis)

## Description

The function `bootstrapBand` computes a uniform symmetric confidence band around a function of the data `X`, evaluated on a `Grid`, using the bootstrap algorithm. See Details and References.

## Usage

```r
bootstrapBand(
    X, FUN, Grid, B = 30, alpha = 0.05, parallel = FALSE,
    printProgress = FALSE, weight = NULL, ...)
```

## Arguments

| Argument | Description |
|---|---|
| `X` | an n by d matrix of coordinates of points used by the function `FUN`, where n is the number of points and d is the dimension. |
| `FUN` | a function that takes an n by d matrix of coordinates `X` and an m by d matrix of coordinates `Grid`, and returns a numeric vector of length m. For example, see `distFct`, `kde`, and `dtm`, which compute the distance function, the kernel density estimator, and the distance to measure over a grid of points, using the input `X`. |
| `Grid` | an m by d matrix of coordinates, where m is the number of points in the grid, at which `FUN` is evaluated. |
| `B` | the number of bootstrap iterations. The default value is `30`. |
| `alpha` | `bootstrapBand` returns a (`1-alpha`) confidence band. The default value is `0.05`. |
| `parallel` | logical: if `TRUE`, the bootstrap iterations are parallelized, using the library `parallel`. The default value is `FALSE`. |
| `printProgress` | logical: if `TRUE`, a progress bar is printed. The default value is `FALSE`. |
| `weight` | either `NULL`, a number, or a vector of length n. If it is `NULL`, no weight is used. If it is a number, the same weight is applied to each point of `X`. If it is a vector, `weight` gives the weight of each point of `X`. The default value is `NULL`. |
| `...` | additional parameters for the function `FUN`. |
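To illustrate the shape that `FUN` is expected to have, here is a minimal sketch of a user-written function that follows the contract above: it takes an n by d matrix `X` and an m by d matrix `Grid`, and returns a numeric vector of length m. The name `nnDist` is hypothetical and not part of the package; it computes the distance from each grid point to its nearest data point, using base R only.

```r
# Hypothetical FUN (not part of TDA): nearest-point distance over a grid.
# X is an n by d matrix, Grid is an m by d matrix; the result has length m.
nnDist <- function(X, Grid) {
  X <- as.matrix(X)
  Grid <- as.matrix(Grid)
  apply(Grid, 1, function(g) {
    # squared distances from grid point g to every row of X,
    # then the square root of the smallest one
    sqrt(min(rowSums(sweep(X, 2, g)^2)))
  })
}

X <- matrix(c(0, 0,
              1, 0), ncol = 2, byrow = TRUE)     # n = 2 points, d = 2
Grid <- matrix(c(0, 0,
                 1, 1,
                 3, 0), ncol = 2, byrow = TRUE)  # m = 3 grid points
nnDist(X, Grid)                                  # numeric vector of length 3
```

A function of this form should be usable as the `FUN` argument, with any extra parameters it needs supplied through `...`.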

## Details

First, the input function `FUN` is evaluated on the `Grid` using the original data `X`. Then, `B` times, the bootstrap algorithm resamples `n` points of `X` with replacement, evaluates `FUN` on the `Grid` using the resample, and computes the ℓ_∞ distance (the maximum absolute difference over the `Grid`) between the original function and the bootstrapped one. The result is a sequence of `B` values. The half-width of the (`1-alpha`) confidence band is the (`1-alpha`) quantile of these values.
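The steps above can be sketched in base R as follows. This is an illustrative re-implementation under stated assumptions, not the package's source: the names `bootstrapBandSketch` and `gaussKde` are hypothetical, and the `weight`, `parallel`, and `printProgress` options are omitted.

```r
# Hypothetical sketch of the bootstrap steps described above (base R only).
bootstrapBandSketch <- function(X, FUN, Grid, B = 30, alpha = 0.05, ...) {
  X <- as.matrix(X)
  n <- nrow(X)
  # Step 1: evaluate FUN on the Grid using the original data
  ff <- FUN(X, Grid, ...)
  # Step 2: B bootstrap replicates of the maximum absolute deviation
  dev <- vapply(seq_len(B), function(b) {
    idx <- sample.int(n, n, replace = TRUE)   # resample n rows with replacement
    ffBoot <- FUN(X[idx, , drop = FALSE], Grid, ...)
    max(abs(ffBoot - ff))                     # l_infinity distance on the Grid
  }, numeric(1))
  # Step 3: the (1 - alpha) quantile is the half-width of the band
  width <- unname(stats::quantile(dev, 1 - alpha))
  list(width = width, fun = ff, band = cbind(ff - width, ff + width))
}

# Toy FUN, assumed for illustration: 1-d Gaussian kernel density estimate
gaussKde <- function(X, Grid, h) {
  vapply(Grid, function(g) mean(stats::dnorm((g - X[, 1]) / h)) / h, numeric(1))
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 1)
Grid <- seq(-3, 3, by = 0.1)
res <- bootstrapBandSketch(X, gaussKde, Grid, B = 50, alpha = 0.05, h = 0.3)
str(res)
```

Because the same `width` is added and subtracted at every grid point, the band is uniform over the `Grid` and symmetric around `fun`.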

## Value

The function `bootstrapBand` returns a list with the following elements:

| Element | Description |
|---|---|
| `width` | number: the (`1-alpha`) quantile of the values computed by the bootstrap algorithm. It corresponds to half of the width of the uniform confidence band; that is, `width` is the distance of the upper and lower limits of the band from the function evaluated using the original dataset `X`. |
| `fun` | a numeric vector of length m, storing the values of the input function `FUN`, evaluated on the `Grid` using the original data `X`. |
| `band` | an m by 2 matrix that stores the values of the lower limit of the confidence band (first column) and the upper limit of the confidence band (second column), evaluated over the `Grid`. |
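As a small illustration of how the three elements relate, the snippet below builds a band from made-up values of `fun` and `width` (m = 4). The numbers and the column names `lower` and `upper` are assumptions for readability; the description above only specifies the order of the two columns.

```r
# Made-up values standing in for the elements of a returned list (m = 4)
fun <- c(0.10, 0.35, 0.20, 0.05)
width <- 0.08
band <- cbind(lower = fun - width, upper = fun + width)

# Each limit sits exactly `width` away from `fun`
all.equal(band[, "upper"] - fun, rep(width, length(fun)))

# For a nonnegative target such as a density, the lower limit
# can be clipped at 0 before plotting
pmax(band[, "lower"], 0)
```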

## Author(s)

Jisu Kim and Fabrizio Lecci

## References

Wasserman L (2004). "All of statistics: a concise course in statistical inference." Springer.

Fasy BT, Lecci F, Rinaldo A, Wasserman L, Balakrishnan S, Singh A (2013). "Statistical Inference For Persistent Homology: Confidence Sets for Persistence Diagrams." (arXiv:1303.7117). Annals of Statistics.

Chazal F, Fasy BT, Lecci F, Michel B, Rinaldo A, Wasserman L (2014). "Robust Topological Inference: Distance-To-a-Measure and Kernel Distance." Technical Report.

## See Also

`kde`, `dtm`

## Examples

```r
# Generate data from mixture of 2 normals.
n <- 2000
X <- c(rnorm(n / 2), rnorm(n / 2, mean = 3, sd = 1.2))

# Construct a grid of points over which we evaluate the function
by <- 0.02
Grid <- seq(-3, 6, by = by)

## bandwidth for kernel density estimator
h <- 0.3
## Bootstrap confidence band
band <- bootstrapBand(X, kde, Grid, B = 80, parallel = FALSE, alpha = 0.05,
    h = h)

plot(Grid, band[["fun"]], type = "l", lwd = 2,
    ylim = c(0, max(band[["band"]])), main = "kde with 0.95 confidence band")
lines(Grid, pmax(band[["band"]][, 1], 0), col = 2, lwd = 2)
lines(Grid, band[["band"]][, 2], col = 2, lwd = 2)
```

TDA documentation built on March 30, 2021, 5:10 p.m.