# bootstrapBand: Bootstrap Confidence Band In TDA: Statistical Tools for Topological Data Analysis

## Bootstrap Confidence Band

### Description

The function `bootstrapBand` computes a uniform symmetric confidence band around a function of the data `X`, evaluated on a `Grid`, using the bootstrap algorithm. See Details and References.

### Usage

```
bootstrapBand(
  X, FUN, Grid, B = 30, alpha = 0.05, parallel = FALSE,
  printProgress = FALSE, weight = NULL, ...)
```

### Arguments

- `X`: an n by d matrix of coordinates of the points used by the function `FUN`, where n is the number of points and d is the dimension.
- `FUN`: a function that takes an n by d matrix of coordinates `X` and an m by d matrix of coordinates `Grid`, and returns a numeric vector of length m. For examples, see `distFct`, `kde`, and `dtm`, which compute the distance function, the kernel density estimator, and the distance to measure over a grid of points, using the input `X`.
- `Grid`: an m by d matrix of coordinates at which `FUN` is evaluated, where m is the number of points in the grid.
- `B`: the number of bootstrap iterations. The default value is `30`.
- `alpha`: `bootstrapBand` returns a (`1-alpha`) confidence band. The default value is `0.05`.
- `parallel`: logical; if `TRUE`, the bootstrap iterations are parallelized using the package `parallel`. The default value is `FALSE`.
- `printProgress`: logical; if `TRUE`, a progress bar is printed. The default value is `FALSE`.
- `weight`: either `NULL`, a number, or a vector of length n. If it is `NULL`, no weights are used. If it is a number, the same weight is applied to each point of `X`. If it is a vector, `weight` gives the weight of each point of `X`. The default value is `NULL`.
- `...`: additional parameters passed to the function `FUN`.
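To make the contract for `FUN` concrete, here is a minimal base-R sketch of a user-defined function with the required shape: it takes an n by d matrix `X` and an m by d matrix `Grid` and returns a numeric vector of length m. The name `nearestDist` is hypothetical and not part of TDA; it only mimics the idea of a distance function and makes no claim about how `distFct` is implemented.

```
# Hypothetical FUN (not part of TDA): Euclidean distance from each grid
# point to its nearest data point, written in base R.
nearestDist <- function(X, Grid) {
  X <- as.matrix(X)
  Grid <- as.matrix(Grid)
  # For each grid row g, subtract g from every row of X, then take the
  # smallest Euclidean norm of the differences.
  apply(Grid, 1, function(g) {
    sqrt(min(rowSums(sweep(X, 2, g)^2)))
  })
}

# Toy inputs: n = 3 points in d = 2, and m = 4 grid points.
X <- rbind(c(0, 0), c(1, 0), c(0, 1))
Grid <- as.matrix(expand.grid(x = c(0, 1), y = c(0, 1)))

vals <- nearestDist(X, Grid)
length(vals) == nrow(Grid)  # one value per grid point
```

Any function with this input and output shape should be usable as `FUN`; extra arguments it needs would be supplied through `...`.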

### Details

First, the input function `FUN` is evaluated on the `Grid` using the original data `X`. Then, `B` times, the bootstrap algorithm resamples `n` points from `X` with replacement, evaluates `FUN` on the `Grid` using the resampled points, and computes the l_∞ distance (the maximum absolute difference over the `Grid`) between the original function and the bootstrapped one. The result is a sequence of `B` values. The (`1-alpha`) quantile of these values is the half-width of the band, and the (`1-alpha`) confidence band is obtained by subtracting it from and adding it to the original function.
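The steps above can be sketched in base R as follows. This is an illustrative reconstruction from the description, not the package source: `sketchBootstrapBand` and `gaussKde` are hypothetical names, weights, parallelism, and progress reporting are omitted, and the toy density estimator is an assumption standing in for `kde`.

```
# Hypothetical helper (not part of TDA): a sketch of the bootstrap band steps.
sketchBootstrapBand <- function(X, FUN, Grid, B = 30, alpha = 0.05, ...) {
  X <- as.matrix(X)
  n <- nrow(X)
  # Step 1: evaluate FUN on the grid using the original data.
  fun <- FUN(X, Grid, ...)
  # Step 2: for each of B resamples (with replacement), record the largest
  # absolute difference between the bootstrapped and original function.
  supDist <- vapply(seq_len(B), function(b) {
    idx <- sample.int(n, size = n, replace = TRUE)
    max(abs(FUN(X[idx, , drop = FALSE], Grid, ...) - fun))
  }, numeric(1))
  # Step 3: the (1 - alpha) quantile of those distances is the half-width.
  width <- unname(quantile(supDist, probs = 1 - alpha))
  list(width = width, fun = fun, band = cbind(fun - width, fun + width))
}

# Toy FUN for illustration only: a one-dimensional Gaussian kernel density
# estimate with bandwidth h (an assumption; TDA's kde is a separate function).
gaussKde <- function(X, Grid, h) {
  vapply(as.numeric(Grid),
         function(g) mean(dnorm((g - X[, 1]) / h)) / h,
         numeric(1))
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 1)
Grid <- seq(-3, 3, by = 0.1)
res <- sketchBootstrapBand(X, gaussKde, Grid, B = 50, alpha = 0.05, h = 0.3)
str(res)
```

Because the half-width is a single number, the band in this sketch has the same width at every grid point, which is what makes it uniform rather than pointwise.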

### Value

The function `bootstrapBand` returns a list with the following elements:

- `width`: a number, the (`1-alpha`) quantile of the values computed by the bootstrap algorithm. It is half the width of the uniform confidence band; that is, `width` is the distance of the upper and lower limits of the band from the function evaluated using the original dataset `X`.
- `fun`: a numeric vector of length m, storing the values of the input function `FUN`, evaluated on the `Grid` using the original data `X`.
- `band`: an m by 2 matrix that stores the lower limit (first column) and the upper limit (second column) of the confidence band, evaluated over the `Grid`.

### Author(s)

Jisu Kim and Fabrizio Lecci

### References

Wasserman L (2004). "All of statistics: a concise course in statistical inference." Springer.

Fasy BT, Lecci F, Rinaldo A, Wasserman L, Balakrishnan S, Singh A (2013). "Statistical Inference For Persistent Homology: Confidence Sets for Persistence Diagrams." (arXiv:1303.7117). Annals of Statistics.

Chazal F, Fasy BT, Lecci F, Michel B, Rinaldo A, Wasserman L (2014). "Robust Topological Inference: Distance-To-a-Measure and Kernel Distance." Technical Report.

### See Also

`kde`, `dtm`

### Examples

```
library(TDA)

# Generate data from a mixture of 2 normals.
n <- 2000
X <- c(rnorm(n / 2), rnorm(n / 2, mean = 3, sd = 1.2))

# Construct a grid of points over which we evaluate the function
by <- 0.02
Grid <- seq(-3, 6, by = by)

## bandwidth for kernel density estimator
h <- 0.3
## Bootstrap confidence band
band <- bootstrapBand(X, kde, Grid, B = 80, parallel = FALSE, alpha = 0.05,
                      h = h)

# Plot the estimate (black) with the band limits (red); the lower limit is
# clipped at 0 because a density cannot be negative.
plot(Grid, band[["fun"]], type = "l", lwd = 2,
     ylim = c(0, max(band[["band"]])), main = "kde with 0.95 confidence band")
lines(Grid, pmax(band[["band"]][, 1], 0), col = 2, lwd = 2)
lines(Grid, band[["band"]][, 2], col = 2, lwd = 2)
```

TDA documentation built on Feb. 16, 2023, 6:35 p.m.