id_significant: Identify Significant Features in Persistent Homology

View source: R/inference.R

Description

An empirical (bootstrap) method for distinguishing signal from noise among the features of a persistent homology calculation, based on the magnitude of each feature's persistence relative to the others. Note: this function requires at least 5 features of the given dimension.
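To make the idea concrete, the following is a minimal sketch of one way a bootstrap threshold on persistence could be computed in base R. It is an illustration under stated assumptions, not the TDAstats implementation (the actual code in R/inference.R may differ): the helper name `bootstrap_threshold` is hypothetical, the choice to bootstrap the mean persistence is an assumption, and the argument names simply mirror the documented signature.

```r
# Hypothetical helper, NOT the TDAstats implementation: bootstrap the mean
# persistence and use an upper percentile of it as a significance threshold.
bootstrap_threshold <- function(features, dim = 1, reps = 100, cutoff = 0.975) {
  # keep only features of the requested dimension (column 1 = dimension)
  feats <- features[features[, 1] == dim, , drop = FALSE]
  if (nrow(feats) < 5) {
    stop("need at least 5 features of dimension ", dim)
  }
  # persistence = death - birth (columns 3 and 2)
  persistence <- feats[, 3] - feats[, 2]
  # resample persistence with replacement; record the mean of each replicate
  boot_means <- vapply(seq_len(reps), function(i) {
    mean(sample(persistence, size = length(persistence), replace = TRUE))
  }, numeric(1))
  # threshold = the `cutoff` percentile of the bootstrap distribution
  unname(quantile(boot_means, probs = cutoff))
}

# toy input (made up for illustration): nine short-lived features, one long-lived
set.seed(1)
toy <- data.frame(dimension = rep(1, 10),
                  birth = rep(0.1, 10),
                  death = c(rep(0.15, 9), 1.2))
thresh <- bootstrap_threshold(toy, dim = 1, reps = 200)
```

With this toy input the threshold falls between the persistence of the short-lived features and that of the single long-lived one, so only the latter would be flagged.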

Usage

id_significant(features, dim = 1, reps = 100, cutoff = 0.975)

Arguments

features

n x 3 data frame of features (one row per feature); the first column must be dimension, the second birth, and the third death

dim

dimension of features of interest

reps

number of bootstrap replicates

cutoff

percentile cutoff (as a proportion between 0 and 1) past which features are considered significant
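Before calling the function it can help to confirm that the input matches the layout described for `features` and that enough features of the chosen dimension are present. The sketch below uses only base R; `check_features` and `min_features` are hypothetical names introduced for illustration and are not part of TDAstats.

```r
# Hypothetical helper (not part of TDAstats): check the documented layout and
# report whether enough features of the requested dimension are present.
check_features <- function(features, dim = 1, min_features = 5) {
  # three columns expected: dimension, birth, death
  stopifnot(is.data.frame(features), ncol(features) == 3)
  # a feature cannot die before it is born (column 3 >= column 2)
  stopifnot(all(features[, 3] >= features[, 2]))
  # count features of the dimension of interest (column 1)
  sum(features[, 1] == dim) >= min_features
}

# toy data frame, made up for illustration
feats <- data.frame(dimension = c(0, 0, 1, 1, 1, 1, 1),
                    birth = c(0, 0, 0.2, 0.3, 0.1, 0.4, 0.25),
                    death = c(0.1, 0.3, 0.25, 0.9, 0.15, 0.45, 0.3))
ok_dim1 <- check_features(feats, dim = 1)  # TRUE: five features of dimension 1
ok_dim0 <- check_features(feats, dim = 0)  # FALSE: only two features of dimension 0
```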

Examples

library(TDAstats)

# get dataset (noisy circle) and calculate persistent homology
angles <- runif(100, 0, 2 * pi)
x <- cos(angles) + rnorm(100, mean = 0, sd = 0.1)
y <- sin(angles) + rnorm(100, mean = 0, sd = 0.1)
annulus <- cbind(x, y)
phom <- calculate_homology(annulus)

# find threshold of significance
# expecting 1 significant feature of dimension 1 (Betti-1 = 1 for annulus)
thresh <- id_significant(features = as.data.frame(phom),
                         dim = 1,
                         reps = 500,
                         cutoff = 0.975)

# generate flat persistence diagram (vertical axis shows persistence)
# every feature whose persistence exceeds `thresh` is considered significant
plot_persist(phom, flat = TRUE)
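The example above computes `thresh` but does not use it to select features. A minimal base R sketch of that last step follows, assuming the returned value is on the persistence (death minus birth) scale, as the comment in the example suggests. `significant_features` is a hypothetical helper, not a TDAstats function, and the toy values are made up for illustration.

```r
# Hypothetical helper (not part of TDAstats): keep features of one dimension
# whose persistence (death - birth) exceeds a threshold.
significant_features <- function(features, thresh, dim = 1) {
  persistence <- features[, 3] - features[, 2]
  keep <- features[, 1] == dim & persistence > thresh
  features[keep, , drop = FALSE]
}

# toy values for illustration only
toy <- data.frame(dimension = c(0, 1, 1, 1),
                  birth = c(0, 0.1, 0.2, 0.3),
                  death = c(0.5, 0.15, 1.1, 0.35))
sig <- significant_features(toy, thresh = 0.4, dim = 1)
```

With the objects from the example, the equivalent call would be `significant_features(as.data.frame(phom), thresh, dim = 1)`.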

rrrlw/TDAstats documentation built on Nov. 24, 2021, 3:53 a.m.