computeStats | R Documentation |
For a given persistence diagram D=\{(b_i,d_i)\}_{i=1}^N (corresponding to a specified homological dimension), computeStats() calculates descriptive statistics of the birth, death, midpoint (the average of birth and death), and lifespan (death minus birth) values. Additionally, it computes the total number of points and the entropy of the lifespan values. Points in D with infinite death values are ignored.
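The row selection described above can be sketched in base R. This is only an illustration on a hand-made toy diagram, assuming the three-column layout (homological dimension, birth, death); the names keep and D_sub are hypothetical, and this is not the package's internal code.

```r
# Hypothetical toy diagram: columns are homological dimension, birth, death
D <- rbind(
  c(0, 0.0, 0.5),
  c(0, 0.0, Inf),  # infinite death value: excluded from the statistics
  c(1, 0.3, 0.9)
)

# Keep only rows of the requested dimension whose death value is finite
homDim <- 0
keep <- D[, 1] == homDim & is.finite(D[, 3])
D_sub <- D[keep, , drop = FALSE]  # drop = FALSE keeps a matrix even for one row

nrow(D_sub)
```

Here only the first row survives: the second has an infinite death value and the third belongs to a different dimension.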
computeStats(D, homDim)
D | a persistence diagram: a matrix with three columns containing the homological dimension, birth and death values, respectively. |
homDim | the homological dimension (0 for |
The function extracts the rows of D whose first column equals homDim and computes the mean, standard deviation, median, IQR (interquartile range), range, and 10th, 25th, 75th and 90th percentiles of the birth, death, midpoint and lifespan (or persistence) values. It also computes the total number of bars (or points in the diagram) and the entropy of the lifespan values, -\sum_{i=1}^N\frac{l_i}{L}\log_2\left(\frac{l_i}{L}\right), where l_i=d_i-b_i is the lifespan of the i-th point and L=\sum_{i=1}^N l_i. If D does not contain any points corresponding to homDim, a vector of zeros is returned.
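As a rough base-R sketch of the quantities described above (not the package's implementation), the helpers describe() and lifespan_entropy() below are hypothetical names. The sketch assumes the sample standard deviation (sd), R's default quantile type, "range" meaning maximum minus minimum, and strictly positive lifespans; the package may use different conventions.

```r
# Hypothetical helper: nine summary statistics of a numeric vector x
describe <- function(x) {
  # Default quantile type 7 is an assumption, not confirmed by the package
  q <- unname(quantile(x, probs = c(0.10, 0.25, 0.75, 0.90)))
  c(mean   = mean(x),
    stddev = sd(x),            # sample standard deviation (assumption)
    median = median(x),
    iqr    = IQR(x),
    range  = max(x) - min(x),  # assumed to mean max minus min
    p10 = q[1], p25 = q[2], p75 = q[3], p90 = q[4])
}

# Entropy of the lifespans: -sum((l_i / L) * log2(l_i / L)), L = sum(l_i)
lifespan_entropy <- function(l) {
  p <- l / sum(l)
  -sum(p * log2(p))
}

# Toy birth and death values (finite deaths only)
birth <- c(0, 0, 0.5)
death <- c(1, 1, 2.5)
midpoint <- (birth + death) / 2
lifespan <- death - birth  # 1, 1, 2

describe(lifespan)
lifespan_entropy(lifespan)  # 1.5
```

For the toy lifespans 1, 1 and 2, the normalized weights are 1/4, 1/4 and 1/2, so the entropy is 2(1/4)(2) + (1/2)(1) = 1.5.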
A (named) 38-dimensional numeric vector containing:
mean_births, stddev_births, median_births, iqr_births, range_births, p10_births, p25_births, p75_births, p90_births: Descriptive statistics for birth values.
mean_deaths, stddev_deaths, median_deaths, iqr_deaths, range_deaths, p10_deaths, p25_deaths, p75_deaths, p90_deaths: Descriptive statistics for death values.
mean_midpoints, stddev_midpoints, median_midpoints, iqr_midpoints, range_midpoints, p10_midpoints, p25_midpoints, p75_midpoints, p90_midpoints: Descriptive statistics for midpoint values (mean of birth and death values).
mean_lifespans, stddev_lifespans, median_lifespans, iqr_lifespans, range_lifespans, p10_lifespans, p25_lifespans, p75_lifespans, p90_lifespans: Descriptive statistics for lifespan (or persistence) values (difference between death and birth values).
total_bars: The total number of points in the specified homological dimension.
entropy: The entropy of the lifespan values.
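The element names listed above follow a regular pattern (nine statistics for each of four quantities, plus two scalars), which can be reproduced in base R as a quick check of the length. The variable names below are hypothetical, and the ordering is assumed to match the listing.

```r
# Nine statistic prefixes and four quantity suffixes, as listed above
stat_names <- c("mean", "stddev", "median", "iqr", "range",
                "p10", "p25", "p75", "p90")
groups <- c("births", "deaths", "midpoints", "lifespans")

# outer() builds a 9 x 4 matrix of names; as.vector() reads it column by
# column, giving all birth statistics first, then deaths, and so on
vec_names <- c(as.vector(outer(stat_names, groups, paste, sep = "_")),
               "total_bars", "entropy")

length(vec_names)  # 9 * 4 + 2 = 38
```

Such a name vector can be handy for selecting a group of entries from the result, for example all names ending in "_lifespans" via grepl("_lifespans$", vec_names).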
Umar Islambekov
1. Ali, D., Asaad, A., Jimenez, M.J., Nanda, V., Paluzo-Hidalgo, E. and Soriano-Trigueros, M., (2023). A survey of vectorization methods in topological data analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence.
N <- 100 # The number of points to sample
set.seed(123) # Set a random seed for reproducibility
# Sample N points uniformly from the unit circle and add Gaussian noise
theta <- runif(N, min = 0, max = 2 * pi)
X <- cbind(cos(theta), sin(theta)) + rnorm(2 * N, mean = 0, sd = 0.2)
# Compute the persistence diagram using the Rips filtration built on top of X
# The 'threshold' parameter specifies the maximum distance for building simplices
D <- TDAstats::calculate_homology(X, threshold = 2)
# Compute statistics for homological dimension H_0
computeStats(D, homDim = 0)
# Compute statistics for homological dimension H_1
computeStats(D, homDim = 1)