knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The betticurves packege implements betti curves from topological data analysis. It computes the curve based on a homology diagram. It accepts diagrams from TDA packege and TDAstat packege.
My intention is to refactor all the code I used during my master thesis at CIMAT (Guanajuato, Mexico) and at the same time learn how to put it in a R package and upload it in GitHub. In my thesis I analyse how betti curve can be implemented in a data science framework. My work can be seen on This link. Unfortunatelly by the time this documentation is been written (jan 2019) my work is only available in spanish. I am working on a English draft. You can also contact me via Git Hub if you are more interested on my work.
By now the package is not available in CRAN. So in order to use it in your R session you can install first the devtools
packages (documentation here) and then use the following code:
# install package from GitHub devtools::install_github("gonzalezgouveia/betticurves", force=TRUE, build_opts = c("--no-resave-data", "--no-manual"))
The github repo link here. This version may be unstable and it is under development.
By now, the packege only have two functions for computations and one for visualization:
compute_betti_number
is for Betti number of a persistence diagram for a desired dimension and radius.compute_betti_curve
Compute Betti curves of a persistence diagram until a desired dimension, a maximum radius, and output vector lenght. So betti_number is been used inside betti_curve function. compute_betti_curve
will return a tibble which is ready to be plotted using the plot_betti_curve
function. Let's take a look at an example of the basic workflow. I will asume you have already downloaded the package from GitHub. This example uses the function calculate_homology
from the TDAstats package (documentation here). You can install this package directly from CRAN.
So first we load the packages in the R session
library("betticurves") library("TDAstats")
In this example we will the data from a two dimentional uniform point process. So it will have 100 uniformly distributed random points for the first coordinate and the same for the second coordinate.
set.seed(12321) data <- cbind(runif(100), runif(100)) plot(data, main='Data for the example')
Next, we calculate the homology diagram. Here I decided to use the TDAstat
package because it is faster due to an implementation of Ripser algorithm, but it works fine with diagrams from TDA
packages(ripsDiag
or alphaComplexDiag
), just take a look at the commented line to know how to extract the matrix from the 'diagram' object of TDA.
diag <- TDAstats::calculate_homology(data) # for TDAstat package # diag <- TDA::alphaComplexDiag(data)$diagram # for TDA package betti_curve <- betticurves::compute_betti_curves(diag) print(summary(betti_curve))
The betti_curve tibble has three variables:
dim
: the dimensions of the homology. Here only 0 and 1.radius
: the radius of the complex filtration. Remember points are generated in the interval [0,1].value
: the betti number for a radius. We have circa 100 points.We can see the resulting curves with the function plot_betti_curve
plot_betti_curve(betti_curve) + ggplot2::ggtitle('Betti curves for data')
The tibble have two values in dimentions corresponding to the first and second homology dimention. The radius of the filtration of the complex varies from 0 to approximatelly 0.3. The value of the curve is reflected on the y-axis. As expected, the dimension zero (connected components) vanish to zero as the radius grows. The 1-dimension loops grows and then decay.
Please use the issues of the repository or send me a private message with GitHub or LinkedIn https://www.linkedin.com/in/gonzalezgouveia/
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.