knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Overview

The betticurves package implements Betti curves from topological data analysis. It computes the curves from a homology (persistence) diagram, and accepts diagrams produced by the TDA and TDAstats packages.
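For reference, one common way to write down a Betti curve is sketched below. The notation is my own choice for this note and is not taken from the package: D_k stands for the collection of (birth, death) pairs of dimension k in the diagram, and the half-open convention at the endpoints is an assumption that may differ from what the package does.

```latex
\beta_k(r) \;=\; \#\bigl\{\, (b_i, d_i) \in D_k \;:\; b_i \le r < d_i \,\bigr\}
```

In words, the curve of dimension k at radius r counts the features of that dimension that have been born but have not yet died at r.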

What is the objective behind the package?

My intention is to refactor all the code I used during my master's thesis at CIMAT (Guanajuato, Mexico) and, at the same time, learn how to put it in an R package and upload it to GitHub. In my thesis I analyse how Betti curves can be used in a data science framework. My work can be seen at this link. Unfortunately, at the time this documentation is being written (January 2019), my work is only available in Spanish. I am working on an English draft. You can also contact me via GitHub if you are interested in my work.

How to install the package from GitHub?

The package is not yet available on CRAN. To use it in your R session, first install the devtools package (documentation here) and then run the following code:

# install package from GitHub
devtools::install_github("gonzalezgouveia/betticurves", force=TRUE, build_opts = c("--no-resave-data", "--no-manual"))

The GitHub repo is linked here. This version may be unstable, as it is still under development.
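Before installing, it can help to check which of the packages used in this document are already present. The following is only a minimal sketch: missing_packages is a hypothetical helper written for this note, not a function of betticurves or devtools.

```r
# Hypothetical helper (not part of betticurves or devtools):
# return the packages in `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- missing_packages(c("devtools", "TDAstats"))
if (length(needed) > 0) {
  message("Missing packages: ", paste(needed, collapse = ", "))
  # install.packages(needed)  # uncomment to install them from CRAN
}
```

Both devtools and TDAstats are on CRAN, so install.packages is enough for them; only betticurves itself needs the GitHub route.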

How to use betticurves?

For now, the package has only two functions for computation and one for visualization. The example below uses compute_betti_curves for computation and plot_betti_curve for visualization.

Example workflow

Let's take a look at an example of the basic workflow. I will assume you have already installed the package from GitHub. This example uses the function calculate_homology from the TDAstats package (documentation here). You can install that package directly from CRAN.

First, we load the packages in the R session:

library("betticurves")
library("TDAstats")

In this example we will use data from a two-dimensional uniform point process: 100 points whose first and second coordinates are each drawn independently from a uniform distribution on [0, 1].

set.seed(12321)
data <- cbind(runif(100), runif(100))
plot(data, main='Data for the example')

Next, we calculate the homology diagram. Here I decided to use the TDAstats package because it is faster, thanks to its implementation of the Ripser algorithm. The function also works with diagrams from the TDA package (ripsDiag or alphaComplexDiag); take a look at the commented line to see how to extract the matrix from the 'diagram' object of TDA.

diag <- TDAstats::calculate_homology(data)       # for the TDAstats package
# diag <- TDA::alphaComplexDiag(data)$diagram    # for the TDA package
betti_curve <- betticurves::compute_betti_curves(diag)
print(summary(betti_curve))

The betti_curve tibble has three variables:

We can see the resulting curves with the function plot_betti_curve:

plot_betti_curve(betti_curve) + ggplot2::ggtitle('Betti curves for data')

The tibble contains two homology dimensions, 0 and 1. The radius of the filtration of the complex varies from 0 to approximately 0.3. The value of the curve is shown on the y-axis. As expected, the dimension 0 curve (connected components) decreases to zero as the radius grows. The dimension 1 curve (loops) grows and then decays.
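To make this behaviour concrete, here is a minimal base R sketch that counts, for each radius, how many features of a given dimension are alive. It is an illustration under stated assumptions and not the implementation used by compute_betti_curves: the toy diagram is made up, betti_curve_by_hand is a hypothetical helper, and a feature is counted as alive when birth <= radius < death.

```r
# Toy persistence diagram, one row per feature (dimension, birth, death).
# The values are invented for illustration only.
toy_diagram <- matrix(
  c(0, 0.0000, 0.1250,
    0, 0.0000, 0.2500,
    0, 0.0000, 0.3750,
    1, 0.1875, 0.3125,
    1, 0.2500, 0.4375),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("dimension", "birth", "death"))
)

# Hypothetical helper: number of features of dimension `k` alive at each radius,
# where "alive" is assumed to mean birth <= radius < death.
betti_curve_by_hand <- function(diagram, k, radii) {
  rows <- diagram[diagram[, "dimension"] == k, , drop = FALSE]
  vapply(
    radii,
    function(r) sum(rows[, "birth"] <= r & r < rows[, "death"]),
    integer(1)
  )
}

radii <- seq(0, 0.5, by = 0.125)
b0 <- betti_curve_by_hand(toy_diagram, 0, radii)  # 3 2 1 0 0
b1 <- betti_curve_by_hand(toy_diagram, 1, radii)  # 0 0 2 1 0
```

On this toy diagram the dimension 0 counts only decrease, while the dimension 1 counts rise and then fall, which is the same qualitative shape described above.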

Contact me

Please use the issues page of the repository, or send me a private message on GitHub or LinkedIn: https://www.linkedin.com/in/gonzalezgouveia/



gonzalezgouveia/CosmoBetti documentation built on May 29, 2019, 8 a.m.