library(gasper)
if (as.numeric(R.version$minor)>6){
  RNGkind(sample.kind = "Rounding")
}
set.seed(434343)
data(grid1)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width=3,
  fig.height=3, 
  fig.align="center"
)

\begin{abstract} We present a short tutorial on to the use of the \proglang{R} \pkg{gasper} package. Gasper is a package dedicated to signal processing on graphs. It also provides an interface to the SuiteSparse Matrix Collection. \end{abstract}

Introduction

The emerging field of Graph Signal Processing (GSP) aims to bridge the gap between signal processing and spectral graph theory. One of the objectives is to generalize fundamental analysis operations from regular grid signals to irregular structures in the form of graphs. There is an abundant literature on GSP, in particular we refer the reader to @shuman2013emerging and @ortega2018graph for an introduction to this field and an overview of recent developments, challenges and applications. GSP has also given rise to numerous applications in machine/deep learning: convolutional neural networks (CNN) on graphs @bruna2013spectral, @henaff2015deep, @defferrard2016convolutional, semi-supervised classification with graph CNN @kipf2016semi, @hamilton2017inductive, community detection @tremblay2014graph, to name just a few.

Different software programs exist for processing signals on graphs, in different languages. The Graph Signal Processing toolbox (GSPbox) is an easy to use matlab toolbox that performs a wide variety of operations on graphs. This toolbox was port to Python as the PyGSP @perraudin2014gspbox. There is also another matlab toolbox the Spectral Graph Wavelet Transform (SGWT) toolbox dedicated to the implementation of the SGWT developed in @hammond2011wavelets. However, to our knowledge, there are not yet any tools dedicated to GSP in \proglang{R}. A first version of the \pkg{gasper} package is available online\footnote{https://github.com/fabnavarro/gasper}. In particular, it includes the methodology and codes\footnote{https://github.com/fabnavarro/SGWT-SURE} developed in @de2019data and provides an interface to the SuiteSparse Matrix Collection @davis2011university.

Graphs Collection and Visualization

A certain number of graphs are present in the package. They are stored as an Rdata file which contains a list consisting of the graph's weight matrix $W$ (in the form of a sparse matrix denoted by sA) and the coordinates associated with the graph (if it has any).

An interface is also provided. It allows to retrieve the matrices related to many problems provided by the SuiteSparse Matrix Collection (formerly known as the University of Florida Sparse Matrix Collection) @davis2011university, @kolodziej19. This collection is a large and actively growing set of sparse matrices that arise in real applications (as structural engineering, computational fluid dynamics, computer graphics/vision, optimization, economic and financial modeling, mathematics and statistics, to name just a few). For more details see https://sparse.tamu.edu/.

The download_graph function allows to download a graph from this collection, based on the name of the graph and the name of the group that provides it. An example is given below

matrixname <- "grid1"
groupname <- "AG-Monien"
download_graph(matrixname, groupname)
attributes(grid1)

The output is stored (in a temporary folder) as a list composed of:

str(grid1$sA)
head(grid1$xy, 3)
grid1$dim
list.files(grid1$temp)

Information about the matrix that can be display via file.show(paste(grid1$temp,"grid1",sep="")) for example or in the console:

cat(readLines(paste(grid1$temp,"grid1",sep=""), n=14), sep = "\n")

download_graph function has an optional svd argument; setting "svd = TRUE" downloads a ".mat" file containing the singular values of the matrix, if available.

For further insights, the get_graph_info function retrieve detailed information about the matrix from the SuiteSparse Matrix Collection website. get_graph_info fetches the three tables with "Matrix Information", "Matrix Properties," and "SVD Statistics", providing a comprehensive overview of the matrix (rvest package needs to be installed).

matrix_info <- get_graph_info(matrixname, groupname)
matrix_info
graph_info <- get_graph_info(matrixname, groupname)
knitr::kables(list(knitr::kable(graph_info[[2]], valign = 't'),
                   knitr::kable(graph_info[[3]], valign = 't')))

The package also allows to plot a (planar) graph using the function plot_graph. It also contains a function to plot signals defined on top of the graph plot_signal.

f <- rnorm(nrow(grid1$sA))
plot_graph(grid1)
plot_signal(grid1, f, size = 2)

Example of application to denoising

We give an example of an application in the case of the denoising of a noisy signal $f$ defined on a graph $G$ with set of vertices $V$. More precisely, the (unnormalized) graph Laplacian matrix $\La\in\R^{V\times V}$ associated with $G$ is the symmetric matrix defined as $\La=\D - \W$, where $\W$ is the matrix of weights with coefficients $(w_{ij}){i,j\in V}$, and $\D$ the diagonal matrix with diagonal coefficients $\D{ii}= \sum_{j\in V} w_{ij}$. A signal $f$ on the graph $G$ is a function $f:V\rightarrow \R$.

The degradation model can be written as [ \tilde f = f + \xi, ] where $\xi\sim\mathcal{N}(0,\sigma^2)$. The purpose of denoising is to build an estimator $\hat f$ of $f$ that depends only on $\tilde f$.

A simple way to construct an effective non-linear estimator is obtained by thresholding the SGWT coefficients of $f$ on a frame (see @hammond2011wavelets for details about the SGWT).

A general thresholding operator $\tau$ with threshold parameter $t\geq 0$ applied to some signal $f$ is defined as \begin{equation}\label{eq:tau} \tau(x,t)=x\max { 1-t^{\beta}|x|^{-\beta},0 }, \end{equation} with $\beta \geq 1$. The most popular choices are the soft thresholding ($\beta=1$), the James-Stein thresholding ($\beta=2$) and the hard thresholding ($\beta=\infty$).

Given the Laplacian and a given frame, denoising in this framework can be summarized as follows:

Each of these steps can be performed via one of the functions analysis, synthesis, beta_thresh. Laplacian is given by the function laplacian_mat. The tight_frame function allows the construction of a tight frame based on @gobel2018construction and @coulhon2012heat. In order to select a threshold value, we consider the method developed in @de2019data which consists in determining the threshold that minimizes the Stein unbiased risk estimator (SURE) in a graph setting (see @de2019data for more details).

We give an illustrative example on the grid1 graph from the previous section. We start by calculating, the Laplacian matrix (from the adjacency matrix), its eigendecomposition and the frame coefficients.

A <- grid1$sA
L <- laplacian_mat(A)
val1 <- eigensort(L)
evalues <- val1$evalues
evectors <- val1$evectors
#- largest eigenvalue
lmax <- max(evalues)
#- parameter that controls the scale number
b <- 2
tf <- tight_frame(evalues, evectors, b=b)

Wavelet frames can be seen as special filter banks. The tight-frame considered here is a finite collection $(\psi_j)_{j=0, \ldots,J}$ forming a finite partition of unity on the compact $[0,\lambda_1]$, where $\lambda_1$ is the largest eigenvalue of the Laplacian spectrum $\mathrm{sp}(\La)$. This partition is defined as follows: let $\omega : \mathbb R^+ \rightarrow [0,1]$ be some function with support in $[0,1]$, satisfying $\omega \equiv 1$ on $[0,b^{-1}]$, for some $b>1$, and set \begin{equation} \psi_0(x)=\omega(x)~~\textrm{and}~~\psi_j(x)=\omega(b^{-j}x)-\omega(b^{-j+1}x)~~\textrm{for}~~j=1, \ldots, J,~~\textrm{where}~~J= \left \lfloor \frac{\log \lambda_1}{\log b} \right \rfloor + 2. \end{equation} Thanks to Parseval's identity, the following set of vectors is a tight frame: [ \mathfrak F = \left { \sqrt{\psi_j}(\La)\delta_i, j=0, \ldots, J, i \in V \right }. ] The plot_filter function allows to represent the elements (filters) of this partition.

plot_filter(lmax,b)

The SGWT of a signal $f \in \mathbb R^V$ is given by [ \WT f = \left ( \sqrt{\psi_0}(\La)f^{T},\ldots,\sqrt{\psi_J}(\La)f^{T} \right )^{T} \in \mathbb R^{n(J+1)}. ] The adjoint linear transformation $\WT^\ast$ of $\WT$ is: [ \WT^\ast \left (\eta_{0}^{T}, \eta_{1}^{T}, \ldots, \eta_{J}^T \right )^{T} = \sum_{j\geq 0} \sqrt{\psi_j}(\La)\eta_{j}. ] The tightness of the underlying frame implies that $\WT^\ast \WT=\mathrm{Id}{\mathbb R^V}$ so that a signal $f \in \mathbb R^V$ can be recovered by applying $\WT^\ast$ to its wavelet coefficients $((\WT f)_i){i=1, \ldots, n(J+1)} \in \mathbb R^{n(J+1)}$.

Then, noisy observations $\tilde f$ are generated from a random signal $f$.

n <- nrow(L)
f <- randsignal(0.01, 3, A)
sigma <- 0.01
noise <- rnorm(n, sd = sigma)
tilde_f <- f + noise

Below is a graphical representation of the original signal and its noisy version.

plot_signal(grid1, f, size = 2)
plot_signal(grid1, tilde_f, size = 2)

We compute the SGWT transforms $\WT \tilde f$ and $\WT f$.

wcn <- analysis(tilde_f,tf)
wcf <- analysis(f,tf)

An alternative to avoid frame calculation is given by the forward_sgwt function which provides a fast forward SGWT. For example:

wcf <- forward_sgwt(f, evalues, evectors, b=b)

The optimal threshold is then determined by minimizing the SURE (using Donoho and Johnstone's trick @donoho1995adapting which remains valid here, see @de2019data). More precisely, the SURE for a general thresholding process $h$ is given by the following identity
\begin{equation} \mathbf{SURE}(h)=-n \sigma^2 + \|h(\widetilde F)-\widetilde F\|^2 + 2 \sum_{i,j=1}^{n(J+1)} \gamma_{i,j}^2 \partial_j h_i(\widetilde F), \end{equation} where $\gamma_{i,j}^2=\sigma^2(\WT \WT ^\ast)_{i,j}$ that can be computed from the frame (or estimated via Monte-Carlo simulation). The SURE_thresh/SURE_MSEthreshallow to evaluate the SURE (in a global fashion) considering the general thresholding operator $\tau$ \eqref{eq:tau}. These functions provide two different ways of applying the threshold, "uniform" and "dependent" (\emph{i.e.}, the same threshold for each coefficient vs a threshold normalized by the variance of each coefficient). The second approach generally provides better results (especially when the weights have been calculated via the frame). A comparative example of these two approaches is given below (with $\beta=2$ James-Stein attenuation threshold).

diagWWt <- colSums(t(tf)^2)
thresh <- sort(abs(wcn))
opt_thresh_d <- SURE_MSEthresh(wcn, 
                           wcf, 
                           thresh, 
                           diagWWt, 
                           beta=2, 
                           sigma, 
                           NA,
                           policy = "dependent",
                           keepwc = TRUE)

opt_thresh_u <- SURE_MSEthresh(wcn, 
                           wcf, 
                           thresh, 
                           diagWWt, 
                           beta=2, 
                           sigma, 
                           NA,
                           policy = "uniform",
                           keepwc = TRUE)

We can plot MSE risks and their SUREs estimates as a function of the threshold parameter (assuming that $\sigma$ is known).

plot(thresh, opt_thresh_u$res$MSE,
     type="l", xlab = "t", ylab = "risk", log="x")
lines(thresh, opt_thresh_u$res$SURE-n*sigma^2, col="red")
lines(thresh, opt_thresh_d$res$MSE, lty=2)
lines(thresh, opt_thresh_d$res$SURE-n*sigma^2, col="red", lty=2)
legend("topleft", legend=c("MSE_u", "SURE_u",
                           "MSE_d", "SURE_d"),
       col=rep(c("black", "red"), 2), 
       lty=c(1,1,2,2), cex = 1)

Finally, the synthesis allows us to determine the resulting estimators of $f$, \emph{i.e.}, the ones that minimize the unknown MSE risks and the ones that minimizes the SUREs.

wc_oracle_u <- opt_thresh_u$wc[, opt_thresh_u$min["xminMSE"]]
wc_oracle_d <- opt_thresh_d$wc[, opt_thresh_d$min["xminMSE"]]
wc_SURE_u <- opt_thresh_u$wc[, opt_thresh_u$min["xminSURE"]]
wc_SURE_d <- opt_thresh_d$wc[, opt_thresh_d$min["xminSURE"]]

hatf_oracle_u <- synthesis(wc_oracle_u, tf)
hatf_oracle_d <- synthesis(wc_oracle_d, tf)
hatf_SURE_u  <- synthesis(wc_SURE_u, tf)
hatf_SURE_d  <- synthesis(wc_SURE_d, tf)

res <- data.frame("Input_SNR"=round(SNR(f,tilde_f),2),
                  "MSE_u"=round(SNR(f,hatf_oracle_u),2),
                  "SURE_u"=round(SNR(f,hatf_SURE_u),2),
                  "MSE_d"=round(SNR(f,hatf_oracle_d),2),
                  "SURE_d"=round(SNR(f,hatf_SURE_d),2))
knitr::kable(res, caption="Uniform vs Dependent" )

It can be seen from Table 1 that in both cases, SURE provides a good estimator of the MSE and therefore the resulting estimators have performances close (in terms of SNR) to those obtained by minimizing the unknown risk.

Equivalently, estimators can be obtained by the inverse of the SGWT given by the function inverse_sgwt. For exemple:

hatf_oracle_u <- inverse_sgwt(wc_oracle_u,
                              evalues, evectors, b)

Or if the coefficients have not been stored for each threshold value (\emph{i.e.}, with the argument "keepwc=FALSE" when calling SUREthresh) using the thresholding function beta_thresh, \emph{e.g.},

wc_oracle_u <- betathresh(wcn, 
                          thresh[opt_thresh_u$min[[1]]], 2)

Notably, SURE can also be applied in a level-dependent manner using SUREthresh at each scale (the output of SUREthresh can be retrieve with the argument "keepSURE = TRUE" at the function call).

J <- floor(log(lmax)/log(b)) + 2
LD_opt_thresh_d <- LD_SUREthresh(J=J, 
                             wcn=wcn, 
                             diagWWt=diagWWt, 
                             beta=2, 
                             sigma=sigma,
                             hatsigma=NA,
                             policy = "uniform",
                             keepSURE = FALSE)
hatf_LD_SURE_d <- synthesis(LD_opt_thresh_d$wcLDSURE, tf)
print(paste0("LD_SURE_u = ",round(SNR(f,hatf_LD_SURE_d),2),"dB"))

Even though the SURE no longer depends on the original signal, it does depend on $\sigma^2$, two naive (biased) estimators are obtained via GVN or HPVN functions (see @de2019data for more details). Another possible improvement would be to use a scale-dependent variance estimator (especially in the case of "policy = "dependent"").

Furthermore, the major limitations are the need to diagonalize the graph's Laplacian, and the calculation of the weights involved in the SURE (which requires an explicit calculation of the frame). To address the first limitation, several strategies have been proposed in the literature, notably via approximation by Chebyshev polynomials (see @hammond2011wavelets or @shuman18). Combined with these approximations, a Monte Carlo method to estimate the SURE weights has been proposed in @chedemail22, extending the applicability of SURE to large graphs.

Bibliography



Try the gasper package in your browser

Any scripts or data that you put into this service are public.

gasper documentation built on Oct. 27, 2023, 1:07 a.m.