L0TFinv-package: A package for L0-regularized sparse approximation
In L0TFinv: A Splicing Approach to the Inverse Problem of L0 Trend Filtering

L0TFinv-package

R Documentation

A package for L0-regularized sparse approximation

Description

Trend filtering is a typical method for nonparametric regression. The commonly used trend filtering models is the L1 trend filtering model (a) based on the difference matrix \boldsymbol{D}^{(q+1)}, as illustrated below.

\min _{\boldsymbol{\alpha} \in \mathbb{R}^n} \frac{1}{2}\|\boldsymbol{y}-\boldsymbol{\alpha}\|_2^2 + \lambda\|\boldsymbol{D}^{(q+1)} \boldsymbol{\alpha}\|_{\ell_1}, \quad q=0,1,2, \ldots. \quad (a)

L0 trend filtering (b) has a advantage over other trend filtering methods, especially in the detection of change points. The expression for L0 trend filtering is as follows:

\min _{\boldsymbol{\alpha} \in \mathbb{R}^n} \frac{1}{2}\|\boldsymbol{y}-\boldsymbol{\alpha}\|_2^2 + \lambda\|\boldsymbol{D}^{(q+1)} \boldsymbol{\alpha}\|_{\ell_0}. \quad (b)

We explore transforming the problem (b) into a L0-regularized sparse format (c) by introducing an artificial design matrix \boldsymbol{X}^{(q+1)} that corresponds to the difference matrix, thereby reformulating the L0 trend filtering problem into the following format.

\min _{\boldsymbol{\beta} \in \mathbb{R}^n} \frac{1}{2}\|\boldsymbol{y}-\boldsymbol{X}^{(q+1)}\boldsymbol{\beta}\|_2^2 + \lambda \sum_{i=q+2}^n |\boldsymbol{\beta}_i|_{\ell_0}. \quad (c)

In our practical approach, we consider the maximum number of change points k_{\text{max}} as a constraint, transforming the aforementioned L0 penalty problem (c) into the following L0 constraint problem.

\text{ minimize }\frac{1}{2}\|\boldsymbol{y}-\boldsymbol{X}^{(q+1)}\boldsymbol{\beta}\|_2^2,\quad \text{ subject to } \sum_{i=q+2}^n |\boldsymbol{\beta}_i|_{\ell_0} \leq k_{\text{max}}. \quad (d)

For such L0 constraint problems (d), we employ a splicing-based approach to design algorithms for processing. This package has the following seven main methods:

matrix with special structure: \quadGenerate \boldsymbol{X}^{(q+1)} or \boldsymbol{D}^{(q+1)} matrix.
inverse of the crossprod matrix: \quadSimplify the calculation of the inverse matrix of (\boldsymbol{X}^{(q+1)}_A)^T \boldsymbol{X}^{(q+1)}_A for the cases where q=0 or q=1, which is frequently used in splicing algorithms.
inverse L0 trend filtering with fixed change points: \quadFit a piecewise constant or piecewise linear estimated trend with a given number of change points.
inverse L0 trend filtering with optimal change points: \quadFit a piecewise constant or piecewise linear estimated trend with a maximum number of change points, and select the optimal estimated trend using appropriate information criteria.
simulated data: \quadGenerate piecewise constant or piecewise linear data.
print/coef: \quadPrint a summary of the trend estimation results.
plot: \quadPlot a summary of the trend estimation results.

Details

_PACKAGE

In previous studies, algorithms solving trend filtering problems (a) necessitate the computation of ((\boldsymbol{D}^{(q+1)})^T \boldsymbol{D}^{(q+1)})^{-1}. When n is large, just fitting the matrix into memory becomes an issue.
In L0 trend filtering (b), the positions of non-zero elements in the L0 norm correspond with the locations of change points. We consider two subsets: the active set A for non-zero elements and the inactive set I for zero elements. Despite this, computing ((\boldsymbol{D}^{(q+1)}_I)^T \boldsymbol{D}^{(q+1)}_I)^{-1} remains a task involving a substantial matrix.
Due to the connection between L0 constraint problems and L0 penalty problems, and considering that the sparsity of \boldsymbol{\beta} is is more meaningful in practical applications than the selection of the hyperparameter \lambda. We focus on the constraint that reflects our aim to achieve an estimated trend with a given number of change points. So we transform the L0 penalty problem (c) into the L0 constraint problem (d).