f2fSP: Overlap Group Least Absolute Shrinkage and Selection Operator for function-on-function regression model

View source: R/f2freg.R

f2fSP R Documentation

Overlap Group Least Absolute Shrinkage and Selection Operator for function-on-function regression model

Description

Overlap Group-LASSO for function-on-function regression model solves the following optimization problem

\textrm{min}_{\psi} ~ \frac{1}{2} \sum_{i=1}^n \int \left( y_i(s) - \int x_i(t) \psi(t,s) dt \right)^2 ds + \lambda \sum_{g=1}^{G} \Vert S_{g}T\psi \Vert_2

to obtain a sparse coefficient vector \psi=\mathsf{vec}(\Psi)\in\mathbb{R}^{ML} for the penalized functional predictor x(t), where \Psi\in\mathbb{R}^{M\times L} is the coefficient matrix and the regression function is \psi(t,s)=\varphi(t)^\intercal\Psi\theta(s), with \varphi(t) and \theta(s) two B-spline bases of order d and dimension M and L, respectively. For each group g, each row of the matrix S_g\in\mathbb{R}^{d\times ML} has non-zero entries only for the bases belonging to that group; these values are provided by the arguments groups and group_weights (see below). Each basis function belongs to more than one group. The diagonal matrix T\in\mathbb{R}^{ML\times ML} contains the basis-specific weights; these values are provided by the argument var_weights (see below). The regularization path is computed for the overlap group-LASSO penalty over a grid of values of the regularization parameter \lambda using the alternating direction method of multipliers (ADMM). See Boyd et al. (2011) and Lin et al. (2022) for details on the ADMM method.
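The tensor-product representation above can be illustrated with a minimal sketch using base R and the splines package (this is not the package internals; the grid names tgrid and sgrid and the random \Psi are made up for illustration):

```r
# Minimal sketch of psi(t, s) = phi(t)' Psi theta(s): two cubic B-spline
# bases (order d = 4) of dimensions M and L, evaluated on a 50-point grid.
library(splines)
tgrid <- seq(0, 1, length.out = 50)
sgrid <- seq(0, 1, length.out = 50)
M <- 6; L <- 5
phi   <- bs(tgrid, df = M, intercept = TRUE)   # 50 x M basis matrix for t
theta <- bs(sgrid, df = L, intercept = TRUE)   # 50 x L basis matrix for s
Psi   <- matrix(rnorm(M * L), M, L)            # M x L coefficient matrix
psi_ts <- phi %*% Psi %*% t(theta)             # 50 x 50 values of psi(t, s)
```

The estimation problem then reduces to selecting the entries of the finite-dimensional matrix \Psi, which is what the penalty acts on.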

Usage

f2fSP(
  mY,
  mX,
  L,
  M,
  group_weights = NULL,
  var_weights = NULL,
  standardize.data = TRUE,
  splOrd = 4,
  lambda = NULL,
  lambda.min.ratio = NULL,
  nlambda = 30,
  overall.group = FALSE,
  control = list()
)

Arguments

mY

an (n\times r_y) matrix of observations of the functional response variable.

mX

an (n\times r_x) matrix of observations of the functional covariate.

L

number of elements of the B-spline basis vector \theta(s).

M

number of elements of the B-spline basis vector \varphi(t).

group_weights

a vector of length G containing group-specific weights. The default is the square root of the group cardinality; see Bernardi et al. (2022).

var_weights

a vector of length ML containing basis-specific weights. The default is a vector where each entry is the reciprocal of the number of groups including that basis. See Bernardi et al. (2022) for details.

standardize.data

logical. Should data be standardized?

splOrd

the order d of the spline basis.

lambda

either a single regularization parameter or a vector of regularization parameters. In the latter case the routine computes the whole regularization path. If NULL, the values of lambda are generated by the routine.

lambda.min.ratio

smallest value of lambda, as a fraction of the maximum lambda value. If nr_y > LM, the default is 0.0001; if nr_y < LM, the default is 0.01.

nlambda

the number of lambda values; the default is 30.

overall.group

logical. If TRUE, an overall group including all the penalized covariates is added.

control

a list of control parameters for the ADMM algorithm. See ‘Details’.
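The interplay between lambda, nlambda, and lambda.min.ratio can be sketched with a glmnet-style log-spaced grid. This construction is an assumption for illustration, not taken from the package source, and lambda_grid is a hypothetical helper name:

```r
# Hypothetical sketch: nlambda values log-spaced from lambda_max down to
# lambda.min.ratio * lambda_max (assumed construction, not package code).
lambda_grid <- function(lambda_max, lambda.min.ratio = 1e-4, nlambda = 30) {
  exp(seq(log(lambda_max), log(lambda.min.ratio * lambda_max),
          length.out = nlambda))
}
lam <- lambda_grid(lambda_max = 10)
```

Supplying a vector to lambda bypasses any such grid and the path is computed at exactly the values given.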

Value

A named list containing

sp.coefficients

an (M\times L) solution matrix for the parameters \Psi, which corresponds to the minimum in-sample MSE.

sp.coef.path

an (n_\lambda\times M \times L) array of estimated \Psi coefficients for each lambda.

sp.fun

an (r_x\times r_y) matrix providing the estimated functional coefficient for \psi(t,s).

sp.fun.path

an (n_\lambda\times r_x\times r_y) array providing the estimated functional coefficients for \psi(t,s) for each lambda.

lambda

the sequence of lambda values.

lambda.min

value of lambda that attains the minimum in-sample MSE.

mse

in-sample mean squared error.

min.mse

minimum value of the in-sample MSE for the sequence of lambda.

convergence

convergence indicator. 1 (TRUE) denotes that convergence was achieved.

elapsedTime

elapsed time in seconds.

iternum

number of iterations.

When you run the algorithm, the output contains not only the solution but also the iteration history, which records the following fields over the iterates:

objval

objective function value.

r_norm

norm of the primal residual.

s_norm

norm of the dual residual.

eps_pri

tolerance for the primal feasibility condition.

eps_dual

tolerance for the dual feasibility condition.

The iteration stops when both r_norm and s_norm become smaller than eps_pri and eps_dual, respectively.
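The stopping rule amounts to the following check (generic ADMM bookkeeping, sketched here; admm_converged is an illustrative name, not part of the package):

```r
# Stop once the primal residual norm is below eps_pri and the dual
# residual norm is below eps_dual.
admm_converged <- function(r_norm, s_norm, eps_pri, eps_dual) {
  (r_norm < eps_pri) && (s_norm < eps_dual)
}
```

For example, admm_converged(1e-6, 1e-6, 1e-4, 1e-4) is TRUE, while a large primal residual keeps the iteration running.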

Details

The control argument is a list that can supply any of the following components:

adaptation

logical. If it is TRUE, ADMM with adaptation is performed. The default value is TRUE. See Boyd et al. (2011) for details.

rho

an augmented Lagrangian parameter. The default value is 1.

tau.ada

an adaptation parameter greater than one. Only needed if adaptation = TRUE. The default value is 2. See Boyd et al. (2011) and Lin et al. (2022) for details.

mu.ada

an adaptation parameter greater than one. Only needed if adaptation = TRUE. The default value is 10. See Boyd et al. (2011) and Lin et al. (2022) for details.

abstol

absolute tolerance stopping criterion. The default value is sqrt(sqrt(.Machine$double.eps)).

reltol

relative tolerance stopping criterion. The default value is sqrt(.Machine$double.eps).

maxit

maximum number of iterations. The default value is 100.

print.out

logical. If it is TRUE, a message about the procedure is printed. The default value is TRUE.
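Putting the defaults above together, a fully explicit control list (values copied from this section) would look like:

```r
# Control list with the documented default values; pass as `control = ctrl`.
ctrl <- list(adaptation = TRUE,
             rho        = 1,
             tau.ada    = 2,
             mu.ada     = 10,
             abstol     = sqrt(sqrt(.Machine$double.eps)),
             reltol     = sqrt(.Machine$double.eps),
             maxit      = 100,
             print.out  = TRUE)
```

Any component omitted from the list falls back to its default.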

References

Bernardi, M., et al. (2022).

Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011). Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 3(1), 1-122.

Jenatton, R., Audibert, J.-Y., and Bach, F. (2011). Structured Variable Selection with Sparsity-Inducing Norms. Journal of Machine Learning Research, 12, 2777-2824.

Lin, Z., et al. (2022).

Examples


## generate sample data
set.seed(4321)
s  <- seq(0, 1, length.out = 100)
t  <- seq(0, 1, length.out = 100)
p1 <- 5
p2 <- 6
r  <- 10
n  <- 50

beta_basis1 <- splines::bs(s, df = p1, intercept = TRUE)    # first basis for beta
beta_basis2 <- splines::bs(s, df = p2, intercept = TRUE)    # second basis for beta

data_basis <- splines::bs(s, df = r, intercept = TRUE)    # basis for X

x_0   <- apply(matrix(rnorm(p1 * p2, sd = 1), p1, p2), 1,
               fdaSP::softhresh, 1.5)   # sparse (p2 x p1) regression coefficients
x_fun <- beta_basis2 %*% x_0 %*% t(beta_basis1)

fun_data <- matrix(rnorm(n * r), n, r) %*% t(data_basis)
b        <- fun_data %*% x_fun + rnorm(n * 100, sd = sd(fun_data %*% x_fun) / 3)

## set the hyper-parameters
maxit          <- 1000
rho_adaptation <- FALSE
rho            <- 1
reltol         <- 1e-5
abstol         <- 1e-5

## fit functional regression model
mod <- f2fSP(mY = b, mX = fun_data, L = p1, M = p2,
             group_weights = NULL, var_weights = NULL,
             standardize.data = FALSE, splOrd = 4,
             lambda = NULL, nlambda = 30, lambda.min.ratio = NULL,
             control = list("abstol" = abstol,
                            "reltol" = reltol,
                            "maxit" = maxit,
                            "adaptation" = rho_adaptation,
                            "rho" = rho,
                            "print.out" = FALSE))
 
mycol <- function(n) {
  palette <- colorRampPalette(RColorBrewer::brewer.pal(11, "Spectral"))
  palette(n)
}
cols <- mycol(1000)

oldpar <- par(mfrow = c(1, 2))
image(x_0, col = cols)
image(mod$sp.coefficients, col = cols)
par(oldpar)

oldpar  <- par(mfrow = c(1, 2))
image(x_fun, col = cols)
contour(x_fun, add = TRUE)
psi_hat <- beta_basis2 %*% mod$sp.coefficients %*% t(beta_basis1)  # estimated psi(t, s)
image(psi_hat, col = cols)
contour(psi_hat, add = TRUE)
par(oldpar)


fdaSP documentation built on Oct. 6, 2023, 1:07 a.m.