blockfit_solve: Backfitting Solver for Blockfit Models
In lgspline: Lagrangian Multiplier Smoothing Splines for Smooth Function Estimation

blockfit_solve

R Documentation

Backfitting Solver for Blockfit Models

Description

Fits models with mixed spline and non-interactive linear ("flat") terms using an iterative backfitting approach. Flat terms receive a single shared coefficient across partitions (rather than K+1 partition-specific coefficients constrained to equality), reducing the effective parameter count and improving efficiency when the number of flat terms is large relative to spline terms.

The backfitting loop alternates between:

Spline step

Fit spline terms on the response adjusted for the current flat contribution using a constrained penalized least squares solve.

If \mathbf{y}_k is the response vector for partition k, \mathbf{Z}_k the spline design matrix, \mathbf{X}_{\mathrm{flat}}^{(k)} the flat design matrix, and \mathbf{v} the flat coefficient vector, then

\boldsymbol{\beta}_{\mathrm{spline}}^{(k)} = \arg\min_{\boldsymbol{\beta}} \left\| \mathbf{y}_k - \mathbf{Z}_k \boldsymbol{\beta} - \mathbf{X}_{\mathrm{flat}}^{(k)} \mathbf{v} \right\|^2 + \boldsymbol{\beta}^{\top} \mathbf{\Lambda}_{\mathrm{spline}} \boldsymbol{\beta}

subject to

\mathbf{A}_{\mathrm{spline}}^{\top} \boldsymbol{\beta} = \mathbf{c}_{\mathrm{spline}}.
Flat step

Update flat coefficients via pooled penalized regression on residuals, subject to the residual equality constraint from any mixed constraints:

\mathbf{v} = \arg\min_{\mathbf{v}} \sum_{k=0}^{K} \left\| \mathbf{y}_k - \mathbf{Z}_k \boldsymbol{\beta}_{\mathrm{spline}}^{(k)} - \mathbf{X}_{\mathrm{flat}}^{(k)} \mathbf{v} \right\|^2 + \mathbf{v}^{\top} \mathbf{\Lambda}_{\mathrm{flat}} \mathbf{v}

subject to

\tilde{\mathbf{A}}_{\mathrm{flat}}^{\top} \mathbf{v} = \mathbf{c} - \mathbf{A}_{\mathrm{spline}}^{\top} \boldsymbol{\beta}_{\mathrm{spline}}.

Four estimation paths are selected automatically based on the model configuration:

Case (a): Gaussian identity + GEE: closed-form solve via .bf_case_gauss_gee.
Case (b): Gaussian identity, no correlation: standard backfitting via .bf_case_gauss_no_corr.
Case (c): GLM + GEE: two-stage (damped Newton-Raphson warm start followed by active-set refinement on the whitened system, using the Woodbury path when available and dense SQP only as fallback) via .bf_case_glm_gee.
Case (d): GLM without GEE: damped Newton-Raphson + backfitting via .bf_case_glm_no_corr.

When quadprog = TRUE, inequality constraints are enforced after backfitting convergence through the same active-set-first strategy used in get_B(). The sparsity pattern of qp_Amat still decides whether the equality re-solve inside the active-set loop remains partition-wise or switches to the global bridge, but it no longer decides whether active-set is attempted at all. For GEE paths, the same logic is applied on the whitened system, using the Woodbury factorization when the low-rank gate succeeds. Dense SQP via .bf_sqp_loop is kept as fallback only when the active-set route is unavailable or does not converge.

After convergence, coefficients are reassembled into the standard per-partition format (flat coefficients replicated across partitions) for compatibility with downstream inference.

Usage

blockfit_solve(
  X,
  y,
  flat_cols,
  K,
  p_expansions,
  Lambda,
  L_partition_list,
  unique_penalty_per_partition,
  A,
  R_constraints,
  constraint_values,
  X_gram,
  Ghalf_full,
  GhalfInv_full,
  family,
  order_list,
  glm_weight_function,
  schur_correction_function,
  need_dispersion_for_estimation,
  dispersion_function,
  observation_weights,
  homogenous_weights = TRUE,
  iterate,
  tol,
  parallel_eigen,
  parallel_qr = FALSE,
  qr_pivot_smoothing_constraints = TRUE,
  cl,
  chunk_size,
  num_chunks,
  rem_chunks,
  return_G_getB,
  quadprog = FALSE,
  qp_Amat = NULL,
  qp_bvec = NULL,
  qp_meq = NULL,
  qp_score_function = NULL,
  keep_weighted_Lambda = FALSE,
  max_backfit_iter = 100,
  Vhalf = NULL,
  VhalfInv = NULL,
  initial_active_ineq = integer(0),
  include_warnings = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`X`	List of `K+1` design matrices (`N_k \times p_expansions`). Unwhitened even when GEE.
`y`	List of `K+1` response vectors. Unwhitened even when GEE.
`flat_cols`	Integer vector indicating flat columns of `\mathbf{X}^{(k)}`.
`K`	Integer; number of interior knots.
`p_expansions`	Integer; number of coefficients per partition.
`Lambda`	`p_expansions \times p_expansions` penalty matrix.
`L_partition_list`	List of partition-specific penalty matrices.
`unique_penalty_per_partition`	Logical.
`A`	Full `P \times R_constraints` constraint matrix.
`R_constraints`	Integer; number of columns of `\mathbf{A}`.
`constraint_values`	List of constraint right-hand sides.
`X_gram`	List of Gram matrices `\mathbf{X}_k^{\top} \mathbf{X}_k`.
`Ghalf_full`, `GhalfInv_full`	Lists of `\mathbf{G}^{1/2}` and `\mathbf{G}^{-1/2}` matrices.
`family`	GLM family object.
`order_list`	List of observation index vectors by partition.
`glm_weight_function`	GLM weight function.
`schur_correction_function`	Schur complement correction function.
`need_dispersion_for_estimation`	Logical.
`dispersion_function`	Dispersion estimation function.
`observation_weights`	List of observation weights.
`homogenous_weights`	Logical.
`iterate`	Logical; if FALSE, single pass (no iteration).
`tol`	Convergence tolerance.
`parallel_eigen`, `cl`, `chunk_size`, `num_chunks`, `rem_chunks`	Parallel arguments.
`qr_pivot_smoothing_constraints`	Logical. If `TRUE`, spline-only and flat-only equality blocks are rank-reduced by QR pivoting before the blockfit updates are run.
`return_G_getB`	Logical.
`quadprog`	Logical; apply inequality constraint refinement if TRUE.
`qp_Amat`	Inequality constraint matrix for `quadprog::solve.QP`.
`qp_bvec`	Inequality constraint right-hand side.
`qp_meq`	Number of leading equality constraints.
`qp_score_function`	Score function for QP subproblem.
`keep_weighted_Lambda`	Logical.
`max_backfit_iter`	Integer.
`Vhalf`	Square root of the working correlation matrix in the original observation ordering.
`VhalfInv`	Inverse square root of the working correlation matrix in the original observation ordering. When both are non-`NULL`, the GEE paths are used.
`initial_active_ineq`	Optional inequality columns used as the first active set in tuning or repeated solves.
`include_warnings`	Logical.
`verbose`	Logical.
`...`	Additional arguments passed to weight and dispersion functions.

Value

A list with elements:

B

List of K+1 coefficient vectors (p_expansions \times 1), flat coefficients replicated across partitions.

G_list

List containing \mathbf{G}, \mathbf{G}^{1/2}, and \mathbf{G}^{-1/2}, each a list of K+1 p_expansions \times p_expansions matrices, or NULL if return_G_getB is FALSE.

\mathbf{G} satisfies \mathbf{G} = \mathbf{G}^{1/2} (\mathbf{G}^{1/2})^{\top} exactly.

When a correlation structure is present, these matrices are obtained from the dense whitened information matrix and then split back into per-partition blocks. When flat columns are present, the returned matrices also reflect the pooled flat-column information across partitions.

qp_info

NULL when no inequality-refinement metadata are produced. Otherwise a list of available QP or active-set diagnostics. Dense SQP solves include solution, lagrangian, active_constraints, iact, Amat_active, bvec_active, meq_active, converged, and final_deviance; non-GEE dense solves also carry info_matrix, Amat_combined, bvec_combined, and meq_combined. Partition-wise active-set refinement returns the corresponding active-constraint summary only.

Examples

## Not run: 
## Minimal verification example: Gaussian identity, no correlation,
## 2 partitions, 3 spline columns + 1 flat column per partition.
##
## This confirms that blockfit_solve returns compatible output and
## that flat coefficients are replicated across partitions.

set.seed(1234)
n1 <- 50; n2 <- 50
p_expansions <- 4  # intercept + x + x^2 + z (flat)
K  <- 1  # 2 partitions

X1 <- cbind(1, rnorm(n1), rnorm(n1)^2, rnorm(n1))
X2 <- cbind(1, rnorm(n2), rnorm(n2)^2, rnorm(n2))
beta_true <- c(1, 0.5, -0.3, 0.8)
y1 <- X1 %*% beta_true + rnorm(n1, 0, 0.5)
y2 <- X2 %*% beta_true + rnorm(n2, 0, 0.5)

## Constraint: spline coefficients equal across partitions
## (columns 1:3 constrained, column 4 is flat)
A <- matrix(0, nrow = p_expansions * (K + 1), ncol = 3)
for(j in 1:3){
  A[j, j]      <-  1
  A[p_expansions + j, j]  <- -1
}
qr_A <- qr(A)
A <- qr.Q(qr_A)[, 1:qr_A$rank, drop = FALSE]

Lambda <- diag(p_expansions) * 1e-4
L_partition_list <- list(0, 0)
X_gram <- list(crossprod(X1), crossprod(X2))

G_list <- compute_G_eigen(
  X_gram, Lambda, K,
  parallel = FALSE, cl = NULL,
  chunk_size = NULL, num_chunks = NULL, rem_chunks = NULL,
  family = gaussian(),
  unique_penalty_per_partition = FALSE,
  L_partition_list = L_partition_list,
  keep_G = TRUE,
  schur_corrections = list(0, 0))

result <- blockfit_solve(
  X = list(X1, X2),
  y = list(y1, y2),
  flat_cols = 4L,
  K = K, p_expansions = p_expansions,
  Lambda = Lambda,
  L_partition_list = L_partition_list,
  unique_penalty_per_partition = FALSE,
  A = A, R_constraints = ncol(A),
  constraint_values = list(),
  X_gram = X_gram,
  Ghalf_full = G_list$Ghalf,
  GhalfInv_full = G_list$GhalfInv,
  family = gaussian(),
  order_list = list(1:n1, (n1+1):(n1+n2)),
  glm_weight_function = function(mu, y, oi, fam, d, ow, ...) rep(1, length(y)),
  schur_correction_function = function(X, y, B, d, ol, K, fam, ow, ...) {
    lapply(1:(K + 1), function(k) 0)
  },
  need_dispersion_for_estimation = FALSE,
  dispersion_function = function(...) 1,
  observation_weights = list(rep(1, n1), rep(1, n2)),
  homogenous_weights = TRUE,
  iterate = TRUE, tol = 1e-8,
  parallel = FALSE, cl = NULL,
  chunk_size = NULL, num_chunks = NULL, rem_chunks = NULL,
  return_G_getB = TRUE,
  verbose = FALSE)

## Verify: flat coefficient (col 4) is identical across partitions
stopifnot(abs(result$B[[1]][4] - result$B[[2]][4]) < 1e-10)

## Verify: output structure
stopifnot(length(result$B) == K + 1)
stopifnot(length(result$B[[1]]) == p_expansions)
stopifnot(!is.null(result$G_list))

## End(Not run)

lgspline documentation built on May 8, 2026, 5:07 p.m.