View source: R/blockfit_solve.R
| blockfit_solve | R Documentation |
Fits models with mixed spline and non-interactive linear ("flat") terms
using an iterative backfitting approach. Flat terms receive a single
shared coefficient across partitions (rather than K+1
partition-specific coefficients constrained to equality), reducing the
effective parameter count and improving efficiency when the number of
flat terms is large relative to spline terms.
The backfitting loop alternates between:
Spline step
Fit spline terms on the response adjusted for the current flat contribution using a constrained penalized least squares solve.
If \mathbf{y}_k is the response vector for partition k,
\mathbf{Z}_k the spline design matrix,
\mathbf{X}_{\mathrm{flat}}^{(k)} the flat design matrix,
and \mathbf{v} the flat coefficient vector, then
\boldsymbol{\beta}_{\mathrm{spline}}^{(k)} =
\arg\min_{\boldsymbol{\beta}}
\left\|
\mathbf{y}_k
- \mathbf{Z}_k \boldsymbol{\beta}
- \mathbf{X}_{\mathrm{flat}}^{(k)} \mathbf{v}
\right\|^2
+
\boldsymbol{\beta}^{\top}
\mathbf{\Lambda}_{\mathrm{spline}}
\boldsymbol{\beta}
subject to
\mathbf{A}_{\mathrm{spline}}^{\top}
\boldsymbol{\beta}
=
\mathbf{c}_{\mathrm{spline}}.
Flat step
Update flat coefficients via pooled penalized regression on residuals, subject to the residual equality constraint from any mixed constraints:
\mathbf{v}
=
\arg\min_{\mathbf{v}}
\sum_{k=0}^{K}
\left\|
\mathbf{y}_k
-
\mathbf{Z}_k
\boldsymbol{\beta}_{\mathrm{spline}}^{(k)}
-
\mathbf{X}_{\mathrm{flat}}^{(k)}
\mathbf{v}
\right\|^2
+
\mathbf{v}^{\top}
\mathbf{\Lambda}_{\mathrm{flat}}
\mathbf{v}
subject to
\tilde{\mathbf{A}}_{\mathrm{flat}}^{\top}
\mathbf{v}
=
\mathbf{c}
-
\mathbf{A}_{\mathrm{spline}}^{\top}
\boldsymbol{\beta}_{\mathrm{spline}}.
Four estimation paths are selected automatically based on the model configuration:
Gaussian identity + GEE: closed-form solve via
.bf_case_gauss_gee.
Gaussian identity, no correlation: standard
backfitting via .bf_case_gauss_no_corr.
GLM + GEE: two-stage (damped Newton-Raphson warm start
followed by active-set refinement on the whitened system, using the
Woodbury path when available and dense SQP only as fallback) via
.bf_case_glm_gee.
GLM without GEE: damped Newton-Raphson + backfitting via
.bf_case_glm_no_corr.
When quadprog = TRUE, inequality constraints are enforced after
backfitting convergence through the same active-set-first strategy used
in get_B(). The sparsity pattern of qp_Amat still decides
whether the equality re-solve inside the active-set loop remains
partition-wise or switches to the global bridge, but it no longer
decides whether active-set is attempted at all. For GEE paths, the same
logic is applied on the whitened system, using the Woodbury
factorization when the low-rank gate succeeds. Dense SQP via
.bf_sqp_loop is kept as fallback only when the active-set route
is unavailable or does not converge.
After convergence, coefficients are reassembled into the standard per-partition format (flat coefficients replicated across partitions) for compatibility with downstream inference.
blockfit_solve(
X,
y,
flat_cols,
K,
p_expansions,
Lambda,
L_partition_list,
unique_penalty_per_partition,
A,
R_constraints,
constraint_values,
X_gram,
Ghalf_full,
GhalfInv_full,
family,
order_list,
glm_weight_function,
schur_correction_function,
need_dispersion_for_estimation,
dispersion_function,
observation_weights,
homogenous_weights = TRUE,
iterate,
tol,
parallel_eigen,
parallel_qr = FALSE,
qr_pivot_smoothing_constraints = TRUE,
cl,
chunk_size,
num_chunks,
rem_chunks,
return_G_getB,
quadprog = FALSE,
qp_Amat = NULL,
qp_bvec = NULL,
qp_meq = NULL,
qp_score_function = NULL,
keep_weighted_Lambda = FALSE,
max_backfit_iter = 100,
Vhalf = NULL,
VhalfInv = NULL,
initial_active_ineq = integer(0),
include_warnings = TRUE,
verbose = FALSE,
...
)
X |
List of |
y |
List of |
flat_cols |
Integer vector indicating flat columns of
|
K |
Integer; number of interior knots. |
p_expansions |
Integer; number of coefficients per partition. |
Lambda |
|
L_partition_list |
List of partition-specific penalty matrices. |
unique_penalty_per_partition |
Logical. |
A |
Full |
R_constraints |
Integer; number of columns of |
constraint_values |
List of constraint right-hand sides. |
X_gram |
List of Gram matrices
|
Ghalf_full, GhalfInv_full |
Lists of
|
family |
GLM family object. |
order_list |
List of observation index vectors by partition. |
glm_weight_function |
GLM weight function. |
schur_correction_function |
Schur complement correction function. |
need_dispersion_for_estimation |
Logical. |
dispersion_function |
Dispersion estimation function. |
observation_weights |
List of observation weights. |
homogenous_weights |
Logical. |
iterate |
Logical; if FALSE, single pass (no iteration). |
tol |
Convergence tolerance. |
parallel_eigen, cl, chunk_size, num_chunks, rem_chunks |
Parallel arguments. |
qr_pivot_smoothing_constraints |
Logical. If |
return_G_getB |
Logical. |
quadprog |
Logical; apply inequality constraint refinement if TRUE. |
qp_Amat |
Inequality constraint matrix for
|
qp_bvec |
Inequality constraint right-hand side. |
qp_meq |
Number of leading equality constraints. |
qp_score_function |
Score function for QP subproblem. |
keep_weighted_Lambda |
Logical. |
max_backfit_iter |
Integer. |
Vhalf |
Square root of the working correlation matrix in the original observation ordering. |
VhalfInv |
Inverse square root of the working correlation matrix
in the original observation ordering. When both are non- |
initial_active_ineq |
Optional inequality columns used as the first active set in tuning or repeated solves. |
include_warnings |
Logical. |
verbose |
Logical. |
... |
Additional arguments passed to weight and dispersion functions. |
A list with elements:
List of K+1 coefficient vectors
(p_expansions \times 1), flat coefficients replicated across partitions.
List containing \mathbf{G},
\mathbf{G}^{1/2}, and
\mathbf{G}^{-1/2},
each a list of K+1
p_expansions \times p_expansions matrices,
or NULL if return_G_getB is FALSE.
\mathbf{G} satisfies
\mathbf{G}
=
\mathbf{G}^{1/2}
(\mathbf{G}^{1/2})^{\top} exactly.
When a correlation structure is present, these matrices are obtained from the dense whitened information matrix and then split back into per-partition blocks. When flat columns are present, the returned matrices also reflect the pooled flat-column information across partitions.
NULL when no inequality-refinement metadata are produced.
Otherwise a list of available QP or active-set diagnostics.
Dense SQP solves include solution, lagrangian,
active_constraints, iact, Amat_active,
bvec_active, meq_active, converged, and
final_deviance; non-GEE dense solves also carry
info_matrix, Amat_combined, bvec_combined,
and meq_combined. Partition-wise active-set refinement
returns the corresponding active-constraint summary only.
lgspline, lgspline.fit,
get_B
## Not run:
## Minimal verification example: Gaussian identity, no correlation,
## 2 partitions, 3 spline columns + 1 flat column per partition.
##
## This confirms that blockfit_solve returns compatible output and
## that flat coefficients are replicated across partitions.
set.seed(1234)
n1 <- 50; n2 <- 50
p_expansions <- 4 # intercept + x + x^2 + z (flat)
K <- 1 # 2 partitions
X1 <- cbind(1, rnorm(n1), rnorm(n1)^2, rnorm(n1))
X2 <- cbind(1, rnorm(n2), rnorm(n2)^2, rnorm(n2))
beta_true <- c(1, 0.5, -0.3, 0.8)
y1 <- X1 %*% beta_true + rnorm(n1, 0, 0.5)
y2 <- X2 %*% beta_true + rnorm(n2, 0, 0.5)
## Constraint: spline coefficients equal across partitions
## (columns 1:3 constrained, column 4 is flat)
A <- matrix(0, nrow = p_expansions * (K + 1), ncol = 3)
for(j in 1:3){
A[j, j] <- 1
A[p_expansions + j, j] <- -1
}
qr_A <- qr(A)
A <- qr.Q(qr_A)[, 1:qr_A$rank, drop = FALSE]
Lambda <- diag(p_expansions) * 1e-4
L_partition_list <- list(0, 0)
X_gram <- list(crossprod(X1), crossprod(X2))
G_list <- compute_G_eigen(
X_gram, Lambda, K,
parallel = FALSE, cl = NULL,
chunk_size = NULL, num_chunks = NULL, rem_chunks = NULL,
family = gaussian(),
unique_penalty_per_partition = FALSE,
L_partition_list = L_partition_list,
keep_G = TRUE,
schur_corrections = list(0, 0))
result <- blockfit_solve(
X = list(X1, X2),
y = list(y1, y2),
flat_cols = 4L,
K = K, p_expansions = p_expansions,
Lambda = Lambda,
L_partition_list = L_partition_list,
unique_penalty_per_partition = FALSE,
A = A, R_constraints = ncol(A),
constraint_values = list(),
X_gram = X_gram,
Ghalf_full = G_list$Ghalf,
GhalfInv_full = G_list$GhalfInv,
family = gaussian(),
order_list = list(1:n1, (n1+1):(n1+n2)),
glm_weight_function = function(mu, y, oi, fam, d, ow, ...) rep(1, length(y)),
schur_correction_function = function(X, y, B, d, ol, K, fam, ow, ...) {
lapply(1:(K + 1), function(k) 0)
},
need_dispersion_for_estimation = FALSE,
dispersion_function = function(...) 1,
observation_weights = list(rep(1, n1), rep(1, n2)),
homogenous_weights = TRUE,
iterate = TRUE, tol = 1e-8,
parallel = FALSE, cl = NULL,
chunk_size = NULL, num_chunks = NULL, rem_chunks = NULL,
return_G_getB = TRUE,
verbose = FALSE)
## Verify: flat coefficient (col 4) is identical across partitions
stopifnot(abs(result$B[[1]][4] - result$B[[2]][4]) < 1e-10)
## Verify: output structure
stopifnot(length(result$B) == K + 1)
stopifnot(length(result$B[[1]]) == p_expansions)
stopifnot(!is.null(result$G_list))
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.