cv.hierNest: Cross-validated hierarchical nested regularization for...

View source: R/cv_hierNest.R

cv.hierNestR Documentation

Cross-validated hierarchical nested regularization for subgroup models

Description

Fits regularization paths for hierarchical subgroup-specific penalized learning problems, leveraging nested group structure (such as Major Diagnostic Categories [MDC] and Diagnosis-Related Groups [DRG]) with options for lasso or overlapping group lasso penalties. Performs cross-validation to select tuning parameters for penalization and subgroup structure.

This function enables information sharing across related subgroups by reparameterizing covariate effects into overall, group-specific, and subgroup-specific components and supports structured shrinkage through hierarchical regularization. Users may select between the lasso (hierNest-Lasso) and overlapping group lasso (hierNest-OGLasso) frameworks, as described in Jiang et al. (2024, submitted, see details below).

Usage

cv.hierNest(
  x,
  y,
  group = NULL,
  family = c("gaussian", "binomial"),
  nlambda = 100,
  lambda.factor = NULL,
  pred.loss = c("default", "mse", "deviance", "mae", "misclass", "ROC"),
  lambda = NULL,
  pf_group = NULL,
  pf_sparse = NULL,
  intercept = FALSE,
  asparse1 = c(0.5, 20),
  asparse2 = c(0.01, 0.2),
  asparse1_num = 4,
  asparse2_num = 4,
  standardize = TRUE,
  lower_bnd = -Inf,
  upper_bnd = Inf,
  eps = 1e-08,
  maxit = 3e+06,
  hier_info = NULL,
  method = "overlapping",
  partition = "subgroup",
  cvmethod = "general"
)

Arguments

x

Matrix of predictors, of dimension n \times p; each row is an observation. Can be a dense or sparse matrix.

y

Response variable. For 'family="gaussian"', should be numeric. For 'family="binomial"', should be a factor with two levels or a numeric vector with two unique values.

group

Optional vector or factor indicating group assignments for variables. Used for custom grouping.

family

Character string specifying the model family. Options are '"gaussian"' (default) for least-squares regression, or '"binomial"' for logistic regression.

nlambda

Number of lambda values to use for regularization path. Default is 100.

lambda.factor

Factor determining the minimal value of lambda in the sequence, where 'min(lambda) = lambda.factor * max(lambda)'. See Details.

pred.loss

Character string indicating loss to minimize during cross-validation. Options include '"default"', '"mse"', '"deviance"', '"mae"', '"misclass"', and '"ROC"'.

lambda

Optional user-supplied sequence of lambda values (overrides 'nlambda'/'lambda.factor').

pf_group

Optional penalty factors on the groups, as a numeric vector. Default adjusts for group size.

pf_sparse

Optional penalty factors on the l1-norm (for sparsity), as a numeric vector.

intercept

Logical; whether to include an intercept in the model. Default is TRUE.

asparse1

Relative weight(s) for the first (e.g., group) layer of the overlapping group lasso penalty. Default is c(0.5, 20).

asparse2

Relative weight(s) for the second (e.g., subgroup) layer. Default is c(0.01, 0.2).

asparse1_num

Number of values in asparse1 grid (for grid search). Default is 4.

asparse2_num

Number of values in asparse2 grid (for grid search). Default is 4.

standardize

Logical; whether to standardize predictors prior to model fitting. Default is TRUE.

lower_bnd

Lower bound(s) for coefficient values. Default is -Inf.

upper_bnd

Upper bound(s) for coefficient values. Default is Inf.

eps

Convergence tolerance for optimization. Default is 1e-8.

maxit

Maximum number of optimization iterations. Default is 3e6.

hier_info

Required for 'method = "overlapping"'; a matrix describing the hierarchical structure of the subgroups (see Details).

method

Character; either '"overlapping"' for overlapping group lasso, '"sparsegl"' for sparse group lasso, or '"general"' for other hierarchical regularization. Default is '"overlapping"'.

partition

Character string; determines subgroup partitioning. Default is '"subgroup"'.

cvmethod

Cross-validation method. Options include '"general"' (default), '"grid_search"', or '"user_supply"' (for user-supplied grid).

Details

The hierarchical nested framework decomposes covariate effects into overall, group, and subgroup-specific components, with regularization encouraging fusion or sparsity across these hierarchical levels. The function can fit both the lasso penalty (allowing arbitrary zero/non-zero patterns) and the overlapping group lasso penalty (enforcing hierarchical selection structure), as described in Jiang et al. (2024, submitted).

The argument 'hier_info' must be supplied for '"overlapping"' method, and encodes the hierarchical relationship between groups and subgroups (e.g., MDCs and DRGs).

Cross-validation is used to select tuning parameters, optionally over a grid for hierarchical penalty weights (asparse1, asparse2), and the regularization parameter lambda.

Value

An object containing the fitted hierarchical model and cross-validation results, including:

fit

Fitted model object.

lambda

Sequence of lambda values considered.

cv_error

Cross-validation error/loss for each combination of tuning parameters.

best_params

Best tuning parameters selected.

...

Additional diagnostic and output fields.

References

Jiang, Z., Huo, L., Hou, J., & Huling, J. D. "Heterogeneous readmission prediction with hierarchical effect decomposition and regularization".


hierNest documentation built on March 24, 2026, 5:07 p.m.