hfr: Fit a hierarchical feature regression


hfr    R Documentation



Description

HFR is a regularized regression estimator that decomposes a least squares regression along a supervised hierarchical graph and shrinks the edges of the estimated graph to regularize the parameters. The algorithm leads to group shrinkage in the regression parameters and a reduction in the effective model degrees of freedom.


Usage

hfr(
  x,
  y,
  weights = NULL,
  kappa = 1,
  q = NULL,
  intercept = TRUE,
  standardize = TRUE,
  partial_method = c("pairwise", "shrinkage"),
  ridge_lambda = 0,
  ...
)



Arguments

x
Input matrix or data.frame, of dimension (N x p); each row is an observation vector.


y
Response variable.


weights
An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used for the level-specific regressions.


kappa
The target effective degrees of freedom of the regression, expressed as a proportion of p (a value between 0 and 1). Default is kappa=1.


q
Thinning parameter representing the quantile cut-off (in terms of contributed variance) above which to consider levels in the hierarchy. This can be used to reduce the number of levels in high-dimensional problems. Default is no thinning (q=NULL).


intercept
Should an intercept be fitted? Default is intercept=TRUE.


standardize
Logical flag for x variable standardization prior to fitting the model. The coefficients are always returned on the original scale. Default is standardize=TRUE.


partial_method
Indicates whether to use pairwise partial correlations ("pairwise") or shrinkage partial correlations ("shrinkage").


ridge_lambda
Optional penalty for the level-specific regressions (useful in the high-dimensional case). Default is ridge_lambda=0.


...
Additional arguments passed to hclust.


Details

Shrinkage can be imposed by targeting an explicit effective degrees of freedom. Setting the argument kappa to a value between 0 and 1 controls the effective degrees of freedom of the fitted object as a proportion of p. When p > N, kappa is instead a proportion of (N - 2). If no kappa is set, a linear regression with kappa = 1 is estimated.
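The arithmetic behind kappa can be sketched in a few lines of base R. The helper target_edf below is hypothetical (it is not part of the hfr package) and simply applies the rule stated above, on the assumption that the target scales p when p <= N and (N - 2) when p > N.

```r
# Hypothetical helper, not part of hfr: target effective degrees of freedom
# implied by kappa for a problem with n observations and p predictors.
target_edf <- function(kappa, n, p) {
  stopifnot(kappa >= 0, kappa <= 1)
  # Assumption taken from the text: scale p, or (N - 2) when p > N.
  base <- if (p > n) n - 2 else p
  kappa * base
}

target_edf(0.5, n = 100, p = 20)   # 10
target_edf(0.5, n = 50,  p = 200)  # 24
target_edf(1,   n = 100, p = 20)   # 20, the kappa = 1 linear regression case
```

Under this reading, a call such as hfr(x, y, kappa = 0.5) on 20 predictors and 100 observations targets roughly 10 effective degrees of freedom.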

Hierarchical clustering is performed using hclust. The default is ward.D2 clustering, which can be overridden by passing a method argument through the ... argument.
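To see what the method choice changes, the following base R sketch runs hclust directly with two linkage methods. The correlation-based dissimilarity used here is an assumption made purely for illustration; hfr constructs its own supervised dissimilarities internally, so this is not a reproduction of the package's hierarchy.

```r
set.seed(42)
x <- matrix(rnorm(100 * 20), 100, 20)

# Illustrative assumption: a correlation-based dissimilarity between the
# 20 predictors (hfr computes its own dissimilarities internally).
d <- as.dist(1 - abs(cor(x)))

hc_ward <- hclust(d, method = "ward.D2")  # the default named in the text
hc_avg  <- hclust(d, method = "average")  # an alternative linkage

# Each tree joins the 20 predictors in p - 1 = 19 merges; the merge order
# and heights generally differ between linkage methods.
nrow(hc_ward$merge)
nrow(hc_avg$merge)

# With hfr installed and a response y, the same choice would be passed
# through `...`, for example:
# fit_avg <- hfr::hfr(x, y, kappa = 0.5, method = "average")
```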

For high-dimensional problems, the hierarchy becomes very large. Setting q to a value below 1 reduces the number of levels used in the hierarchy; q represents a quantile cut-off on the amount of variation contributed by the levels. The default (q = NULL) considers all levels.
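A minimal sketch of thinning, assuming the hfr package is installed. The data are simulated and q = 0.8 is an arbitrary illustrative value below 1, not a recommended setting.

```r
library(hfr)

set.seed(1)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

# All levels of the hierarchy are considered (default, q = NULL)
fit_full <- hfr(x, y, kappa = 0.5)

# Thinned hierarchy: q = 0.8 is an arbitrary value chosen for illustration
fit_thin <- hfr(x, y, kappa = 0.5, q = 0.8)
```

Both calls return an 'hfr' object, so the coef, plot and predict methods can be used to compare the two fits.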


Value

An 'hfr' regression object.


Author(s)

Johann Pfitzinger


References

Pfitzinger, J. (2022). Cluster Regularization via a Hierarchical Feature Regression. arXiv:2107.04831 [stat.ML].

See Also

cv.hfr, se.avg, coef, plot and predict methods


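Examples

library(hfr)

# Simulate 100 observations of 20 predictors and an unrelated response
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)
# Fit with a target of half the available degrees of freedom
fit <- hfr(x, y, kappa = 0.5)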

hfr documentation built on Jan. 22, 2023, 1:46 a.m.
