hfr | R Documentation |

HFR is a regularized regression estimator that decomposes a least squares regression along a supervised hierarchical graph, and shrinks the edges of the estimated graph to regularize parameters. The algorithm leads to group shrinkage in the regression parameters and a reduction in the effective model degrees of freedom.

hfr( x, y, weights = NULL, kappa = 1, q = NULL, intercept = TRUE, standardize = TRUE, partial_method = c("pairwise", "shrinkage"), ridge_lambda = 0, ... )

`x` |
Input matrix or data.frame, of dimension |

`y` |
Response variable. |

`weights` |
An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used for the level-specific regressions. |

`kappa` |
The target effective degrees of freedom of the regression as a percentage of |

`q` |
Thinning parameter representing the quantile cut-off (in terms of contributed variance) above which to consider levels in the hierarchy. This can be used to reduce the number of levels in high-dimensional problems. Default is no thinning. |

`intercept` |
Should an intercept be fitted. Default is `TRUE`. |

`standardize` |
Logical flag for x variable standardization prior to fitting the model. The coefficients are always returned on the original scale. Default is `TRUE`. |

`partial_method` |
Indicate whether to use pairwise partial correlations, or shrinkage partial correlations. |

`ridge_lambda` |
Optional penalty for level-specific regressions (useful in the high-dimensional case).

`...` |
Additional arguments passed to |

Shrinkage can be imposed by targeting an explicit effective degrees of freedom. Setting the argument `kappa` to a value between `0` and `1` controls the effective degrees of freedom of the fitted object as a percentage of *p*. When *p > N*, `kappa` is a percentage of *(N - 2)*. If no `kappa` is set, a linear regression with `kappa = 1` is estimated.
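The relationship between `kappa` and the targeted effective degrees of freedom can be illustrated with a few lines of base R. This is only a sketch of the arithmetic described above: `target_df` is a hypothetical helper, not a function exported by the package, and it does not reproduce how the estimator reaches the target internally.

```r
# Hypothetical helper (not part of the hfr package): converts kappa into the
# targeted effective degrees of freedom, following the rule stated above.
target_df <- function(kappa, n, p) {
  stopifnot(is.numeric(kappa), kappa >= 0, kappa <= 1)
  # kappa scales p, or (N - 2) when there are more predictors than observations
  base <- if (p > n) n - 2 else p
  kappa * base
}

target_df(0.5, n = 100, p = 20)   # 0.5 * 20
target_df(0.5, n = 50,  p = 200)  # 0.5 * (50 - 2)
```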

Hierarchical clustering is performed using `hclust`. The default is set to ward.D2 clustering but can be overridden by passing a method argument to `...`.
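As a rough illustration of overriding the clustering method, the sketch below fits the model once with the default and once with `method = "complete"`, which is one of the agglomeration methods accepted by `hclust`. The simulated data and the object names are assumptions made for the example, and the calls are guarded so the block only runs the fits when the hfr package is installed.

```r
set.seed(1)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

if (requireNamespace("hfr", quietly = TRUE)) {
  # Default hierarchy (ward.D2 clustering)
  fit_ward <- hfr::hfr(x, y, kappa = 0.5)
  # Same model, with the method argument forwarded through `...`
  fit_complete <- hfr::hfr(x, y, kappa = 0.5, method = "complete")
  # Compare the two sets of coefficient estimates side by side
  print(cbind(ward = coef(fit_ward), complete = coef(fit_complete)))
}
```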

For high-dimensional problems, the hierarchy becomes very large. Setting `q` to a value below 1 reduces the number of levels used in the hierarchy. `q` represents a quantile-cutoff of the amount of variation contributed by the levels. The default (`q = NULL`) considers all levels.
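A possible use of thinning in a setting with more predictors than observations is sketched below. The dimensions and the value `q = 0.5` are arbitrary choices for illustration, not recommendations, and the fit is guarded so it only runs when the hfr package is installed.

```r
set.seed(1)
n <- 50
p <- 200
x_hd <- matrix(rnorm(n * p), n, p)
y_hd <- rnorm(n)

if (requireNamespace("hfr", quietly = TRUE)) {
  # q below 1 thins the hierarchy; q = NULL (the default) keeps all levels
  fit_thin <- hfr::hfr(x_hd, y_hd, kappa = 0.5, q = 0.5)
  print(head(coef(fit_thin)))
}
```

Because *p > N* here, `kappa = 0.5` is interpreted relative to *(N - 2)* rather than *p*, as described above.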

An 'hfr' regression object.

Johann Pfitzinger

Pfitzinger, J. (2022). Cluster Regularization via a Hierarchical Feature Regression. arXiv:2107.04831 [stat.ML]

`cv.hfr`, `se.avg`, `coef`, `plot` and `predict` methods

x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit = hfr(x, y, kappa = 0.5)
coef(fit)

